Content-Based Filters in Image and Video Retrieval (ACM SIG-IR 2004)
Abstract: This paper investigates the level of metadata accuracy required for image filters to be valuable to users. Access to large digital image and video collections is hampered by ambiguous and incomplete metadata attributed to imagery. Though improvements are constantly made in the automatic derivation of semantic feature concepts such as indoor, outdoor, face, and cityscape, it is unclear how accurate these classifiers need to be and under what circumstances they are effective. This paper explores the relationship between metadata accuracy and retrieval effectiveness using an amateur photo collection, documentary video, and news video. The accuracy of the feature classification is varied from performance typical of today's automated classifiers to ideal performance taken from manually generated truth data. Results establish an accuracy threshold at which semantic features become useful, and empirically quantify the collection size at which filtering first shows its effectiveness. [Download DOC]
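One way to vary classifier accuracy between today's typical performance and the ideal, as the abstract describes, is to start from ground-truth labels and randomly flip a controlled fraction of them. This is a minimal sketch of that idea; `degrade_labels` and its exact flipping scheme are my own illustration, not necessarily the paper's procedure:

```python
import random

def degrade_labels(truth, accuracy, seed=0):
    """Simulate a binary feature classifier of a given accuracy by
    keeping each ground-truth label with probability `accuracy` and
    flipping it otherwise (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else 1 - lab for lab in truth]

# Toy ground truth for a feature such as "outdoor" (1 = present).
truth = [1, 0, 1, 1, 0, 0, 1, 0] * 100

# A simulated classifier at roughly 90% accuracy.
predicted = degrade_labels(truth, accuracy=0.9)
observed = sum(p == t for p, t in zip(truth, predicted)) / len(truth)
```

Sweeping `accuracy` from current classifier levels up to 1.0 (the manual truth data) then lets one measure retrieval effectiveness at each point.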
Some of the interface designs I worked on:
We ran a study and wondered why nobody was using the filter. Now it seems pretty clear: it's too difficult to use! It's obviously an expert interface. I decided to create a filter interface for real people and leave the expert interface to the experts.
I tried to create an interface that lets them express priority ("Faces are most important to me"). When people specify multiple filters, they almost always prioritize their feature requests. They might say "OK, I want to see faces, and they should be outdoors"; that sentence carries a priority: seeing faces matters most, and outdoor shots come second. So I created this:
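Under the hood, that kind of prioritized filter can be read as a lexicographic ordering: rank shots first by the top-priority feature's classifier confidence, and break ties with the next feature, and so on. A minimal sketch, where the `rank_shots` helper and the dict-of-scores data layout are my own assumptions for illustration:

```python
def rank_shots(shots, priorities):
    """Order shots by prioritized feature filters: sort primarily by the
    first feature's confidence score, breaking ties with the next feature,
    etc. Each shot is a dict mapping feature name -> confidence in [0, 1]
    (hypothetical layout, not the system's actual representation)."""
    return sorted(
        shots,
        key=lambda shot: tuple(shot.get(f, 0.0) for f in priorities),
        reverse=True,  # highest-confidence shots first
    )

# "Faces are most important to me, then outdoor shots."
shots = [
    {"face": 0.2, "outdoor": 0.9},
    {"face": 0.8, "outdoor": 0.1},
    {"face": 0.8, "outdoor": 0.7},
]
ranked = rank_shots(shots, ["face", "outdoor"])
```

Because the sort key is a tuple, a shot strong on "face" always outranks one weak on "face", no matter how good its "outdoor" score is, which matches the user's stated priority.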