
Image Search or Collection Guiding?
Stéphane Marchand-Maillet
GIFT; GNU Image Finding Tool (Viper), 2000


The volume of electronic documents generated worldwide is growing consistently and exponentially. It was recently reported that in 2002 alone, 5 billion gigabytes of original data were created worldwide (enough to fill 50 million of the then-current generic hard disk drives). Part of this massive volume consists of textual exchanges (such as e-mails) and may therefore be managed at a semantic level by text information retrieval techniques, applied successfully by Google, for example. However, a significant part of this data corresponds to visual multimedia information such as images or videos. These visual documents cannot be managed automatically with high accuracy. This is due to the well-known semantic gap, defined as the discrepancy between the capabilities of a machine and those of a human in perceiving visual content. Despite several decades of research, automated image (and video) content analysis is still too poor to reliably replace humans in management tasks. Locally, in the Viper group [VIPER], we are pursuing several research directions that should lead to complementary solutions to the problem of inferring semantically meaningful interpretations of visual content.

Our initial research on content-based image retrieval, «GIFT» [GIFT], has led us to consider annotated image collections. The problem of annotation is itself far from trivial, and we study how to assign textual labels to still pictures [ANNOTATION]. Note that this is very much in line with the Semantic Web initiative [SEMWEB]. We also investigate how to extract text automatically from visual content using learning machines [1], and describe our advances in multimedia data visualization.

We face a context in which the management of document collections must be automated as much as possible, but in which a human operator remains necessary to reach a sufficient level of efficiency and accuracy. Even the simple case of a private user managing his or her own digital photo and video collection already calls for a number of tools to keep track of all the content efficiently. Content-based tools are offered as solutions to such problems. They aim to facilitate document search and retrieval based solely on the automated analysis and characterization of visual content. Whereas they do succeed in performing search tasks, this only offers a
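The content-based approach described above can be illustrated with a toy query-by-example: describe each image by a global color histogram and rank the collection by histogram intersection. This is only a minimal sketch in Python with NumPy; the actual GIFT/Viper system uses a much richer feature set, and all names and data here are illustrative, not taken from the system itself.

```python
import numpy as np

# Hedged illustration: a global color histogram is one simple way of
# "characterizing visual content" for query-by-example retrieval.

def color_histogram(image, bins=8):
    """Quantize each RGB channel into `bins` levels and build a
    normalized joint-color histogram (bins**3 dimensions)."""
    q = (image.astype(np.int64) * bins) // 256          # per-channel bin index
    codes = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    hist = np.bincount(codes.ravel(), minlength=bins ** 3).astype(float)
    return hist / hist.sum()                            # size-invariant

def histogram_intersection(h1, h2):
    """Swain-Ballard histogram intersection: similarity in [0, 1]."""
    return float(np.minimum(h1, h2).sum())

# A toy "collection": two reddish images and one greenish image.
red_a = np.zeros((16, 16, 3), np.uint8); red_a[..., 0] = 220; red_a[..., 1] = 30
red_b = np.zeros((16, 16, 3), np.uint8); red_b[..., 0] = 220; red_b[..., 1] = 30
red_b[:4, :4, 1] = 100                                  # small green patch
green = np.zeros((16, 16, 3), np.uint8); green[..., 1] = 220

query = color_histogram(red_a)                          # query descriptor
sim_red = histogram_intersection(query, color_histogram(red_b))
sim_green = histogram_intersection(query, color_histogram(green))
# The reddish pair scores higher than the red/green pair, so a
# query-by-example search would rank red_b above green.
```

Note that no semantic knowledge enters this computation: the ranking reflects low-level color statistics only, which is precisely where the semantic gap discussed above opens up.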
