Issues for Image Database Prototypes

Table 1. Differences between conventional data and image data
	Conventional Data	Image Data
Type	Dynamic	Static
Size Variability	Similar Size	Variable Size
Size	Small	Large
Number of Devices Required	One	Many
Handling Data	Need to Manipulate Data	Insert/Delete
Interpretation of Data	One	Multiple: depending on application, image analysis, image interpretation algorithm, domain knowledge, and retrieval time
Interactivity	Less	More

Data Models and Meta Image Representation:

Most commercial systems search and retrieve images using keywords associated with the images. This approach is considered very limited. We want to be able to select images because they are "similar" to some other image, which adds some semantic notions to images.

1. World-only data: media-independent, domain-specific. Models the relationships between objects.

2. Image-only data: media-specific, domain-independent. These images are application-independent, so many applications can share them to represent different things. As a result, they do not directly correspond to one particular world object. Image-only data includes:

     Uninterpreted raw image: e.g. binary file
     Registration information: e.g. compression method, digitization method, date of capture
     Interpreted contents (also known as metadata): e.g. image features, color, texture. Contents can be classified into three types:
                * structural: spatial and temporal
                * whole image features: encoding of the whole image, e.g. color and texture signatures
                * image object features: encoding and identification of regions, e.g. color, shape, texture
     Transformation relationships: e.g. a raster image can be translated into a graph (but not vice versa)

3. Annotation data: media-specific, domain-specific. These are the relationships that link the world application objects with the images and their features. There are many ways to describe an image and its content. The three major categories of descriptive mechanisms are:

Textual descriptions - use keywords to describe the whole image or specific parts of the image.

Semantic descriptions - use higher-level semantic representations (semantic networks or object models).

Content based - use the actual content, or high-level representations of the content, to classify the image objects.
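
The three kinds of data described above can be sketched as plain record types. This is only an illustrative sketch: the class names, field names, and the helper images_of are hypothetical and do not come from any particular image database system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Registration:
    """Registration information about how the image was produced."""
    compression: str      # e.g. "jpeg"
    digitization: str     # e.g. "scanner"
    capture_date: str     # kept as a plain string in this sketch


@dataclass
class ImageOnlyData:
    """Media-specific, domain-independent data."""
    raw: bytes                                                  # uninterpreted raw image
    registration: Registration
    whole_image_features: dict = field(default_factory=dict)   # e.g. color signature
    object_features: dict = field(default_factory=dict)        # region id -> features


@dataclass
class WorldObject:
    """Media-independent, domain-specific entity."""
    name: str
    related_to: list = field(default_factory=list)   # names of related world objects


@dataclass
class Annotation:
    """Links a world object to an image, or to one region of it."""
    world_object: WorldObject
    image: ImageOnlyData
    region_id: Optional[str] = None   # None means the whole image
    keywords: tuple = ()              # textual description


def images_of(name, annotations):
    """Return every image annotated with the named world object."""
    return [a.image for a in annotations if a.world_object.name == name]
```

Keeping the annotation separate from the image record is what lets several applications share one image while attaching different world objects to it.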

Searching, Retrieval, and Browsing of Images and Image Objects

Several query issues need to be addressed: expressivity, formulation, optimization, presentation of results, and reformulation. Images should be retrievable by:

data type: e.g. "retrieve all jpeg files"
structure: e.g. "retrieve all the documents with an image."
association: e.g. "retrieve all the articles that are linked to XML in this document"
browsing:
content: e.g. "retrieve images which look like X" or "retrieve images containing a car"

Expressivity:

Exact feature match: only possible with user-generated textual descriptions. Uses keywords or some sort of picture description language.

Inexact feature match: requires feature extraction. It can use semantic descriptions.

Structural match: requires spatial and temporal reasoning on visual information.
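
The three kinds of match can be contrasted in a minimal sketch. The function names, the feature-vector representation, the tolerance parameter, and the bounding-box layout are all assumptions made for illustration, and "left of" is just one example of a spatial relation.

```python
def exact_match(query_keywords, image_keywords):
    """Exact feature match: every query keyword must appear in the description."""
    return set(query_keywords) <= set(image_keywords)


def inexact_match(query_features, image_features, tolerance):
    """Inexact feature match: extracted features must lie within a tolerance."""
    distance = sum(abs(q - f) for q, f in zip(query_features, image_features))
    return distance <= tolerance


def left_of(box_a, box_b):
    """Structural match on spatial layout.

    Each box is (x_min, y_min, x_max, y_max); box_a is left of box_b
    when it ends before box_b begins on the x axis.
    """
    return box_a[2] <= box_b[0]
```

For example, exact_match(["car"], ["red", "car"]) succeeds only because the keyword was entered by hand, whereas inexact_match works on numbers that a feature extractor would have to produce.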

Formulation and Presentation:

Expression of similarity distance. We have to define what "similar" means and rank the images based on this measure.

Weightings and preferences. How to weight a particular type of image in the query, e.g. "retrieve all the images of Babe Ruth [priority 1] and people [priority 3]"

Compositional queries. e.g. "Retrieve all the sequences that contain X"
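
A similarity distance and a priority weighting might be expressed as follows. This is a sketch under stated assumptions: the L1 (sum of absolute differences) distance is only one possible definition of "similar", and scoring each matched term as 1/priority is a hypothetical weighting chosen for illustration.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(query, database):
    """Return (name, distance) pairs, most similar first."""
    scored = [(name, l1_distance(query, feats)) for name, feats in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


def weighted_score(image_labels, priorities):
    """Score an image by the query terms it matches.

    priorities maps a term to its priority (1 = most important); each
    matched term contributes 1/priority. This weighting is an assumption.
    """
    return sum(1.0 / p for term, p in priorities.items() if term in image_labels)


# Usage: the priorities from the example query in the text.
priorities = {"Babe Ruth": 1, "people": 3}
both = weighted_score({"Babe Ruth", "people"}, priorities)
crowd_only = weighted_score({"people"}, priorities)
```

With these priorities, an image showing Babe Ruth among other people scores higher than an image that only shows people.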

Optimization:

Image transformations take a lot of time, for example during similarity matching. Optimizations help speed up the process.
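
One possible optimization, assumed here for illustration and not prescribed by these notes, is to run the expensive transformation once when an image is inserted and to compare only the stored feature vectors at query time. The class FeatureIndex and the toy extractor gray_histogram are hypothetical names.

```python
class FeatureIndex:
    """Stores one precomputed feature vector per image."""

    def __init__(self, extract):
        self._extract = extract    # expensive feature-extraction function
        self._features = {}        # image name -> feature vector
        self.extract_calls = 0     # counts how often extraction actually ran

    def insert(self, name, pixels):
        """Extract features once, when the image enters the database."""
        self._features[name] = self._extract(pixels)
        self.extract_calls += 1

    def nearest(self, query_features):
        """Compare against stored vectors only; no stored image is re-processed."""
        def distance(name):
            stored = self._features[name]
            return sum(abs(q - s) for q, s in zip(query_features, stored))
        return min(self._features, key=distance)


def gray_histogram(pixels, bins=4, max_value=256):
    """Toy extractor: count gray levels (0..max_value-1) into equal-width bins."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    return counts


# Usage: two stored images, one query.
index = FeatureIndex(gray_histogram)
index.insert("dark", [0, 10, 20, 30])
index.insert("bright", [200, 220, 240, 255])
best = index.nearest(gray_histogram([5, 15, 25, 250]))
```

The trade-off is extra storage and a slower insert in exchange for queries that never touch the raw images, which suits image data that is mostly inserted and deleted, not modified.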

Reformulation:

Ranking results. Based on similarity distance or on a user-defined criterion.

Statistical queries. Queries such as "how many images of X?" should return the number of such images.

Presentation of results. Results may require synchronization (e.g. for a movie).