Property | Conventional Data | Image Data |
Type | Dynamic | Static |
Size Variability | Similar Size | Variable Size |
Size | Small | Large |
Number of Devices Required | One | Many |
Handling Data | Need to Manipulate Data | Insert/Delete |
Interpretation of Data | One | Multiple: depending on application, image analysis, image interpretation algorithm, domain knowledge, and retrieval time |
Interactivity | Less | More |
Most commercial systems retrieve and search images using keywords associated with them. This approach is considered very limited. We want to be able to select images because they are "similar" to some other image, which adds semantic notions to image retrieval.
1. World-only data: media independent, domain specific
Models the relationship between objects.
2. Image-only data: media specific, domain independent
These images are application independent, so many applications can share them to represent different things. They therefore do not correspond directly to one particular world object. Image-only data includes:
Uninterpreted raw image: e.g. a binary file
Registration information: e.g. compression method, digitization method, date of capture
Interpreted contents (also known as metadata): e.g. image features such as color and texture. Contents can be classified into three types:
* structural: spatial and temporal
* whole image features: encoding of the whole image, e.g. color and texture signatures
* image object features: encoding and identification of regions, e.g. color, shape, texture
Transformation relationships: e.g. a raster image can be translated into a graph (but not vice versa)
3. Annotation data: media specific, domain specific
These are the relationships that link the world application objects with the images and their features. There are many ways to describe an image and its content. The three major categories of descriptive mechanisms are:
Textual descriptions - use keywords to describe the whole image or specific parts of the image.
Semantic descriptions - use higher-level semantic representations (semantic networks or object models).
Content based - use the actual content or high-level representations of the content to classify the image objects.
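The three kinds of data described above can be sketched as a small data model. This is only an illustration under assumptions: the class names, field names, and the `images_of` helper are hypothetical and do not come from any particular image database system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorldObject:
    """World-only data: a domain entity, independent of any medium."""
    name: str
    related_to: List[str] = field(default_factory=list)


@dataclass
class ImageRecord:
    """Image-only data: media specific, shareable across applications."""
    raw: bytes                         # uninterpreted raw image
    registration: Dict[str, str]       # e.g. compression method, capture date
    features: Dict[str, List[float]]   # interpreted contents (metadata)


@dataclass
class Annotation:
    """Annotation data: links one world object to one image."""
    world_object: str
    image_id: str
    keywords: List[str] = field(default_factory=list)


def images_of(name: str, annotations: List[Annotation]) -> List[str]:
    """Return the ids of all images annotated with the given world object."""
    return [a.image_id for a in annotations if a.world_object == name]


# The same image ("img1") is shared by two different world objects.
annotations = [
    Annotation("Babe Ruth", "img1", ["baseball"]),
    Annotation("stadium", "img1"),
    Annotation("Babe Ruth", "img2"),
]
ruth_images = images_of("Babe Ruth", annotations)
```

Keeping the annotation separate from the image record is what lets one image represent different things in different applications.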
Searching, Retrieval, and Browsing of Images and Image Objects
We need to address several query issues: expressivity, formulation, optimization, presentation of results, and reformulation.
Images should be retrievable by:
Exact feature match: only possible with user-generated textual descriptions, using keywords or some sort of picture description language.
Inexact feature match: requires feature extraction; it can also use semantic descriptions.
Structural match: requires spatial and temporal reasoning on visual information.
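As a rough sketch of two of these retrieval modes, the following shows an exact match on keywords and a structural match on a single spatial relation. The bounding-box layout, the `left_of` relation, and all function names are assumptions made for illustration; real structural matching would cover many more spatial and temporal relations.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical region layout: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def exact_match(keywords: Dict[str, Set[str]], query: Set[str]) -> List[str]:
    """Exact feature match: images whose keyword set contains every query term."""
    return sorted(img for img, kws in keywords.items() if query <= kws)


def left_of(a: Box, b: Box) -> bool:
    """True when region a lies entirely to the left of region b."""
    return a[2] <= b[0]


def structural_match(regions: Dict[str, Dict[str, Box]],
                     first: str, second: str) -> List[str]:
    """Structural match: images in which `first` appears left of `second`."""
    return sorted(
        img for img, objs in regions.items()
        if first in objs and second in objs
        and left_of(objs[first], objs[second])
    )


keywords = {
    "img1": {"babe ruth", "baseball"},
    "img2": {"baseball", "stadium"},
}
regions = {
    "img1": {"person": (0, 0, 10, 20), "car": (15, 0, 40, 20)},
    "img2": {"person": (30, 0, 40, 20), "car": (0, 0, 20, 20)},
}
by_keyword = exact_match(keywords, {"baseball"})
by_layout = structural_match(regions, "person", "car")
```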
Formulation and Presentation:
Expression of similarity distance. We have to define what "similar" means and rank the images based on this measure.
Weightings and preferences. How to weight a particular type of image in the query, e.g. "retrieve all the images of Babe Ruth [priority 1] and people [priority 3]"
Compositional queries. e.g. "Retrieve all the sequences that contain X"
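One possible way to express a similarity distance with weightings is sketched below. It assumes each image is described by named feature vectors (for example a color histogram and a texture signature), uses Euclidean distance per feature, and combines the distances with user-supplied numeric weights. The feature names, the weights, and the choice of Euclidean distance are all assumptions for illustration, not a prescribed measure.

```python
import math
from typing import Dict, List, Tuple

# Feature name -> feature vector, e.g. "color" -> histogram (hypothetical layout)
Features = Dict[str, List[float]]


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_distance(query: Features, image: Features,
                      weights: Dict[str, float]) -> float:
    """Sum of per-feature distances, each scaled by its weight."""
    return sum(w * euclidean(query[name], image[name])
               for name, w in weights.items())


def rank(query: Features, database: Dict[str, Features],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (image id, distance) pairs, most similar first."""
    scored = [(img_id, weighted_distance(query, feats, weights))
              for img_id, feats in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


database = {
    "sunset": {"color": [0.8, 0.1, 0.1], "texture": [0.2, 0.8]},
    "forest": {"color": [0.1, 0.8, 0.1], "texture": [0.7, 0.3]},
    "beach": {"color": [0.6, 0.2, 0.2], "texture": [0.3, 0.7]},
}
query = {"color": [0.7, 0.2, 0.1], "texture": [0.2, 0.8]}
weights = {"color": 3.0, "texture": 1.0}  # color matters three times as much
ranking = rank(query, database, weights)
```

A smaller distance means a closer match, so the first entry of `ranking` is the best candidate. Changing the weights changes the ranking without touching the stored features.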
Optimization:
Image transformations take a lot of time, for example during similarity matching. Optimizations help speed up the process.
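One simple optimization, shown here only as an assumed example, is to compute each image's features once and reuse them across comparisons instead of transforming the image every time. The sketch uses Python's standard `functools.lru_cache`; the `color_histogram` extractor and its binning are hypothetical.

```python
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=None)
def color_histogram(pixels: Tuple[int, ...], bins: int = 4) -> Tuple[float, ...]:
    """Hypothetical extractor: normalized histogram of 0-255 intensities.

    The result is cached, so repeated calls with the same pixels
    skip the computation entirely.
    """
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1
    return tuple(c / total for c in counts)


pixels = (0, 10, 200, 255)      # tuples are hashable, so they can be cache keys
first = color_histogram(pixels)   # computed
second = color_histogram(pixels)  # served from the cache
```

A real system would more likely store the extracted features alongside the image at insertion time, but the idea is the same: pay the transformation cost once.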
Reformulation:
Ranking results. Based on similarity distance or user-defined criteria.
Statistical queries. Queries such as "how many images of X?" should return the number of such images.
Presentation of results. Results may require synchronization (e.g. for a movie).
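A statistical query such as "how many images of X?" can be sketched as a count over annotations. The mapping from image ids to annotated subjects and the `count_images` name are assumptions for illustration.

```python
from typing import Dict, Set


def count_images(subjects_by_image: Dict[str, Set[str]], subject: str) -> int:
    """Return how many images are annotated with the given subject."""
    return sum(1 for subjects in subjects_by_image.values()
               if subject in subjects)


subjects_by_image = {
    "img1": {"Babe Ruth", "stadium"},
    "img2": {"Babe Ruth"},
    "img3": {"stadium"},
}
ruth_count = count_images(subjects_by_image, "Babe Ruth")
```

The query returns a number rather than a list of images, which is what distinguishes it from the retrieval queries above.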