ISO/IEC-15938-3 › Information technology - Multimedia content description interface - Part 3: Visual
Included in this current edition are the following subparts:
1ST EDITION CORRIGENDUM 1 - March 15, 2004
FOR 1ST EDITION AMENDMENT 1 SEE - Aug. 1, 2004
FOR 1ST EDITION AMENDMENT 2 SEE - April 1, 2006
FOR 1ST EDITION AMENDMENT 3 SEE - April 1, 2009
FOR 1ST EDITION AMENDMENT 4 SEE - Oct. 15, 2010
Organization of the document
The structure of this document is as follows. Clauses 2-4 specify the terms, abbreviations, symbols and conventions used throughout the document. Clauses 5-11 contain definitions of the description tools standardized by ISO/IEC 15938-3, grouped by the visual features they are associated with: they start with basic structures and containers in Clause 5 and continue through color, texture, shape, motion and localization up to Clause 10. Clause 11 contains the remaining, unclassified items.
Each description tool is described by the following subclauses:
- Syntax: Normative DDL specification of the Ds or DSs.
- Binary Syntax: Normative binary representation of the Ds or DSs.
- Semantics: Normative definition of the semantics of all the components of the corresponding D or DS.
Overview of Visual Description Tools
This part of ISO/IEC 15938 specifies tools for description of visual content, including still images, video and 3D models. These tools are defined by their syntax in DDL and binary representations and semantics associated with the syntactic elements. They enable description of the visual features of the visual material, such as color, texture, shape and motion, as well as localization of the described objects in the image or video sequence. An overview of the visual description tools is shown in Figure 1.
The basic structure description tools include five supporting tools of visual descriptions defined in clauses 6–11. They are categorized into two groups: descriptor containers and basic supporting tools. The former consists of three datatypes: GridLayout, which provides efficient representations of visual features on grids; TimeSeries, which represents temporal arrays of several descriptions; and MultipleView, which describes a 3D object using several pictures captured from different view angles. The latter contains two tools: Spatial2DCoordinateSystem, used to specify the 2D coordinate system, and TemporalInterpolation, which indicates the interpolation method between two samples on a time axis.
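As an informal illustration of what interpolating between two samples on a time axis means, the sketch below performs plain linear interpolation between key points. It is not the normative TemporalInterpolation definition; the function name `interpolate` and its parameters are assumptions made only for this example, and linear interpolation is just one possible method.

```python
from bisect import bisect_right


def interpolate(key_times, key_values, t):
    """Linearly interpolate a value at time t (hypothetical helper).

    key_times must be sorted in increasing order and have the same
    length as key_values. Times outside the range are clamped to the
    first or last key value.
    """
    if not key_times or len(key_times) != len(key_values):
        raise ValueError("key_times and key_values must be non-empty and equal in length")
    if t <= key_times[0]:
        return key_values[0]
    if t >= key_times[-1]:
        return key_values[-1]
    # Index of the first key time strictly greater than t.
    i = bisect_right(key_times, t)
    t0, t1 = key_times[i - 1], key_times[i]
    v0, v1 = key_values[i - 1], key_values[i]
    weight = (t - t0) / (t1 - t0)
    return v0 + weight * (v1 - v0)


# Illustrative data: a value that rises and then falls.
times = [0.0, 2.0, 4.0]
values = [0.0, 10.0, 0.0]
midpoint = interpolate(times, values, 1.0)
```

Storing only key points and reconstructing the samples in between is what makes this kind of representation compact; the choice of interpolation function is the part a description would have to signal.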
The remaining description tools, except for the FaceRecognition descriptor, are associated with visual features and are grouped into five feature categories: Color, Texture, Shape, Motion and Localization.
The color description tools include four color descriptors that represent different aspects of color features: representative colors (DominantColor), color distribution (ScalableColor), and spatial distribution of colors (ColorLayout and ColorStructure). They also include two supporting tools, ColorSpace and ColorQuantization, which are used in DominantColor, and an extension of ScalableColor to a group of frames or pictures (GoFGoPColor). All the color descriptors can be extracted from arbitrarily shaped regions.
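To give a rough feel for the idea of "representative colors", the sketch below quantizes RGB pixels onto a coarse uniform grid and reports the most frequent cells with their share of the region. This is only an illustration under simplifying assumptions: it is not the extraction method of the DominantColor descriptor, and the function name `dominant_colors` and the parameters `levels` and `top` are hypothetical.

```python
from collections import Counter


def dominant_colors(pixels, levels=4, top=3):
    """Return up to `top` (color, fraction) pairs for a list of RGB pixels.

    Each 8-bit channel is split into `levels` equal bins (an assumption
    of this sketch); a bin is represented by its center value.
    """
    if levels <= 0 or 256 % levels != 0:
        raise ValueError("levels must be a positive divisor of 256")
    step = 256 // levels
    # Count how many pixels fall into each quantized color cell.
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    result = []
    for (qr, qg, qb), n in counts.most_common(top):
        center = (qr * step + step // 2,
                  qg * step + step // 2,
                  qb * step + step // 2)
        result.append((center, n / total))
    return result


# Illustrative region: three reddish pixels and one bluish pixel.
region = [(250, 10, 10), (250, 10, 10), (250, 10, 10), (10, 10, 250)]
summary = dominant_colors(region)
```

Because the input is just a list of pixels, the same function works for a rectangular image or for an arbitrarily shaped region: the caller passes only the pixels that belong to the region.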
The texture description tools facilitate browsing (TextureBrowsing) and similarity retrieval (HomogeneousTexture and EdgeHistogram) using the texture of a still or moving image region. All the texture descriptors can be extracted from arbitrarily shaped regions.
The shape description tools include two descriptors that characterize different shape features of a 2D object or region. The RegionShape descriptor captures the distribution of all pixels within a region, and the ContourShape descriptor characterizes the shape properties of the contour of an object. In addition, the Shape3D descriptor provides an intrinsic shape characterization of 3D mesh models.
The motion description tools include four descriptors that characterize various aspects of motion. The CameraMotion descriptor specifies a set of basic camera operations such as panning and tilting. The motion of a key point (pixel) from a moving object or region can be characterized by the MotionTrajectory descriptor. The ParametricMotion descriptor characterizes the evolution of an arbitrarily shaped region over time in terms of a 2D geometric transformation. Finally, the MotionActivity descriptor captures the pace of the motion in the sequence, as perceived by the viewer. All motion descriptors except for CameraMotion can be extracted from arbitrarily shaped regions.
The localization description tools can be used to indicate regions of interest in the spatial (RegionLocator) and spatio-temporal (SpatioTemporalLocator) domains.
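The difference between spatial and spatio-temporal localization can be sketched with two small data types. This is a hypothetical model for illustration only, assuming axis-aligned boxes and a single time interval; the class names `Box` and `TimedBox` are not taken from the standard and do not reflect the actual syntax of RegionLocator or SpatioTemporalLocator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region of interest in pixel coordinates (assumed model)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        # The right and bottom edges are treated as exclusive.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class TimedBox:
    """A box that is valid only during [start, end] seconds (assumed model)."""
    start: float
    end: float
    box: Box

    def contains(self, px, py, t):
        return self.start <= t <= self.end and self.box.contains(px, py)


# Illustrative values: a 100x50 region visible from 1.0 s to 3.5 s.
roi = Box(x=20, y=40, width=100, height=50)
timed_roi = TimedBox(start=1.0, end=3.5, box=roi)
inside_now = timed_roi.contains(30, 50, 2.0)
inside_later = timed_roi.contains(30, 50, 4.0)
```

The spatial type answers "where in the image", while the spatio-temporal type adds "and when in the sequence"; the same point can be inside the region at one time and outside it at another.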
The FaceRecognition descriptor is not associated with any particular visual feature and can be used to describe a human face for applications requiring the matching and retrieval of face images.
To find similar documents by classification:
35.040.40 (Coding of audio, video, multimedia and hypermedia information)
Document Number: ISO/IEC 15938-3:2002
Revision Level: 1ST EDITION
Status: Current
Publication Date: May 15, 2002
Committee Number: ISO/IEC JTC 1/SC 29