Specification of the Continuous Media Markup Language (CMML), Version 1.0

Commonwealth Scientific and Industrial Research Organisation CSIRO, Australia
Locked Bag 17
North Ryde NSW 2113
Australia
+61 2 9325 3141
Silvia.Pfeiffer@csiro.au
http://www.cmis.csiro.au/Silvia.Pfeiffer/

Commonwealth Scientific and Industrial Research Organisation CSIRO, Australia
Locked Bag 17
North Ryde NSW 2113
Australia
+61 2 9325 3133
Conrad.Parker@csiro.au
http://www.cmis.csiro.au/Conrad.Parker/

This specification defines the Continuous Media Markup
Language (CMML), version 1.0, an XML-based markup language for
time-continuous data. It is a sister document to the
specification of the ANNODEX(TM)
annotation and indexing format for time-continuous data. The
CMML is an authoring language for annotating, indexing and
hyperlinking time-continuous data in the ANNODEX(TM) format. Its tags provide for the
creation of structured and unstructured annotations as well as
hyperlinks and addressable named anchor points for fragments of
time-continuous data. The tag names in use in CMML are similar
to the ones in XHTML.
At this point in time, the right to produce derivative works
is not granted to the IETF as the authors are uncertain about
the necessity to create a working group. The specification is
not encumbered by patents. The ANNODEX(TM) format is protected
by a trademark to prevent the use of the term "annodex" for any
related but non-conformant and therefore non-interoperable
technology.
Please note that this document assumes that the reader has a
fluent working knowledge of XML, HTML, XHTML and
the World Wide Web. Knowledge about the ANNODEX(TM) sister document is also
presumed.
Time-continuous data in the ANNODEX(TM) format contains
XML-based annotations and hyperlinking information that enables
it to be browsed by client applications, and crawled and indexed
by search engines. The Continuous Media Markup Language CMML is
a simple markup language for authoring the XML data to be
multiplexed with the time-continuous data given in binary
bitstreams. This process eventually creates ANNODEX(TM) format
bitstreams. The CMML has much in common with XHTML.
The CMML can describe one or several time-continuous media
bitstreams. It is used to create all the tags required for
authoring the annotation information for the ANNODEX(TM)
format. It therefore basically contains the same tags as the
annotation bitstream in ANNODEX(TM) format bitstreams, but also
has some additional tags required for identifying and
synchronising one or several time-continuous bitstreams that
will be multiplexed together for the creation of one coherent
ANNODEX(TM) format bitstream.
The following picture illustrates the multiplexing
activity:
                   |
              Multiplexing
                   |
                   v
----------------------------------------------------------------------
|stream|head|anchor_1| media packets |anchor_2| media packets ...
----------------------------------------------------------------------
The file extension of CMML files is ".cmml". This document
also applies for registration of the MIME type "text/cmml" for
CMML files.
The CMML is technically fully specified through its DTD as
given in the Appendix. The semantic meaning of each of the tags,
their content and their attributes is specified in the following
sections. The Appendix also contains an example of a CMML
(instance) document.
At the beginning of the CMML DTD, several parameter entities
are defined that are used throughout the DTD as data types. This
section gives a brief overview of them and refers to the
relevant standards in which they are defined.
A "URI" is a character string that conforms to the
specification of the Uniform Resource Identifier as defined in
RFC 2396. The currently proposed
temporal URI fragment identifier
specification is supported, too. A URI generally points
to a Web resource.
The "LanguageCode" defines a collection of constant strings
that each identify a specific language as defined in RFC 1766. It is used to provide
internationalisation support. To that end, the i18n entity
draws together a language given by a "LanguageCode" with the
directionality of that language in "dir" given either as ltr
(left-to-right) or rtl (right-to-left).
There are three different time specifications in use in
CMML: "Timestamp", "Playbacktime" and "UTCtime".
A "Timestamp" is generally a name-value pair which defines
a time point. The time point value is interpreted according to
the time scheme given in the name. If the name is omitted, it
defaults to "npt=". The available time specifications stem
from different sources:
"npt" is the "normal play time" as used in the RTSP standard.
"smpte" covers several frame-based time labels as defined
by the Society of Motion Picture and Television Engineers.
Fractional frames are not used, as they are meaningless for
video and ambiguous for audio in drop-frame situations. The
drop-frame algorithms for calculating the exact times can be
found in the mentioned SMPTE standard.
"utc" is Coordinated Universal Time, written as specified in the
ISO 8601 standard.
Thus, the available time schemes are:
"npt=": NPT time with a second or subsecond basis
"smpte-24=": SMPTE time with a 24 fps basis
"smpte-24-drop=": SMPTE time with a 24/1.001 fps basis
"smpte-25=": SMPTE time with a 25 fps basis
"smpte-30=": SMPTE time with a 30 fps basis
"smpte-30-drop=": SMPTE time with a 30/1.001 fps basis
"smpte-50=": SMPTE time with a 50 fps basis
"smpte-60=": SMPTE time with a 60 fps basis
"smpte-60-drop=": SMPTE time with a 60/1.001 fps basis
"clock=": UTC time with a second or subsecond basis
The "Playbacktime" entity is a data type that specifies only a
SMPTE or an NPT time. It is therefore equal to the
Timestamp entity without the UTC specification.
The "UTCtime" entity is a data type that just specifies a
UTC time without an identifier. UTC time is specified as in
the Timestamp entity, but without the "clock=" identifier.
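The name-value structure of a Timestamp described above can be illustrated with a short Python sketch. It is not part of this specification: the names split_timestamp, is_playbacktime and SMPTE_RATES are hypothetical, and the time value after the scheme prefix is passed through unparsed, since its exact syntax is governed by the BNF of the respective scheme.

```python
from fractions import Fraction

# Frame rates implied by the SMPTE scheme names listed above
# (e.g. "24/1.001 fps" is the exact fraction 24000/1001).
SMPTE_RATES = {
    "smpte-24": Fraction(24),
    "smpte-24-drop": Fraction(24000, 1001),
    "smpte-25": Fraction(25),
    "smpte-30": Fraction(30),
    "smpte-30-drop": Fraction(30000, 1001),
    "smpte-50": Fraction(50),
    "smpte-60": Fraction(60),
    "smpte-60-drop": Fraction(60000, 1001),
}
SCHEMES = {"npt", "clock"} | set(SMPTE_RATES)


def split_timestamp(text):
    """Split a Timestamp into (scheme, value); a missing scheme means npt."""
    scheme, separator, value = text.partition("=")
    if not separator:
        # No "name=" prefix was given, so the default scheme applies.
        return "npt", text
    if scheme not in SCHEMES:
        raise ValueError("unknown time scheme: " + scheme)
    return scheme, value


def is_playbacktime(text):
    """A Playbacktime is a Timestamp that is not a UTC ("clock=") time."""
    return split_timestamp(text)[0] != "clock"
```

Keeping the rates as exact fractions avoids rounding the drop-frame rates to floating point values.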
A CMML file is an XML instance document of the CMML DTD. An
example is given in the Appendix. It starts with the usual XML
declaration and the document type declaration (see
http://www.w3.org/TR/REC-xml#sec-prolog-dtd). This is an example
preamble:
After the preamble, the CMML tags follow. A CMML file has a
"cmml" tag as the root element. It embraces all the other tags.
The "cmml" tag encloses at most one "stream" element,
exactly one "head" element, and as many "a" elements as the
document author requires. An "a" element describes a fragment
of the ANNODEX(TM) bitstream that is to be created. The ANNODEX(TM)
bitstream is created by multiplexing the bitstreams given in
the "location" attributes of the "media" tags of the "stream"
element together with the CMML annotations in a
time-synchronous manner, as specified in the ANNODEX(TM) format.
Attributes of the "cmml" element are the usual XML root tag
attributes: an identifier "id" and a namespace "xmlns".
The "stream" element contains information about the input
time-continuous bitstreams that are to be multiplexed together
on authoring the ANNODEX(TM) format bitstreams.
The "timebase" attribute contains a playback time in
seconds associated with the first data packet. All other
times in the CMML file MUST be calculated relative to this
timebase. For example, a timebase of 300 seconds npt for a
video file implies that the first frame is related to a play
time of 300 seconds, and an anchor with a start time of 350
seconds is to be included 50 seconds into the ANNODEX(TM)
bitstream. If no timebase is given, the timebase defaults to
0 npt. The timebase can be given as a SMPTE or NPT time, but not
as a UTC time.
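The timebase arithmetic in the example above amounts to a single subtraction. The following Python sketch shows it under the assumption that all times have already been converted to seconds; the function name offset_in_bitstream is hypothetical.

```python
from fractions import Fraction


def offset_in_bitstream(time_seconds, timebase_seconds=0):
    """Return how far into the bitstream a time lies, relative to the timebase.

    The timebase defaults to 0, matching the default of 0 npt.
    """
    return Fraction(time_seconds) - Fraction(timebase_seconds)
```

With a timebase of 300 seconds, offset_in_bitstream(350, 300) yields 50, as in the example given in the text.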
The "utc" attribute associates a calendar date and a
wall-clock time with the timebase. It therefore provides a
mapping of the timebase to a real-world clock time and is
given as a UTC time. If it is omitted, the start attribute in
the media tag, and the start and end attributes in anchor tags
MUST NOT be specified as UTC times.
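The rule above can be checked mechanically before multiplexing. The sketch below is one possible way to do so and is not part of this specification; the function name check_time_allowed is hypothetical, and UTC times are assumed to be recognisable by their "clock=" prefix.

```python
def check_time_allowed(time_value, stream_utc=None):
    """Return time_value if it may be used, otherwise raise ValueError.

    time_value -- a start or end time, e.g. "npt=350" or a "clock=" time
    stream_utc -- value of the "utc" attribute of the "stream" element,
                  or None if that attribute was omitted
    """
    if time_value.startswith("clock=") and stream_utc is None:
        raise ValueError("UTC time used, but the stream has no utc attribute")
    return time_value
```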
The content model of the "stream" tag then allows an
arbitrary number of input bitstreams. These are described one
by one in the "media" element.
A "media" tag contains information on one of the input
bitstreams for the multiplexing process. The relevant
bitstream (fragment) is referenced through the "location"
attribute. The location is a URI and may thus also contain a
temporal URI fragment specification which narrows down the
input file to that given subpart. That resource is multiplexed
into the ANNODEX(TM) format bitstream starting at the time
given in the "start" attribute and ending at the latest at the
time given in the "end" attribute. The "start" and "end"
attributes are interpreted relative to the timeline of the
ANNODEX(TM) format bitstream.
The "granulerate" attribute contains the base temporal
resolution in Hz of the input bitstream referred to in the
"location" attribute. It depends on the encoding format of the
input bitstream and typically contains the framerate for video
(e.g. 25 frames/sec) and the samplerate for audio (e.g. 44100
samples/sec), but may be any rational number whose integer
denominator is larger than 1 second (e.g. 25 frames per 2
seconds). Each bitstream has its own granulerate dependent on
its specific encoding. This attribute is implied as it can be
determined automatically during the multiplexing process when
the headers of the encoded media bitstream contain this
information. For bitstreams without a header, such as
uncompressed audio, the author of the CMML file can provide
the granulerate to the multiplexer in this attribute.
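Because a granulerate may be a rational number, exact fractions are a natural representation. The following Python sketch assumes, purely for illustration, that the rate is written either as an integer ("25", "44100") or as "numerator/denominator" ("25/2"); the attribute syntax itself is defined by the DTD, and the function names are hypothetical.

```python
from fractions import Fraction


def parse_granulerate(text):
    """Parse a rate such as "25", "44100" or "25/2" into an exact value in Hz."""
    rate = Fraction(text)
    if rate <= 0:
        raise ValueError("granulerate must be positive")
    return rate


def granule_time(granule_index, granulerate):
    """Return the time in seconds of a granule, counting granules from 0."""
    return Fraction(granule_index) / granulerate
```

For example, at "25/2" (25 frames per 2 seconds) granule 25 falls at exactly 2 seconds.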
The "mimetype" attribute specifies the MIME type of the input bitstream referred to
in the "location" attribute. It is optional as the MIME type
can often be derived easily from the file name or file header
of the media source during multiplexing.
The "location" attribute specifies a URI to the input
bitstream. Commonly used URI schemes are "file" and "http".
For specifying temporal subsets of the input bitstream, use
the temporal URI fragment
specification.
The "start" attribute specifies a time in the output
ANNODEX(TM) bitstream at which the media bitstream will be
inserted. This time is specified with respect to the
"timebase" attribute given in the "stream" element.
The "end" attribute specifies a time in the output
ANNODEX(TM) bitstream at which the media bitstream stops at the
latest. This time is also specified with respect to the
"timebase" attribute given in the "stream" element. This
attribute is not required when the full bitstream is used.
The CMML "head" element contains annotation information on
the complete ANNODEX(TM) bitstream, which the CMML file is used
to create. It therefore contains header-type information such as
a title for the bitstream, and meta information describing the
bitstream.
The "head" element is declared as follows:
The "head" tag must contain a "title" tag. It may contain one
"base" tag before or after the "title" tag and any number of
"meta" tags at any position.
The "%i18n;" attribute specifies the base language of the
"head" tag's attribute values.
The "defltlang" and "defltdir" attributes specify the default
language (language and directionality) of the anchor tags.
The value of the "profile" attribute is a space-separated
list of base URIs specifying locations of "meta" tag
schemes. These schemes may be used in the "meta" elements of the
"head" or the "a" tags.
The "title" tag gives a descriptive title for the
ANNODEX(TM) bitstream. The "title" element is declared as
follows:
The "%i18n;" attribute specifies the base language of the
"title" text.
The "base" element defines the base URI of the ANNODEX(TM)
bitstream. All relative URIs of the bitstream get interpreted
relative to this base. The "base" element is empty, but its
attributes contain the base URI. It is declared as follows:
The "href" attribute contains the base URI.
The "meta" element defines structured annotations for the
complete ANNODEX(TM) bitstream. A "meta" element is
empty, but its attributes contain the name-value pairs of a
structured annotation. The "meta" element is declared as
follows:
The "%i18n;" attribute specifies the default language of
the meta attribute and content texts.
The "name" attribute identifies a property name. This
specification does not list legal values for this attribute.
The "content" attribute specifies a property's value. This
specification does not list legal values for this attribute.
The "scheme" attribute names a scheme to be used to
interpret the property's value. The scheme can be located via
the "profile" attribute in the "head" element.
A CMML file typically contains a number of anchors given in
"a" tags. The CMML "a" tag contains information about a fragment
of the ANNODEX(TM) bitstream. This is expressed in a number of
elements and attributes annotating, indexing, and hyperlinking
the fragment. The "start" and "end" attributes are used to give
the insertion time for the anchor into the ANNODEX(TM)
bitstream.
Any number of "meta" and "desc" elements may appear in an
anchor page, but the "meta" elements must all appear first and
en bloc, while the "desc" elements must all appear last and also
en bloc.
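The ordering constraint above (all "meta" elements first, all "desc" elements last) can be verified with the Python standard library. The sketch below is illustrative only: the function name meta_before_desc is hypothetical, and the anchor is assumed to be given as a stand-alone XML fragment without a namespace prefix.

```python
import xml.etree.ElementTree as ET


def meta_before_desc(anchor_xml):
    """Return True if every "meta" child precedes every "desc" child."""
    anchor = ET.fromstring(anchor_xml)
    seen_desc = False
    for child in anchor:
        if child.tag == "desc":
            seen_desc = True
        elif child.tag == "meta" and seen_desc:
            # A "meta" element after a "desc" element breaks the ordering.
            return False
    return True
```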
An "a" element defines a name for the fragment in the "id"
attribute. This name can be used in URIs that point either to
the CMML file or the ANNODEX(TM) bitstream created from it. It
will be used as a fragment identifier and point straight to the
fragment defined by the "a" tag.
The "%i18n;" attribute specifies the default language used by
all the "desc" elements of the "a" tag.
The "track" attribute specifies the track that this anchor
belongs to. An annotation track is a set of "a" pages that
belong together from a semantic point of view. Anchors in the
same track must not overlap temporally. A default track must be
available always. This track is the one a client (such as a Web
browser plugin) will display by default. Other annotation tracks
may be created by the document author to describe a more
specific content. An example use is a separate annotation track
for each speaker in an audio recording of a meeting.
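The requirement that anchors in the same track must not overlap temporally can be tested as sketched below. This is an illustration, not a normative algorithm: anchors are assumed to be given as (track, start, end) tuples with times in seconds, an omitted "end" is represented by None and taken to last until the next anchor on the same track, and the function name overlapping_anchors is hypothetical.

```python
from collections import defaultdict


def overlapping_anchors(anchors):
    """Return pairs of neighbouring anchors on one track that overlap.

    An anchor overlaps its successor on the same track if it has an
    explicit end time that lies after the successor's start time.
    """
    by_track = defaultdict(list)
    for anchor in anchors:
        by_track[anchor[0]].append(anchor)

    clashes = []
    for track_anchors in by_track.values():
        track_anchors.sort(key=lambda a: a[1])
        for current, following in zip(track_anchors, track_anchors[1:]):
            end = current[2]
            if end is not None and end > following[1]:
                clashes.append((current, following))
    return clashes
```

Comparing only neighbouring anchors is sufficient to detect whether any overlap exists, because an anchor that overlaps a later anchor on its track also overlaps its immediate successor.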
The "href" attribute specifies the location of a Web resource
given by a URI. It thus defines a link between the current
fragment and a resource which the author believes to be
connected closely to this fragment's content. This might be an
HTML page, another ANNODEX(TM) bitstream fragment, an image,
etc.
The "hrefdesc" attribute gives a short textual description of
the link specified through the "href" attribute. It explains why
the connection between the current fragment and the destination
URI is made. It may e.g. encourage the viewer to follow the link
to "Get more information on blah". This attribute value can be
specified only if the "href" attribute has been specified.
The "image" attribute specifies the location of an image on
the Web given by a URI. This image should be quite small as it
is the representative image (known as "keyframe") for the
current fragment. This image may be used to visually summarise
the content of the fragment when a link to it is displayed,
e.g. by a search engine or in a table of contents.
The "start" and "end" attributes specify the time range
during which the anchor element is defined. This time range is
specified with respect to the "timebase" and "utc" attributes
given in the "stream" tag. If the "stream" tag does not contain
a "utc" specification, "start" and "end" times are not allowed
to be given in UTC time. "start" is a required attribute because
an anchor without a start time is useless. "end" is optional and
only required where anchors cannot continue on to the following
anchor.
The "meta" element is specified above in the "head"
section. While a "meta" element in the "head" tag provides
meta information for the complete ANNODEX(TM) bitstream, the
"meta" elements in an "a" tag only provide meta information
for the anchor.
The "desc" tag contains a human readable, textual
description (or annotation) of the content of the
fragment. The "desc" element is declared as follows:
To obtain a short text from the "desc" element, as
needed for display in a table of contents or as a caption,
the first few characters of the description will be taken. It
therefore is recommended to place a short meaningful summary
sentence at the beginning of the description when authoring
annotations.
The "%i18n;" attribute specifies the actual language of the
text in the description. For a mixed-language description,
the default language is given in the "%i18n;" attribute of
the "a" tag, and the language actually used in a specific
"desc" tag is given in that tag.
As CMML is an authoring format for ANNODEX(TM) format
bitstreams, there is a simple way to map the annotations and
meta information contained in a CMML instance document to the
annotation bitstreams and header fields of an ANNODEX(TM) format
bitstream.
There is a direct mapping between a CMML "head" element and
an ANNODEX(TM) "head" page as they both contain the same
elements and the same attributes. The additional namespace
attribute "xmlns" in the "head" page of an ANNODEX(TM) format
bitstream will be filled from the "xmlns" attribute of the
"cmml" tag of the CMML file and defaults to the same namespace
default.
There is also a direct mapping between a CMML "a" element and
an ANNODEX(TM) "a" page as they also both contain the same
elements and the same attributes, except for the "start" and
"end" attributes. The "start" attribute tells the multiplexer
that creates the ANNODEX(TM) format bitstream at what time to
insert the "a" page into the bitstream. The "end" attribute (if
present) leads to the creation of an "empty" "a" page on the
same track at the given time in the ANNODEX(TM) format bitstream
unless another "a" page appears on the same track beforehand.
The "empty" "a" page contains no attribute values for any of the
implied attributes and no "meta" or "desc" elements, but has a
copy of the "track" attribute. Again, the "xmlns" attribute is
filled from the "xmlns" attribute of the "cmml" tag of the CMML
file and defaults to the same namespace default.
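The handling of "start" and "end" described above can be sketched as a small scheduling routine. It is an illustration under stated assumptions, not the multiplexer's actual algorithm: anchors are (track, start, end) tuples with times in seconds, None stands for an omitted "end", the function name page_schedule is hypothetical, and an anchor that starts exactly at another anchor's end time is assumed to make the "empty" page unnecessary.

```python
def page_schedule(anchors):
    """Return a time-ordered list of (time, track, kind) page insertions.

    kind is "a" for a regular anchor page and "empty" for the empty
    "a" page that marks the end of an anchor.
    """
    pages = []
    for track, start, end in anchors:
        pages.append((start, track, "a"))
        if end is None:
            continue
        # Assumption: a later anchor on the same track that starts no
        # later than this end time supersedes the empty page.
        superseded = any(
            other_track == track and start < other_start <= end
            for other_track, other_start, _ in anchors
        )
        if not superseded:
            pages.append((end, track, "empty"))
    return sorted(pages)
```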
The information contained in a CMML "stream" element is
partly relevant to authoring only and partly required in
different binary header fields of an ANNODEX(TM) format
bitstream. The "stream" attributes "timebase" and "utc" are
stored in the bos page of the ANNODEX(TM) media mapping
bitstream. Each of the encapsulated media bitstreams is
described by one of the "media" tags in the CMML. Their "id",
"granulerate" and "mimetype" attributes are stored in the bos
page of the respective bitstreams. The other attributes of the
"media" tag are used for authoring only and therefore not mapped
to a field in the ANNODEX(TM) format bitstream.
This section contains the registration information for the
'text/cmml' media type. While this media type is not approved by
the IANA, 'text/x-cmml' may be used to identify CMML instance
documents.
To: ietf-types@iana.org
Subject: Registration of MIME media type 'text/cmml'
MIME media type name: text
MIME subtype name: cmml
Required parameters: none
Optional parameters: charset (as in the text/xml media type).
Encoding Considerations: as appropriate for the charset and
the transport mechanism (see text/xml
media type).
Security considerations: see next section.
Interoperability considerations: CMML is a free specification
that is independent of any media encoding format. It is designed
to provide interoperability with existing XML tools and
systems. Its specification is not patented and can be
implemented by third parties without patent considerations.
Additional information:
Magic numbers: none. However, CMML files start with the XML
preamble (as any XML document)
and will also have the string near the
beginning of the file.
File extension: .cmml
Macintosh File Type Code: "TEXT"
Intended usage: COMMON
Fragment identifiers: Any named element, i.e. element that
contains an "id" attribute, may be referenced through a fragment
identifier of a URI. However, the values of the id attribute of
the anchor tags are the most important ones used for addressing
media fragments. Also, the generic
temporal addressing scheme proposed for standardisation
can be used as a fragment address and then relates to the last
anchor whose start time is just before the given temporal
offset.
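Resolving a temporal offset to an anchor, as described above, is a search over anchor start times. The Python sketch below shows one way to do this; it assumes a list of (start_seconds, anchor_id) pairs for a single track, sorted by start time, and treats an anchor that starts exactly at the offset as the match. The function name anchor_at and the anchor identifiers in the usage example are hypothetical.

```python
import bisect


def anchor_at(anchors, offset):
    """Return the id of the last anchor starting at or before offset.

    anchors -- list of (start_seconds, anchor_id), sorted by start time
    Returns None if the offset lies before the first anchor.
    """
    starts = [start for start, _ in anchors]
    index = bisect.bisect_right(starts, offset) - 1
    if index < 0:
        return None
    return anchors[index][1]
```

For anchors starting at 0, 60 and 180 seconds, an offset of 75 seconds resolves to the anchor that starts at 60 seconds.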
As CMML is a markup language created by using XML, the same
security considerations that apply to XML also apply to CMML.
As the CMML is an authoring language for ANNODEX(TM) format
bitstreams, there is no executable code attached to this
language. The implementation of a multiplexer to actually create
an ANNODEX(TM) bitstream must be careful when handling input
bitstreams, which are binary data.
Extensible Markup Language (XML) 1.0. World Wide Web Consortium, MIT Laboratory for Computer Science (timbl@w3.org, http://www.w3c.org).
HTML 4.01 Specification. World Wide Web Consortium, MIT Laboratory for Computer Science (timbl@w3.org, http://www.w3c.org).
XHTML(TM) 1.0 The Extensible Hyper Text Markup Language. World Wide Web Consortium, MIT Laboratory for Computer Science (timbl@w3.org, http://www.w3c.org).
Uniform Resource Identifiers (URI): Generic Syntax (RFC 2396). World Wide Web Consortium, MIT Laboratory for Computer Science (timbl@w3.org); University of California, Irvine, Department of Information and Computer Science (fielding@ics.uci.edu); Xerox PARC (masinter@parc.xerox.com).
Real Time Streaming Protocol (RTSP). Columbia University, Dept. of Computer Science (schulzrinne@cs.columbia.edu); Netscape Communications Corp. (anup@netscape.com); RealNetworks (robla@real.com).
Tags for the Identification of Languages (RFC 1766). UNINETT (Harald.T.Alvestrand@uninett.no).
Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. Innosoft International, Inc. (ned@innosoft.com); First Virtual Holdings (nsb@nsb.fv.com).
XML Media Types. University of California, Irvine, Department of Information and Computer Science (ejw@ics.uci.edu); Fuji Xerox Information Systems (murata@fxis.fujixerox.co.jp).
SMPTE STANDARD for Television, Audio and Film - Time and Control Code. The Society of Motion Picture and Television Engineers (smpte@smpte.org).
Data elements and interchange formats -- Information interchange -- Representation of dates and times (ISO 8601). International Organization for Standardization (central@iso.org).
Syntax of temporal URI fragment specifications (work in progress). Commonwealth Scientific and Industrial Research Organisation (Silvia.Pfeiffer@csiro.au, Conrad.Parker@csiro.au, http://www.annodex.net).
Specification of the ANNODEX(TM) annotation and indexing format for time-continuous data files, Version 1.0 (work in progress). Commonwealth Scientific and Industrial Research Organisation (Silvia.Pfeiffer@csiro.au, Conrad.Parker@csiro.au, http://www.annodex.net).

The Matrix
There is no spoon: Neo is waiting to see the Oracle in a room
full of children doing seemingly impossible things. One is making
spoons bend through telekinesis. Neo tries to do it himself, but
fails. Spoon boy: "Do not try and bend the spoon. That's impossible.
Instead only try to realize the truth." Neo: "What truth?" Spoon
boy: "There is no spoon." Neo: "There is no spoon?" Spoon boy: "Then
you'll see that it is not the spoon that bends, it is only
yourself." Neo tries again...
Den Löffel gibt es nicht: Neo entdeckt beim Besuch
des Orakels wie unwirklich seine Welt ist. Beim Versuch, einen
Löffel durch Telekinese zu verbiegen, bekommt er von dem Kind den
Rat: "Den Löffel gibt es nicht."

A subpart of a resource covering some temporal interval.
XML tags and their content used to describe a media document.
The task of giving textual descriptions to fragments of media
documents.
The task of identifying index points for media documents or
fragments thereof.
The task of linking from one Web resource to another. If a link
has a fragment offset into the resource, this is sometimes called
deep hyperlinking.
A set of Anchor pages representing semantically correlated
annotations of a time-continuous resource.
A specific file format for storing annotation, hyperlinking, and
indexing information multiplexed together with the time-continuous
data bitstreams they describe.
A sequence of data containing samples of time-continuous data.
ANNODEX: Annotated and indexed bitstream format.
CMML: Continuous Media Markup Language.
DTD: Document Type Definition.
XML: eXtensible Markup Language.
WWW: World Wide Web.
URI: Uniform Resource Identifier.

The authors gratefully acknowledge the contributions of Andre
Pang, Andrew Nesbit, and Simon Lai in developing this standard.