Sophie

Sophie

distrib > Mageia > 1 > i586 > media > core-release > by-pkgid > 1b812d1b0e765bcc0430721ff58676d2 > files > 373

libid3_3.8_3-devel-3.8.3-19.mga1.i586.rpm

$Id: musicmatch.txt,v 1.4 2000/09/09 23:02:37 eldamitri Exp $


                    MusicMatch (TM) tag format description

Status of this document

    This document is a description of a deprecated tagging format.  The
    information contained herein is not a specification; its intent is to
    interpret the format based on hundreds of examples.  It also relies heavily
    on information obtained by others who have done similar investigations.  It
    is not based on any official documentation of the format, as such
    documentation is not publicly available.  Therefore the contents of this
    document may change to adjust for newly-discovered information, but the
    format itself is unlikely to change due to its deprecation.

    Distribution of this document is unlimited.


Abstract

    This document describes the MusicMatch tagging format present in some
    digital audio files.  This format, like other tagging specifications,
    provides a method for storing information about an audio file within itself
    to document its contents.  This format was developed by MusicMatch and
    used exclusively by older versions of Jukebox, their popular, "all-in-one"
    MP3 application.

1.  Table of contents

        Status of this document
        Abstract
   1.   Table of contents
   2.   Introduction
   3.   Conventions in this document
   4.   Tagging format
     4.1.   Header
     4.2.   Image extension
     4.3.   Image binary
     4.4.   Unused
     4.5.   Version information
     4.6.   Audio meta-data
       4.6.1.   Single-line text fields
       4.6.2.   Non-text fields
       4.6.3.   Multi-line text fields
       4.6.4.   Internet addresses
       4.6.5.   Padding
     4.7.   Data offsets
     4.8.   Footer
   5.   Identifying and parsing a MusicMatch tag
   6.   Converting to ID3v2
   7.   Copyright
   8.   References
   9.   Author's Address


2.  Introduction

    The following document describes the structure of the tagging format used
    by MusicMatch (TM) Jukebox, prior to version 4.0 of that application.  This
    program is a so-called "All-In-One" MP3 program and provides a CD-Ripper,
    WAV-to-MP3 converter, database, and MP3 player.

    The MusicMatch tagging format has gone through several incremental
    iterations in its format, although the basic structure has remained fairly
    constant throughout its history.  The various formats of the MusicMatch tag
    have been tightly coupled with the version of Jukebox that created it.  As
    such, this document will refer to the Jukebox version and tagging format
    version interchangeably.

    As of version 4.0, MusicMatch has deprecated the use of this format in
    their own Jukebox application, transitioning instead to ID3v2, an open
    standard for tagging digital audio.  Unfortunately, despite repeated
    requests, MusicMatch has not provided to the public any documents
    describing this format, and the MusicMatch Jukebox is the only
    widely-distributed software application that can read and write these tags.

    As such, this text may not be completely accurate and is surely incomplete,
    but it covers enough to the format to enable one to write robust software
    to find and parse tags in this format.  For example, the id3lib tagging
    library's MusicMatch parsing routines were written solely based on the
    information found in this document.  However, the authors cannot be held
    responsible for any inaccuracies or any harm caused by using this
    information.  One can assume that the specifition is unlikely to change,
    given MusicMatch's own abandonment of the format.  It should also be noted
    that incoporating functionality into applications to write tags in this
    format is discouraged, as the format has been officially deprecated by
    MusicMatch themselves.

3.  Conventions in this document

    This document borrows heavily from specifications written by Martin
    Nillson, author of the ID3v2 tagging standard.  Much of the structure,
    formatting, and other such conventions used in the ID3v2 specifications are
    carried over into this document.  

    Text within "" is a text string exactly as it appears in a tag.  Numbers
    preceded with $ are hexadecimal and numbers preceded with % are binary. $xx
    is used to indicate a byte with unknown content.

4.  Tag overview

    The MusicMatch Tagging Format was designed to store specific types of audio
    meta-data inside the audio file itself.  As the format was used exclusively
    by the MusicMatch Jukebox application, it is used only with MPEG-1/2 layer
    III files encoded with that program.  However, its tagging format is not
    inherently exclusive of other audio formats, and could conceivably be
    used with other types of encodings.

    MusicMatch tags were originally designed to come at the very end of MP3
    files, after all of the MP3 audio frames.  Starting with Jukebox version
    3.1, the application became more ID3-friendly and started placing ID3v1
    tags after the MusicMatch tag as well.  In practice, since very few
    applications outside of the MusicMatch Jukebox are capable of reading and
    understanding this format, it is not unusual to find MusicMatch tags
    "buried" within mp3 files, coming before other types of tagging formats in
    a file, such as Lyrics3 or ID3v2.4.0.  Such "relocations" are not uncommon,
    and therefore any software application that intends to find, read, and
    parse MusicMatch tags should be flexible in this endeavor, despite the
    apparent intentions of the original specification.

    Although various sections of a MusicMatch tag are fixed in length, other
    sections are not, and so tag lengths can vary from one file to another.  A
    valid MusicMatch tag will be at least 8 kilobytes (8192 bytes) in length.
    Those tags with image data will often be much larger.

    The byte-order in 4-byte pointers and multibyte numbers for MusicMatch tags
    is least-significant byte (LSB) first, also known as "little endian".  For
    example, $12345678 is encoded as $78 56 34 12.

    Overall tag structure:

      +-----------------------------+
      |           Header            |
      |    (256 bytes, OPTIONAL)    |
      +-----------------------------+
      |  Image extension (4 bytes)  |
      +-----------------------------+
      |        Image binary         |
      |  (var. length >= 4 bytes)   |
      +-----------------------------+
      |      Unused (4 bytes)       |
      +-----------------------------+
      |  Version info (256 bytes)   |
      +-----------------------------+
      |       Audio meta-data       |
      | (var. length >= 7868 bytes) |
      +-----------------------------+
      |   Data offsets (20 bytes)   |
      +-----------------------------+
      |      Footer (48 bytes)      |
      +-----------------------------+

    This document will describe the various sections of the tag in the order
    listed above (that is, in the sequential order that they appear when
    reading the tag from beginning to end).  However, due to the nature of the
    tag's format, in practice the tag's sections will often be parsed in the
    reverse order.  A robust parsing algorithm will be suggested and described
    later in the document.

4.1.  Header

    An optional tag header often precedes the tag data in a MusicMatch tag.
    Although the rules that determine this header's required presence are
    unknown, the header is usually found in tag versions up to and including
    2.50, and is usually lacking otherwise.  Luckily, its format is rigid and
    therefore its presence is easy to determine.  The data in the header are
    not vital to the correct parsing of the rest of the tag and can thus be
    discarded.  The header is the only optional section in a MusicMatch tag.
    All other sections are required to consider the tag valid.

    The header section is always 256 bytes in length.  It begins with three
    10-byte subsections, and ends with 226 bytes of space ($20) padding.  Each
    of the first three subsections contains an 8-byte ASCII text string
    followed by two bytes of null ($00) padding.

    The first subsection serves as a sync string: its 8-byte string is always
    "18273645".

    The second subsection's 8-byte string is the version of the Xing encoder
    used to encode the mp3 file.  The last four bytes of this string are
    usually '0' ($30).  An example of this string is "1.010000".

    The third and final 10-byte subsection is the version of the MusicMatch
    Jukebox used to encode the mp3 file.  The last four bytes of this string
    are usually '0' ($30).  An example of this string is "2.120000".

      Sync string                "18273645"
      Null padding               $00 00
      Xing encoder version       <8-byte numerical ASCII string>
      Null padding               $00 00
      MusicMatch version         <8-byte numerical ASCII string>
      Null padding               $00 00
      Space padding        226 * $20

4.2.  Image extension

    MusicMatch tags can contain at most one image.  This first required section
    is the extension of the image when saved as a file (for example, "jpg" or
    "bmp").  This section is 4 bytes in length, and the data is padded with
    spaces ($20) if the extension doesn't use all 4 bytes (in practice, 3-byte
    extensions are the most prevalent).  Likewise, tags without images have all
    spaces for this section (4 * $20).

      Picture extension           $xx xx xx xx

4.3.  Image binary

    When an image is present in the tag, the image binary section consists of
    two fields.  The first field is the size of the image data, in bytes.  The
    second is the actual image data.

      Image size                  $xx xx xx xx
      Image data                  <binary data>

    If no image is present, the image binary section consists of exactly four
    null bytes ($00 00 00 00).

4.4.  Unused

    This section is never used, to the best of the author's knowledge.  It is
    always 4 null ($00) bytes.

      Null padding               $00 00 00 00

4.5.  Version information

    This section of the tag has the exact same format as the header.  Unlike
    the header, this section is required for the tag to be considered valid.

      Sync string                "18273645"
      Null padding               $00 00
      Xing encoder version       <8-byte numerical ASCII string>
      Null padding               $00 00
      MusicMatch version         <8-byte numerical ASCII string>
      Null padding               $00 00
      Space padding        226 * $20

4.6.  Audio meta-data

    The audio meta-data is the heart of the MusicMatch tag.  It contains most
    of the pertinent information found in other tagging formats (song title,
    album title, artist, etc.) and some that are unique to this format (mood,
    prefernce, situation).

    In all versions of the MusicMatch format up to and including 3.00, this
    section is always 7868 bytes in length.  All subsequent versions allowed
    three possible lengths for this section: 7936, 8004, and 8132 bytes.  The
    conditions under which a particular length from these three possibilities
    was used is unknown.  In all cases, this section is padded with dashes 
    ($2D) to achieve this constant size.

    Due to the great number of fields in this portion of the tag, they are
    divided amongst the next four sections of the document: single-line text
    fields, non-text fields, multi-line text fields, and internet addresses.
    This clarification is somewhat arbitrary and somewhat inaccurate (some of
    the fields described as "non-text" are indeed ASCII strings).  However, the
    clarification does allow for easier description of the meta-data as a
    whole.  At any rate, the actual fields in this section of the tag appear
    sequentially in the order presented.
    

4.6.1.  Single-line text fields

    The first group entries in this section of the tag are variable-length
    ASCII text strings.  Each of these strings are preceded by a two-byte field
    describing the size of the following string (again, in LSB order).
    Multiple entries in a text field are separated by a semicolon ($3B).  An
    empty (and non-existant) text field is indicated by a size field of 0 ($00
    00).

    The first three of these entries are fairly-self explanatory: song title,
    album title, and artist name.

    The final five entries are a little less common: Genre, Tempo, Mood,
    Situation, and Preference.  These fields can contain any information, but
    do to the interface and default set-up for the Jukebox application, they
    typically are limited to a subset of possibilities.

    The Genre entry differs from the ID3v1 tagging format in that it allows
    a full-text genre description, whereas ID3v1 maps a number to a list of
    genres.  Again, the genre description could be anything, but the interface
    in Jukebox typically limited most users to the standard ID3v1 genres.

    The Tempo entry is intended to describe the general tempo of the song.  The
    Jukebox application provided the following defaults: None, Fast, Pretty
    fast, Moderate, Pretty slow, and Slow.

    The Mood entry describes what type of mood the audio establishes: Typical
    values include the following: None, Wild, Upbeat, Morose, Mellow, Tranquil,
    and Comatose.

    The Situation entry describes in which situation this music is best played.
    Expect the following: None, Dance, Party, Romantic, Dinner, Background, 
    Seasonal, Rave, and Drunken Brawl.

    The Preference entry allows the user to rate the song.  Possible values
    include the following: None, Excellent, Very Good, Good, Fair, Poor, and
    Bad Taste.

      Song title length          $xx xx
      Song title                 <ASCII string>
      Album title length         $xx xx
      Album title                <ASCII string>
      Artist name length         $xx xx
      Artist name                <ASCII string>
      Genre length               $xx xx
      Genre                      <ASCII string>
      Tempo length               $xx xx
      Tempo                      <ASCII string>
      Mood length                $xx xx
      Mood                       <ASCII string>
      Situation length           $xx xx
      Situation                  <ASCII string>
      Preference length          $xx xx
      Preference                 <ASCII string>

4.6.2.  Non-text fields

    The next group of fields is described here as "non-text".  They are 
    probably better described as entries that are auto-created (i.e., not
    entered in by a user), although this isn't entirely accurate, either, as
    the track number field is determined by user input.  At any rate, they've
    been separated to clarify the presentation of the material.

    The "Song duration" entry consists of two fields: a size and text.  The
    text is formatted as "minutes:seconds", and thus the size field is 
    typically 4 ($04 00).

    The only field that is neither a string nor a LSB numerical value is the
    creation date.  It is 8-byte floating-point value.  It can be interpreted
    as a TDateTime in the Delphi programming language, where the integral 
    portion is the number of elapsed days since 1899-12-30, and the mantissa 
    portion represents the fractional portion of that day, where .0 would be 
    midnight, .5 would be noon, and .99999... would be just before midnight 
    of the next day.  In practice, this field is typically unused and will be
    filled with 8 null ($00) bytes.
    
    The next field is the play counter, presumably maintained by the Jukebox
    application.  Most of the time this field is unused, and is typically 0
    ($00 00 00 00).

    The next entry is a size/text combo and represents the original filename
    and path.  As these tags were created almost universally on Windows
    machines, the entries are typically in the form of "C:\path\to\file.mp3".

    The next size/text entry is the album serial number fetched from the online
    CDDB when a track is ripped with MusicMatch.

    The final field is the track number, usually entered automatically when
    ripping, encoding, and tagging the audio off from a CD using CDDB.

      Song duration length       $xx xx
      Song duration              <ASCII string>
      Creation date              <8-byte IEEE-64 float>
      Play counter               $xx xx xx xx
      Original filename length   $xx xx
      Original filename          <ASCII string>
      Serial number length       $xx xx
      Serial number              <ASCII string>
      Track number               $xx xx

4.6.3.  Multi-line text fields

    The next three entries are typically multi-line entries.  All line
    separators use the Windows-standard carriage return ($0D 0A).  As with the
    single-line text entries, the text fields are preceded by LSB size fields
    which indicate their length.

      Notes length               $xx xx
      Notes                      <ASCII string>
      Artist bio length          $xx xx
      Artist bio                 <ASCII string>
      Lyrics length              $xx xx
      Lyrics                     <ASCII string>

4.6.4.  Internet addresses

    The final group of meta-data are internet addresses.  As with other text
    entries, the text fields are preceded by LSB size fields.

      Artist URL length          $xx xx
      Artist URL                 <ASCII string>
      "Buy CD" URL length        $xx xx
      "Buy CD" URL               <ASCII string>
      Artist email length        $xx xx
      Artist email               <ASCII string>

4.6.5.  Padding
 
    The data fields are then followed by 16 null ($00) bytes.  Presumably these
    were intended for (up to 8) future text fields.

    The remainder of this section is padded with '-' ($2D) characters.

4.7.  Data offsets

    This section of the tag was intended to give offsets into the file for each
    of the five major required sections of the tag.  The offsets, however, are
    off by 1; for searching a file where the first position is offset 0, the
    offset given here must be reduced by 1.  In practice, however, these
    offsets can often be invalid, since the data that comes before may be
    increased or reduces (such as when an ID3v2 tag is appended to the file).
    Therefore these offsets are best used to calculate the size of the sections
    by finding the difference of two consecutive offsets.  Obviously, the size
    of the audio meta-data section must be calculated in a different manner.

      Image extension offset     $xx xx xx xx
      Image binary offset        $xx xx xx xx  
      Unused offset              $xx xx xx xx
      Version info offset        $xx xx xx xx
      Audio meta-data offset     $xx xx xx xx

4.8.  Footer

    Unlike the header, the footer is a required section of any MusicMatch tag,
    and checking for its existance is an easy way to determine if a file has a
    MusicMatch tag.  It is always 48 bytes in length.  The first 19 bytes is
    the company name "Brava Software Inc." (Note: it seems that the company
    name has officially changed to MusicMatch, as "Brava Software" is not
    mentioned anywhere on their website), followed by 13 bytes of space ($20)
    padding.  The next 4 bytes is the tag version as a numerical ASCII string
    (e.g., "3.05"), and should match the version string found in the Version
    section and the (optional) header.  This is followed by 12 bytes of space
    ($20) padding.

      Signature                  "Brava Software Inc."
      Space padding         13 * $20
      Tag version                <4-byte numerical ASCII string>
      Space padding         12 * $20

5.   Identifying and parsing a MusicMatch tag

    Finding and parsing a MusicMatch tag is not difficult to do, but due to 
    lack of foresight and questionable design decisions by MusicMatch, care 
    must be taken to ensure it is done correctly.

    <unfinished />

6.   Converting to ID3v2

    As of Jukebox 4.0, MusicMatch has abandoned the MusicMatch tagging format
    in favor of the open standard ID3v2.  The Jukebox application will convert
    old tags to ID3v2 upon request, but as this is a closed application that
    serves a limited number of platforms (currently on Windows and Macintosh),
    having a public specification for performing this mapping is necessary.  As
    ID3v2 can encapsulate all of the information found in the original
    MusicMatch format while being infinitely more flexible, the decision to
    convert shouldn't be a difficult one.

    <unfinished />

7.   Copyright

    Copyright (C) Scott Thomas Haug 2000. All Rights Reserved.

    This document and translations of it may be copied and furnished to others,
    and derivative works that comment on or otherwise explain it or assist in
    its implementation may be prepared, copied, published and distributed, in
    whole or in part, without restriction of any kind, provided that a
    reference to this document is included on all such copies and derivative
    works. However, this document itself may not be modified in any way and
    reissued as the original document.

    The limited permissions granted above are perpetual and will not be
    revoked.

    This document and the information contained herein is provided on an 'AS
    IS' basis and THE AUTHORS DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


8.   References

    [MMTrailer] Peter "The Videoripper" Luijer, 
    'Description of the MusicMatch trailer in MP3 files'

      <url:http://members.xoom.com/videoripper/warez/mmtrailer.txt>

    [ID3v2] Martin Nilsson, 'ID3v2 informal standard'.

      <url:http://www.id3.org/id3v2.3.0.txt>

    [id3lib] Scott Thomas Haug, 'The ID3v1/ID3v2 Tagging Library'
 
      <url:http://www.id3lib.org>

    [ISO-8859-1] ISO/IEC DIS 8859-1.
    '8-bit single-byte coded graphic character sets, Part 1: Latin
    alphabet No. 1.' Technical committee / subcommittee: JTC 1 / SC 2

    [JFIF] 'JPEG File Interchange Format, version 1.02'

      <url:http://www.w3.org/Graphics/JPEG/jfif.txt>

    [MPEG] ISO/IEC 11172-3:1993.
    'Coding of moving pictures and associated audio for digital storage
    media at up to about 1,5 Mbit/s, Part 3: Audio.'
    Technical committee / subcommittee: JTC 1 / SC 29
     and
    ISO/IEC 13818-3:1995
    'Generic coding of moving pictures and associated audio information,
    Part 3: Audio.'
    Technical committee / subcommittee: JTC 1 / SC 29
     and
    ISO/IEC DIS 13818-3
    'Generic coding of moving pictures and associated audio information,
    Part 3: Audio (Revision of ISO/IEC 13818-3:1995)'

    [URL] T. Berners-Lee, L. Masinter & M. McCahill, 'Uniform Resource
    Locators (URL)', RFC 1738, December 1994.

      <url:ftp://ftp.isi.edu/in-notes/rfc1738.txt>

    [UTF-8] F. Yergeau, 'UTF-8, a transformation format of ISO 10646',
    RFC 2279, January 1998.
   
      <url:ftp://ftp.isi.edu/in-notes/rfc2279.txt>


9.   Author's Address

    Written by

      Scott Thomas Haug
      Seattle, WA
      USA