January 29, 2007

deciphering the iTunes .itc file format

iTunes 7 incorporated a technology called CoverFlow, which lets you browse your albums by flipping through images of the album covers. It can download the album images from the internet, but frankly, it's not very good at it: your track metadata has to match the iTunes Music Store's version pretty closely, and I have developed my own system for labeling things (especially remixes). Sometimes it will download the artwork for a compilation instead of the album, or for an album of the same name by a different artist. You can add the artwork manually, but then iTunes inserts it into the audio file, and I don't want the artwork added to each audio file. iTunes also keeps an additional copy of the image in a directory structure designed for fast access, in a proprietary file format, .itc, which appears to be a wrapper around a standard image stream.

Instead, I've been using an older program called Clutter. It automatically downloads images from Amazon.com and if it gets the wrong one, you can trigger a Google image search instead. Ideally, I'd like to mod this program to insert the file downloaded from Amazon into the proprietary iTunes/CoverFlow format and directory structure. To that end, I started trying to decipher the .itc file format. I've seen a few efforts; you may want to examine these three links. Here are my results so far:


ITC file format specification (reverse engineered) version 0.1:

Section 0:
	Filename is two sixteen-character hexadecimal strings joined by a hyphen,
	followed by the filename extension .itc. The first hexadecimal string is always
	the Library Persistent ID (can be seen in the iTunes Music Library.xml file).
	The second hexadecimal string is the Track Persistent ID _if_ the file is located
	in the Local subfolder hierarchy of the Album Artwork folder.
	
	Directory structure: On a Mac, the files will reside in: ~/Music/iTunes/Album Artwork/
	This folder has two subfolders, Download/ and Local/. Inside each of these folders
	will be one folder with the Library Persistent ID as its name. I assume that if
	you have multiple iTunes libraries, there will be additional folders with the
	corresponding Library Persistent IDs.
	
	Then you must traverse three layers of folders, each with a two-digit decimal label.
	The labels correspond, in reverse order, to the last three hexadecimal digits before
	the .itc filename extension, with each digit written as its decimal value (A becomes
	10). e.g., if your downloaded .itc file ends with A01.itc, it will reside in:
	~/Music/iTunes/Album Artwork/Download/"Library Persistent ID"/01/00/10/fooA01.itc
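
The folder rule above can be sketched in a few lines of Python. This is only an illustration of the naming scheme as I have observed it; the function name `itc_relative_path` and the sample IDs are made up for the example, and the result is relative to the Album Artwork folder.

```python
from pathlib import PurePosixPath


def itc_relative_path(filename: str, subfolder: str = "Download") -> PurePosixPath:
    """Return the path of an .itc file relative to 'Album Artwork/'.

    Hypothetical helper: it assumes the filename follows the
    '<library id>-<second id>.itc' pattern described in Section 0.
    """
    stem = filename[:-len(".itc")] if filename.endswith(".itc") else filename
    library_id = stem.split("-")[0]
    # Take the last three hex digits, read them right to left, and write
    # each one as a two-digit decimal folder name (A -> 10, F -> 15).
    folders = [f"{int(digit, 16):02d}" for digit in reversed(stem[-3:])]
    return PurePosixPath(subfolder, library_id, *folders, filename)


print(itc_relative_path("0123456789ABCDEF-FEDCBA9876543A01.itc"))
```

For a filename ending in A01.itc this yields the folders 01/00/10, matching the example above.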

Section 1:
	bytes 1-4 (unsigned 32-bit integer, big-endian): self-describes the length of section 1.
		In all cases seen so far, it is 00 00 01 1C, or 284 bytes.
		
	bytes 5-8 (chars): purpose unknown.
		Spells out "itch" (itc header?).

	bytes 9-24: purpose unknown.
		In all cases seen so far, it is 00 00 00 02 00 00 00 02 00 00 00 02 00 00 00 00

	bytes 25-28 (chars): purpose unknown.
		Spells out "artw" (artwork?).

	bytes 29-284: purpose unknown.
		In all cases seen so far, it is 256 consecutive null bytes (00).
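
As a rough sketch, Section 1 can be read with Python's `struct` module. The function name `parse_itc_header` and the dictionary keys are my own choices for illustration, and the byte positions are simply the ones listed above shifted to zero-based indexing (byte 1 in these notes is `data[0]`).

```python
import struct


def parse_itc_header(data: bytes) -> dict:
    """Read the fields of Section 1 as laid out in these notes (hypothetical helper)."""
    # bytes 1-4: big-endian section length; bytes 5-8: the "itch" tag.
    length, tag = struct.unpack_from(">I4s", data, 0)
    if tag != b"itch":
        raise ValueError(f"not an .itc header: tag is {tag!r}")
    return {
        "length": length,
        "tag": tag,
        "unknown": data[8:24],       # bytes 9-24, purpose unknown
        "subtype": data[24:28],      # bytes 25-28, "artw" in all files seen
        # bytes 29 onward should be null padding up to the section length
        "padding_is_null": not any(data[28:length]),
    }
```

Reading the length from the file, rather than hard-coding 284, keeps the sketch usable if a file with a different header size ever turns up.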

Section 2:
	bytes 285-288 (unsigned 32-bit integer): variable that self-describes the length of
		the remainder of the file, including these four bytes.
		
		In all cases seen to date, equals a number that is 284 less than the total .itc
		file size in bytes (section 1 has always been 284 bytes).

	bytes 289-292 (chars): purpose unknown.
		Spells out "item".

	bytes 293-296 (unsigned 32-bit integer): variable that self-describes the offset to
		the beginning of the image stream from the beginning of section 2.
		
		In most cases seen it is 00 00 00 D0 (208 bytes), although some reports on the
		internet have shown other values.
		See http://www.waldoland.com/dev/Articles/ITCFileFormat.aspx.

	bytes 297-312: purpose unknown.

	bytes 313-320: This encodes the Library Persistent ID.

	bytes 321-328: This encodes the second hexadecimal string of the filename.
		Often this is the Track Persistent ID.

	bytes 329-332: purpose unknown, but corresponds to whether file was downloaded or local.
		In all cases seen to date, spells out "down" or "locl", and this does appear to
		correspond to which subfolder it is in.

	bytes 333-336: file format indicator?
		In many files, this is pretty uninformative, but in at least one file that I have
		examined, these bytes spell "PNGf" and the data is in fact in PNG format.

	bytes 337-340: purpose unknown.

	bytes 341-344 (unsigned 32-bit integer): describes the width of the image in pixels

	bytes 345-348 (unsigned 32-bit integer): describes the height of the image in pixels

	bytes 349-360: purpose unknown.
	
	bytes 361-488: purpose unknown.
		In most cases seen so far, this is a string of null bytes.

	bytes 489-492: purpose unknown.
		In all cases viewed, spells "data". Probably used as the header end marker.
	
	bytes 493-???: In the one instance I have seen where bytes 293-296 gave a value greater
		than 208 bytes, there were sufficient null bytes here to fill out the value.

Section 3:
	Standard image stream.
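
Putting Sections 1 through 3 together, here is a minimal Python sketch that pulls the image stream out of an .itc file. It assumes the layout above holds (big-endian integers, fixed field positions inside section 2), and it assumes the two ID fields are the raw bytes of the hexadecimal strings, which I have not verified. The name `extract_image` and the dictionary keys are illustrative only.

```python
import struct


def extract_image(data: bytes) -> dict:
    """Split an .itc file into its known header fields and the image stream."""
    # Section 1 self-describes its length, so section 2 starts right after it.
    item = struct.unpack_from(">I", data, 0)[0]
    # Section 2: total length, "item" tag, offset from section start to the image.
    item_len, tag, image_offset = struct.unpack_from(">I4sI", data, item)
    if tag != b"item":
        raise ValueError(f"expected 'item' tag, found {tag!r}")
    # bytes 329-348: source ("down"/"locl"), format, 4 unknown bytes, width, height.
    source, fmt, width, height = struct.unpack_from(">4s4s4xII", data, item + 44)
    return {
        # Assumption: the IDs are stored as raw bytes (bytes 313-320 and 321-328).
        "library_id": data[item + 28:item + 36].hex().upper(),
        "second_id": data[item + 36:item + 44].hex().upper(),
        "source": source,
        "format": fmt,
        "width": width,
        "height": height,
        # Section 3: everything from the image offset to the end of section 2.
        "image": data[item + image_offset:item + item_len],
    }
```

Using the stored offset instead of a fixed 208 bytes should also cope with the files that carry extra null padding before the image.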

Feedback:

Comments:

  1. I've been trying to add transparency to some artwork, for fun. When I drag the image in (I've tried .tif and .png), it displays beautifully. When I close iTunes and reopen the image appears without transparency. I'm wondering if playing around with bytes 333-336 or 349-360 could indicate alpha channel? Any thoughts?

  2. I wouldn't think that transparency would be coded for in the .itc header, as the remainder of the file is a standard image stream. My guess is that it's a quirk of iTunes' rendering; I'd have to poke around at it for myself.

  3. Thanks for the response. That would be an interesting quirk, considering that the image honors the transparency when loaded into memory. I still suspect either the .itc file itself, or something that happens in the transaction between the two. I'll be very interested to see what you discover.

  4. This format looks very much like the atom format that the .m4a files use.

    You can see that there is a 4 byte length, followed by a 4 byte "type". If the length value is "1", then the next 8 bytes are length.

    The data inside the atom will probably not change too much, so you can use the above to find the item atom.

  5. nice work!
    I am wondering whether there is a way to find out which .itc file corresponds to an iTunes track. If I know the persistent track ID and persistent library ID, how do I find the artwork if it is located in the download folder?
    Thanks for your help
    Daniel
