11.2 Writable CD Formats
The physical and logical format used by
writable CDs is defined in the rainbow books described in Chapter 10. The following sections provide an overview of
how data is physically and logically stored on writable CDs. For
further detail, refer to the rainbow books.
CD-R discs are manufactured with a pregroove
track that is 600 nanometers (nm) wide with a 1,600 nm pitch. The
pregroove includes an impressed timing wobble of
±30 nm radial excursion at 22.05 kHz, with an FM carrier
modulated at 1 kHz superimposed on the pregroove. This modulation
carries an absolute clock signal (called absolute time in
pregroove, or ATIP) that gives an absolute location
reference for any sector on the CD-R disc. Absolute addresses on the
CD-R disc are specified in the form MM:SS:FF (minutes, seconds, and
frames, at 75 frames per second) using ATIP information.
Audio CDs are addressable in this manner with resolution of 1 second
(75 sectors). Data CDs are addressable to the individual sector
level.
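The relationship between a time-based address and a sector count can be sketched in a few lines of Python. This is a minimal illustration, not drive firmware: it assumes addresses are given as minutes, seconds, and frames, with 75 frames (sectors) per second as stated above. The function names are hypothetical.

```python
FRAMES_PER_SECOND = 75  # one frame here corresponds to one sector


def msf_to_frames(minutes: int, seconds: int, frames: int) -> int:
    """Convert a minutes:seconds:frames address to an absolute frame count."""
    if minutes < 0 or not 0 <= seconds < 60 or not 0 <= frames < FRAMES_PER_SECOND:
        raise ValueError("address component out of range")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total_frames: int) -> tuple[int, int, int]:
    """Convert an absolute frame count back to (minutes, seconds, frames)."""
    if total_frames < 0:
        raise ValueError("frame count must be non-negative")
    total_seconds, frames = divmod(total_frames, FRAMES_PER_SECOND)
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds, frames
```

For example, `msf_to_frames(0, 1, 0)` yields the 75 sectors that make up one second of addressing resolution on an audio CD.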
11.2.1 Physical Formats
Because they must be readable in a
standard CD-ROM drive or CD player, writable CDs use a physical
format nearly identical to pressed CDs. The dimensions of a CD are
120.00mm in diameter (60.00mm radius) with a 15.00mm diameter central
hole that accommodates the rotating center spindle of the drive.
Beginning at the edge of the center hole (radius 7.50mm) and
proceeding outward, a CD-R disc is divided into the following
areas:
- Clamping Area
-
The
Clamping Area is that portion of the disc that
the drive spindle grasps to rotate the disc. On a pressed CD, this
area extends from radius 7.50mm to 23.00mm. On a writable CD, this
area occupies radius 7.50mm to 22.35mm.
- System Use Area
-
The System Use
Area (SUA) is present only on
writable discs, occupies radius 22.35mm to 23.00mm, and can be
thought of as equivalent to the boot sector of a hard disk. The SUA
contains data that tells a CD drive or player what kind of
information is stored on the disc, where it is located, and what
format it uses. The SUA is inside the radius readable by standard
CD-ROM drives and CD players, so only CD recorders can read and write
to this area. The SUA is divided into two subareas:
- Optimal Power Calibration Area
-
The Optimal
Power Calibration Area (OPCA), often
called the Power Calibration Area
(PCA) for short, is used by the CD writer as a
testing area to decide the best write schema to
use when writing to that disc. Each time you insert a disc into a
CD-R drive, the drive fires its writing laser at the PCA to calibrate
that disc against the drive. Each such calibration uses one ATIP
frame. Only 99 PCA ATIP frames are available, which limits a CD-R
disc to 99 or fewer recording sessions.
Many variables determine how the drive should best write to that
disc—the type of dye and reflective backing material the disc
uses, the proposed write speed, the firmware level of the drive, and
so on. From this calibration testing, the drive decides the power
level to use when writing, and whether to use a short write schema
(typical for cyanine-based discs) or a long write schema (typical for
phthalocyanine- and azo-based discs). The PCA begins at radius 22.35mm
(ATIP -00:00:36 relative to the 23.00mm beginning of the Lead-in Area
described later in this section).
- Program Memory Area
-
The Program Memory
Area (PMA) begins where the PCA ends,
and extends to the beginning of the Lead-in Area at radius 23.00mm.
The PMA is used to store a temporary TOC until the disc is
finalized or closed.
Closing a disc writes the temporary TOC stored in the PMA to the
Lead-in Area. That makes the TOC (and therefore the disc) readable by
a CD-ROM drive or CD player, but also means that the disc can no
longer be written to by a CD recorder. The PMA can store location
information for up to 99 track numbers, including the start and stop
times for each track (for audio) or the sector addresses for data.
- Information Area
-
The Information
Area (IA) occupies a width of 35.00mm
to 35.50mm, beginning at radius 23.00mm and ending between radius
58.00mm and 58.50mm. This area provides the general storage space to
which user data is written. The IA is the only area of the CD that is
visible to standard CD-ROM drives and CD players, and includes the
following subareas:
- Lead-in Area
-
The
Lead-in Area occupies radius 23.0mm to 25.0mm on
both pressed and writable CDs. This area contains digital silence in
the main channel, as well as control information in various subcode
channels that can be used to provide additional information to the
drive or reader about the content of the disc. The most important of
the subcode channel data is the TOC for the disc, which is stored in
the Q-channel. The length of the Lead-in Area is determined by the
space required to store TOC entries for up to 99 tracks that may
potentially be written to the Program Area.
A CD has a main data channel—which stores audio and/or computer
data—and eight interleaved subcode channels, designated P
through W, that can store supplemental control data that can be read
by CD-ROM drives and CD players. When the CD format was originally
designed, it was intended that the main channel would contain only
data and that subcode channels would be used to store administrative
information. Nowadays, such supplemental information is usually
encoded within the main data channel, and the only subchannels that
are generally used are the P channel, which specifies the start and
end of each track, and the Q channel, which stores the TOC, the track
type/catalog number, and the timecodes (in minutes, seconds, and frames) used
to locate data on the disc. Subchannels R through W were formerly
sometimes used to store graphics and other supplemental data, but are
now seldom used. The DVD specification eliminates subchannel coding
as superfluous.
If you've ever wondered why a CD-R
disc that has been written to but not closed can be read in a CD
recorder but not in a standard CD-ROM drive or CD player, this is
why. Standard readers look for the TOC in the Lead-in Area, where it
has not yet been written for a disc that is not yet closed. CD
recorders can read the temporary TOC stored in the PMA, which allows
them to read that disc. The PMA is invisible to standard CD-ROM
drives and CD players, so as far as they're
concerned, that disc has no TOC.
- Program Area
-
The
Program Area (PA) occupies
a width of 33.00mm to 33.50mm, beginning at radius 25.00mm and ending
between radius 58.00mm and 58.50mm. The PA is where actual user data
(audio or computer data) is stored. The PA varies in capacity
according to the CD-R disc you use. Discs are available that store 63
minutes of audio (which corresponds to about 550 MB of data), 74
minutes (~650 MB), and 80 minutes (~700 MB). Different brands of
discs also have minor variations from nominal capacity. Some
nominally 74-minute discs, for example, can store as much as 76.5
minutes.
- Lead-out Area
-
The Lead-out Area occupies a width of 0.50mm to
1.00mm, which begins between radius 58.00mm and 58.50mm, and ends
between radius 59.00mm and 59.50mm. The Lead-out Area is created when
the disc is closed, and defines the end of the Information Area.
- Edge
-
The remaining 0.50mm to 1.00mm at the outer edge of the disc is
unused. This area has no formal name that we know of, and exists
simply to protect the outer portion of the track from
damage.
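The Program Area capacities quoted above can be approximated from playing time alone. The sketch below assumes 75 sectors per second and 2,048 bytes (2 KB) of user data per data sector, and reports megabytes as multiples of 1,048,576 bytes. These constants and the function name are assumptions for illustration. Real discs vary slightly from nominal, as noted above.

```python
SECTORS_PER_SECOND = 75
USER_BYTES_PER_SECTOR = 2048  # assumption: ordinary data sectors carrying 2 KB each
BYTES_PER_MB = 1024 * 1024    # assumption: "MB" means 1,048,576 bytes


def data_capacity_mb(minutes: float) -> float:
    """Approximate data capacity, in MB, of a disc rated for `minutes` of audio."""
    sectors = minutes * 60 * SECTORS_PER_SECOND
    return sectors * USER_BYTES_PER_SECTOR / BYTES_PER_MB


for minutes in (63, 74, 80):
    print(f"{minutes} minutes: about {data_capacity_mb(minutes):.0f} MB")
```

Under these assumptions the results for 74-minute and 80-minute discs land close to the nominal 650 MB and 700 MB figures.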
The preceding assumes that the data on the disc exists as one
session, which is nearly always true for commercially pressed CDs, as
well as for writable CDs produced using Disc-at-Once recording
(described in a later section). But Orange Book defines a concept
called multisession for CD-R discs.
With multisession recording, the overall disc layout remains the
same. As with a single-session disc, a multisession disc contains a
Lead-in Area, a Program Area, and a Lead-out Area. The difference is
that the Program Area on a multisession disc stores more than one
session, each of which contains its own session-based Lead-in Area,
Program Area, and Lead-out Area.
Like the disc itself, a session can be opened, written to, and
closed. When a session is closed, that session can no longer be
written to, but additional sessions can be added to the disc. In
fact, closing a session on a multisession disc automatically opens a
new session to which additional data can be written. Closing the
session writes the session TOC to the PMA. This session TOC includes
pointers to the start of the session Program Area for the new session
and to the start time of the last-used (outermost) Lead-out Area.
Closing the session does not close the disc, however, which means
that until the disc itself is closed, sessions on a multisession disc
can be read only by a CD recorder (which can read the temporary TOC
in the PMA) and by some recent CD-ROM drives. When the disc itself is
closed, all sessions are closed and the temporary TOC is written to
the Lead-in Area, allowing the disc to be read in any CD-ROM drive
and most CD players.
Although the PMA makes provision for 99 tracks or sessions, in
practice the number of sessions that can be recorded on a CD-R disc
is much lower because of the overhead required for each session. When
writing multiple sessions to a disc, the Lead-in Area for each
session occupies 4,500 sectors (60 seconds or 9,000 KB). The Lead-out
Area for the first session occupies 6,750 sectors (90 seconds or
13,500 KB). The Lead-out Area for the second and subsequent sessions
occupies 2,250 sectors (30 seconds or 4,500 KB).
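The per-session overhead figures above are easy to total. The following sketch uses only the sector counts given in the text and the 2 KB per sector they imply. The function names are hypothetical.

```python
SECTOR_KB = 2                   # implied by 4,500 sectors = 9,000 KB
LEAD_IN_SECTORS = 4500          # every session
FIRST_LEAD_OUT_SECTORS = 6750   # first session only
LATER_LEAD_OUT_SECTORS = 2250   # second and subsequent sessions


def session_overhead_sectors(session_number: int) -> int:
    """Lead-in plus lead-out sectors for the given session (numbered from 1)."""
    if session_number < 1:
        raise ValueError("sessions are numbered from 1")
    lead_out = FIRST_LEAD_OUT_SECTORS if session_number == 1 else LATER_LEAD_OUT_SECTORS
    return LEAD_IN_SECTORS + lead_out


def total_overhead_kb(session_count: int) -> int:
    """Cumulative overhead, in KB, after writing `session_count` sessions."""
    sectors = sum(session_overhead_sectors(n) for n in range(1, session_count + 1))
    return sectors * SECTOR_KB
```

The first session therefore costs 22,500 KB and each later session 13,500 KB, before any user data is counted.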
11.2.2 Logical Formats
The logical format of a CD specifies
how data is arranged on the CD, and largely determines how data may
be structured on the disc and what operating systems will be able to
access it. CDs commonly use one of the logical formats described in
the following sections.
11.2.2.1 ISO-9660
Most
data CDs use the ISO-9660 format or one of its
variants. ISO-9660 is based on the de facto standard High Sierra
format that was developed by the CD-ROM industry as a cooperative
effort because of the lack of formal standards that then existed for
writing data to CDs. In the days before High Sierra came into use, it
was quite common to find that you could not read the data on a
particular CD-ROM because that CD was incompatible with your
software.
The primary purpose of ISO-9660, which was adopted in 1988, was to
standardize a common logical data format for data CDs and, at the
same time, to facilitate data exchange among different computing
platforms. As a least-common-denominator format, the original
ISO-9660 format is feature-poor because it supports only features
that are common across many platforms. For example, the MS-DOS 8.3
filenaming convention limited ISO-9660 to using 8.3 filenames.
At the time ISO-9660 was adopted, these limitations were not much of
a problem. Most people ran either MS-DOS or a Mac using floppy disks
or small hard disks, and the limitations of ISO-9660 were not onerous
in those environments. But the world soon changed, and the strict
limits enforced by ISO-9660 became a problem, particularly for those
who wanted to use deeply nested directories and long filenames.
Accordingly, the ISO-9660 specification was expanded to include three
ISO-9660 Interchange Levels for naming files and
directories on disc. From most to least restrictive, these include:
- ISO-9660 Level 1
-
ISO-9660 Level 1 is the least-common-denominator
level, developed to accommodate DOS filename limitations. Each file
must be written to disc as a single, continuous stream of bytes,
called an extent. Files may not be fragmented or
interleaved. Filenames may contain from one to eight d-characters
(see following section). Filename extensions may contain from zero to
three d-characters. Directory names may contain from one to eight
d-characters, and may not have an extension.
- ISO-9660 Level 2
-
ISO-9660 Level 2 also requires that files be
written to disc as a single extent, but filenames may be up to 255
d-characters long, with an extension from zero to three d-characters.
ISO-9660 Level 2 discs are unreadable by some operating systems,
notably DOS.
- ISO-9660 Level 3
-
ISO-9660 Level 3 allows a file to be written in
multiple extents, and so is used for packet writing. Filenames may be
up to 255 characters long, with the same limitations as ISO-9660
Level 2.
Strictly interpreted, ISO-9660 filenames must end with a semicolon
followed by the version number—e.g.,
FILENAME.TXT;1. Most operating systems ignore
these final two characters when they access files or display
directory listings. Versions of the Macintosh OS prior to 7.5 and
some versions of Unix do not suppress the semicolon and version
number, which causes problems if they attempt to access
FILENAME.TXT rather than the actual filename of
FILENAME.TXT;1.
The ISO-9660
levels differ significantly in which characters are legal. In
ISO-9660-speak, these characters are designated as follows:
- d-characters
-
For strict compliance with ISO-9660 Level 1 file and directory naming
conventions, only this character set may be used (and only in 8.3
format). d-characters include uppercase A through Z, digits 0 through
9, and the underscore character.
- a-characters
-
The character set usable for ISO Volume Descriptors (discussed next).
a-characters include all d-characters as well as the following
symbols: space; comma; semicolon; colon; period; question mark;
exclamation point; right and left parentheses; single and double
quotes; greater-than and less-than symbols; percent; ampersand;
equals; asterisk; plus and minus (hyphen) symbols; and forward slash.
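The Level 1 naming rules and the d-character set described above translate directly into a pattern check. This is a minimal sketch based only on the rules stated in this section, not a full ISO-9660 validator. The function names are hypothetical, and a trailing semicolon plus version number is stripped before checking.

```python
import re

# d-characters: uppercase A-Z, digits 0-9, and underscore.
_LEVEL1_FILE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{0,3})?")
_LEVEL1_DIR = re.compile(r"[A-Z0-9_]{1,8}")


def strip_version(name: str) -> str:
    """Remove a trailing ';<version number>' suffix such as ';1', if present."""
    base, separator, version = name.rpartition(";")
    return base if separator and version.isdigit() else name


def is_level1_filename(name: str) -> bool:
    """True if `name` fits 8.3 format using only d-characters."""
    return _LEVEL1_FILE.fullmatch(strip_version(name)) is not None


def is_level1_dirname(name: str) -> bool:
    """True if `name` is one to eight d-characters with no extension."""
    return _LEVEL1_DIR.fullmatch(name) is not None
```

Under this check, `FILENAME.TXT;1` passes, while `readme.txt` (lowercase) and `LONGFILENAME.TXT` (name longer than eight characters) do not.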
ISO-9660 Volume Descriptors are optional
information fields recorded at the beginning of the data area on the
disc. Volume Descriptors were originally intended for use by CD
publishers, but may be used by anyone who creates an ISO-9660 disc,
assuming the mastering software supports assigning ISO Volume
Descriptors (some don't, or support only some of the
available volume descriptors). ISO-9660 Volume Descriptors include
the following, with allowable sizes in parentheses:
- System Name
-
The operating system for which the disc is intended (0 to 32
a-characters).
- Volume Name
-
The disc name, displayed by the OS when the disc is mounted (0 to 32
a-characters).
- Volume Set Name
-
Used in multidisc sets to assign a common group name to each disc in
the set (0 to 32 d-characters).
- Publisher's Name
-
The publisher of the disc (0 to 128 a-characters).
- Data Preparer's Name
-
The author of the disc content (0 to 128 a-characters).
- Application Name
-
The name of the program, if any, needed to access data on the disc (0
to 128 a-characters).
- Copyright File Name
-
Points to a file (which, if present, must reside in the root
directory of the disc) that contains copyright information (maximum
8.3 d-characters).
- Abstract File Name
-
Points to a file (which, if present, must reside in the root
directory of the disc) that contains text describing the contents of
the disc (maximum 8.3 d-characters).
- Bibliographic File Name
-
Points to a file (which, if present, may reside in any directory on
the disc) that contains bibliographic information, such as ISBN
number (maximum 8.3 d-characters).
- Date Fields
-
Four Volume Descriptor fields exist for dates: Creation Date;
Modification Date; Expiration Date; and Effective Date. Each of these
fields, if present, stores a date and time in the following format,
with size given in bytes in parentheses: Year (4); Month (2); Day
(2); Hour (2); Minute (2); Second (2); Hundredths of a second (2);
Timezone (1 byte, signed integer; specifies the number of 15-minute
increments from UTC, from -48 West to +52 East).
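The 17-byte date layout just described can be decoded as follows. This sketch assumes that the first 16 bytes are ASCII digits and that an all-zero digit field means the date was not recorded. Both are assumptions for illustration, as is the function name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_volume_date(raw: bytes) -> Optional[datetime]:
    """Decode a 17-byte Volume Descriptor date field into an aware datetime."""
    if len(raw) != 17:
        raise ValueError("a Volume Descriptor date field is 17 bytes long")
    digits = raw[:16].decode("ascii")
    if not digits.isdigit():
        raise ValueError("the first 16 bytes must be ASCII digits")
    if int(digits) == 0:
        return None  # assumption: all zeros means "not specified"

    # Final byte: signed count of 15-minute increments from UTC.
    offset_units = int.from_bytes(raw[16:17], "big", signed=True)
    if not -48 <= offset_units <= 52:
        raise ValueError("timezone offset out of range")

    return datetime(
        year=int(digits[0:4]),
        month=int(digits[4:6]),
        day=int(digits[6:8]),
        hour=int(digits[8:10]),
        minute=int(digits[10:12]),
        second=int(digits[12:14]),
        microsecond=int(digits[14:16]) * 10_000,  # hundredths of a second
        tzinfo=timezone(timedelta(minutes=15 * offset_units)),
    )
```

A final byte of 0xEC, for instance, is -20 as a signed integer, which corresponds to five hours west of UTC.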
11.2.2.2 ISO-9660 Variants
The
very real limitations of ISO-9660 formatted discs gave rise to
several alternative formats, all of which were based on ISO-9660:
- Rock Ridge
-
The
Rock Ridge format is an extension of the
ISO-9660 format, intended for use on Unix systems, which have much
more liberal restrictions on the length of and characters used in
filenames and directory names, as well as the depth of directories.
Using Rock Ridge allows a CD to support long mixed-case filenames,
symbolic links, and other conventions common to Unix systems.
Although full Rock Ridge support is available only on Unix systems, a
system running MS-DOS, Windows, or the Mac OS can still access the
data on a Rock Ridge disc, but not the long filenames and other
extended information. The Rock Ridge standard is available at
ftp://ftp.ymi.com/pub/rockridge
if you want to learn more about it.
- Romeo
-
The
Romeo format is an obsolete extension to
ISO-9660, developed by Adaptec as a stopgap measure for early
versions of its EasyCD premastering software. The raison
d'être for the cutely named
Romeo format was that Windows NT 3.51 did not support the proprietary
Microsoft Joliet format, described next. Romeo supports filenames of
up to 128 characters, including spaces. However, unlike Joliet, Romeo
supports neither the Unicode character set nor associated short
(MS-DOS 8.3) filenames. Romeo-formatted discs can be read under
Windows NT 3.51 and 4.0, Windows 98/SE/Me, and Windows 2000/XP.
Because there is no associated short filename, Romeo-formatted discs
cannot be read under MS-DOS. Romeo-formatted discs can be read on a
Macintosh to the extent that they do not use filenames that exceed 31
characters. The Romeo format was essentially overtaken by events, was
seldom used even when current, and is almost never encountered today.
- Joliet
-
Joliet is an
extension of ISO-9660, developed by Microsoft to allow CDs to support
long filenames, the Unicode character set, and associated short
(MS-DOS 8.3) filenames. Joliet allows filenames up to 64 characters,
including spaces. When read on a system running Windows 9X, Windows
NT 4, Windows 2000/XP, or recent releases of Linux, a
Joliet-formatted disc displays long file and directory names. When
read on a system running an operating system that does not support
Microsoft long filename standards, the Joliet-formatted disc is
recognized as a standard ISO-9660 disc. Full information about the
Joliet standard is available at http://www-plateau.cs.berkeley.edu/people/chaffee/jolspec.html.
Consider logical formatting issues carefully if you plan to use CD-R
premastering software to back up a hard disk that uses Windows long
filenames and long folder names. ISO formatting restrictions mean
that it's quite possible to have multiple
subdirectories in one directory (or multiple files in one directory)
whose long names are unambiguous, but whose truncated names are not.
That means you might be unable to copy all files to CD unless you are
very careful about using filenames and directory names that will
truncate to unambiguous short names.
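One way to spot such conflicts before burning is to truncate every name in a directory and look for duplicates. The truncation rule below (uppercase, replace anything that is not a d-character with an underscore, cut to 8.3) is a hypothetical stand-in. Actual premastering software may shorten names differently, so treat this only as a sketch of the check.

```python
import re
from collections import defaultdict


def _clean(part: str) -> str:
    """Uppercase and replace every non-d-character with an underscore."""
    return re.sub(r"[^A-Z0-9_]", "_", part.upper())


def to_short_name(long_name: str) -> str:
    """Hypothetical truncation of a long name to 8.3 form."""
    stem, dot, extension = long_name.rpartition(".")
    if not dot or not stem:
        stem, extension = long_name, ""  # no usable extension
    short = _clean(stem)[:8]
    return f"{short}.{_clean(extension)[:3]}" if extension else short


def find_collisions(names: list[str]) -> dict[str, list[str]]:
    """Map each ambiguous short name to the long names that truncate to it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[to_short_name(name)].append(name)
    return {short: longs for short, longs in groups.items() if len(longs) > 1}
```

Run against one directory's listing at a time, any non-empty result identifies names that would need to be changed.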
11.2.2.3 Universal Disc Format (UDF)
ISO-9660 and its variants
were designed for duplicating or premastering discs, but were never
intended to allow incrementally adding small amounts of data to a
disc. Although ISO-9660 allows adding data to a disc (until that disc
has been closed), the only way to do so is by opening a new session
on that disc. That means that writing even one new file incurs the
overhead required for a new session, which ranges from 13 MB to 22
MB.
In part to address these ISO-9660 limitations, OSTA defined a new
logical format for optical discs. The official designation of this
format is ISO 13346 but the common name is
Universal Disc Format
(UDF). UDF is an operating system-independent
logical formatting standard that defines how data is written to
various types of optical discs, including CD-R, CD-(M)RW, DVD-ROM,
DVD-Video, and DVD-Audio. UDF uses a redesigned directory structure
that allows small amounts of data (called
packets) to be written incrementally and
individually to disc without incurring the large overhead associated
with writing a new session under ISO-9660.
In effect, with UDF each packet is written as a subsession within a
standard session, incurring the standard session overhead only when
that standard session is closed. Packet-writing software typically
closes the session automatically when the disc is ejected using the
eject feature of the software. As with ISO-9660, an open session on a
UDF-formatted disc can be read only by a CD recorder. Closing the
session allows the disc to be read by a standard CD-ROM drive or CD
player. It's possible, however, to open a new session later and
add more packet data to the disc.
In addition to session overhead, UDF addresses another issue that
makes ISO-9660 completely inappropriate for packet writing. ISO-9660
must know, in advance, exactly which files are to be written during a
session. It uses this information to create and write the
Path Tables and Primary Volume
Descriptors that point to the physical locations of the
files on disc. Because packet writing allows any arbitrarily selected
file to be written to disc at any time, the information that ISO-9660
requires is not available before the write occurs.
UDF solves this problem by accumulating data about the physical
locations of files as they are written. At the end of a
packet-writing session, UDF consolidates these location pointers and
writes them to disc as the Virtual Allocation
Table (VAT). The VAT address of a
file remains the same, even if it is overwritten. At the end of each
packet-writing session, UDF creates a new VAT that includes not just
the pointers for newly created or modified files, but also the
pointers stored in the old VAT. That means the current VAT always
includes pointers to every file that has been written to the disc
since it was originally formatted.
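The bookkeeping described above can be pictured with a toy in-memory model. This is not the on-disc VAT layout defined by UDF. The class, its methods, and the idea of numbering files by a "virtual address" integer are all assumptions made purely to illustrate how a stable virtual address can point to a changing physical location, and how each new VAT carries forward the old pointers.

```python
class ToyVat:
    """Toy model: virtual addresses stay fixed while physical locations move."""

    def __init__(self) -> None:
        self._table: dict[int, int] = {}  # virtual address -> physical address
        self._next_physical = 0           # write-once media: always append
        self.written_vats: list[dict[int, int]] = []

    def write_file(self, virtual_address: int, length: int) -> int:
        """Write (or rewrite) a file; return the physical address used."""
        physical = self._next_physical
        self._next_physical += length
        self._table[virtual_address] = physical  # same key, new location
        return physical

    def close_session(self) -> dict[int, int]:
        """Record a new VAT holding old pointers plus new or changed ones."""
        snapshot = dict(self._table)
        self.written_vats.append(snapshot)
        return snapshot
```

Rewriting a file under the same virtual address changes only the physical pointer in the next VAT. Entries for untouched files are carried over unchanged.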
The advantages of packet writing come at the cost of reduced
capacity. A typical CD-R/RW disc stores about 650 MB with ISO-9660
formatting, but stores only about 500 MB with UDF formatting. About
100 MB of that reduced capacity is accounted for by the complex UDF
directory and control structures that allow data to be added and
deleted incrementally. The remaining 50 MB or so is used to implement
various measures to distribute wear evenly across the CD-RW disc,
preventing some areas from being overused and thereby rendered
unwritable while other areas remain lightly used.
Two versions
of UDF are in common use:
- UDF 1.02
-
UDF 1.02
was
adopted in August 1996, and is the finalized version of the October
1995 UDF 1.0 specification. UDF 1.02 specifies standards for DVD and
DVD-ROM, but does not support writable optical media. Windows NT 4,
Windows 98/SE/Me, and Windows 2000/XP include native UDF 1.02 support
that allows them to access DVD video and DVD-ROM discs natively.
- UDF 1.5
-
UDF 1.5 was adopted in February 1997, and addresses the requirements
of sequential recorded media, including CD-R, CD-RW, and DVD-RAM. UDF
1.5 adds the VAT that is analogous to the DOS File Allocation Table,
and, optionally, the Sparing Table that allows
bad sectors to be marked as unusable and replaced by spare sectors.
Windows 2000/XP includes native UDF 1.5 support, but Windows NT and
Windows 9X do not. You can download UDF 1.5 reader software for these
versions of Windows from http://www.roxio.com.
The UDF 2.0 and 2.01 specifications are available, but not yet
commonly used in commercial products. For more information about UDF,
see http://www.osta.org.