Optical Disc and Disk Image Formats
Comprehensive reference for disk image formats covering ISO 9660, Joliet, Rock Ridge, UDF, El Torito, WIM, and hybrid disc structures. All struct layouts and edge cases documented here are verified against the Sector Sorcery parser implementation.
ISO 9660: Mastering the Standard
ISO 9660 (also known as ECMA-119) appears simple on the surface. However, real-world implementations reveal a rich ecosystem of vendor-specific extensions and creative solutions.
Note: ISO 9660 and ECMA-119 are technically the same standard; ECMA-119 is the edition that Ecma International publishes free of charge.
Basic Structure
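Every ISO starts with 16 sectors (32 KiB) reserved as the System Area. ISO 9660 does not define their contents, and on plain data discs they are usually zero-filled. The real action begins at sector 16 with the Volume Descriptors: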
-- At offset 0x8000 (sector 16 * 2048 bytes)
struct PrimaryVolumeDescriptor {
type : u8 -- 0x01 = Primary VD
identifier : char[5] -- "CD001" (magic)
version : u8 -- always 1
_unused : u8
system_id : char[32] -- system identifier (often garbage)
volume_id : char[32] -- volume label
_unused2 : u8[8]
volume_space_size : both32 -- total sectors (LE + BE)
_unused3 : u8[32]
volume_set_size : both16 -- usually 1
volume_seq_number : both16 -- usually 1
logical_block_size: both16 -- almost always 2048
path_table_size : both32
path_table_le : u32le -- LBA of LE path table
opt_path_table_le : u32le
path_table_be : u32be -- LBA of BE path table
opt_path_table_be : u32be
root_directory : DirectoryRecord -- 34 bytes, inline
-- ... 1858 more bytes of metadata (offsets 190..2047)
}
-- "both32" = same value stored LE then BE (8 bytes total)
struct both32 {
le : u32le
be : u32be
}
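As a minimal sketch of how the layout above maps onto real bytes, the following Python reads a handful of Primary Volume Descriptor fields from an in-memory image. The function name `parse_pvd` and the returned dictionary keys are illustrative choices, not part of any library, and only the little-endian half of each both-endian field is read.

```python
import struct

SECTOR_SIZE = 2048
PVD_OFFSET = 16 * SECTOR_SIZE  # 0x8000


def parse_pvd(image: bytes) -> dict:
    """Decode a few Primary Volume Descriptor fields (hypothetical helper)."""
    pvd = image[PVD_OFFSET:PVD_OFFSET + SECTOR_SIZE]
    if len(pvd) < SECTOR_SIZE:
        raise ValueError("image too short to hold a volume descriptor")
    if pvd[0] != 0x01 or pvd[1:6] != b"CD001":
        raise ValueError("no Primary Volume Descriptor at sector 16")

    # Both-endian fields: only the little-endian half is read here.
    (volume_space_size,) = struct.unpack_from("<I", pvd, 80)
    (logical_block_size,) = struct.unpack_from("<H", pvd, 128)
    (path_table_size,) = struct.unpack_from("<I", pvd, 132)
    # Path table locations are single-endian, one field per byte order.
    (path_table_le,) = struct.unpack_from("<I", pvd, 140)
    (path_table_be,) = struct.unpack_from(">I", pvd, 148)

    return {
        "system_id": pvd[8:40].decode("ascii", "replace").rstrip(" \x00"),
        "volume_id": pvd[40:72].decode("ascii", "replace").rstrip(" \x00"),
        "volume_space_size": volume_space_size,
        "logical_block_size": logical_block_size,
        "path_table_size": path_table_size,
        "path_table_le": path_table_le,
        "path_table_be": path_table_be,
        "root_record": pvd[156:190],  # 34-byte inline DirectoryRecord
    }
```

The byte offsets (80, 128, 132, 140, 148, 156) follow from summing the field sizes in the struct. A real parser would also walk the remaining descriptors until it reaches the set terminator (type 0xFF).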
Quirks and Edge Cases
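ISO 9660 stores most multi-byte integers in BOTH little-endian and big-endian form. A 32-bit integer takes 8 bytes: 4 for LE, then 4 for BE, so a reader on either kind of architecture can use its native half without byte swapping. The exceptions are fields that come as separate LE and BE variants, such as the path table locations in the Primary Volume Descriptor.
ECMA-119 sections 7.2.3 (16-bit) and 7.3.3 (32-bit) specify this as the "both-byte order" format.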
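Because both halves are supposed to carry the same value, a parser can use the redundancy as a cheap corruption check. The sketch below shows one way to do that; `read_both32`, `read_both16`, and the `strict` parameter are hypothetical names chosen for this example.

```python
import struct


def read_both32(buf: bytes, offset: int, strict: bool = True) -> int:
    """Read an 8-byte both-endian u32 and cross-check its two halves."""
    (le,) = struct.unpack_from("<I", buf, offset)
    (be,) = struct.unpack_from(">I", buf, offset + 4)
    if strict and le != be:
        raise ValueError(f"both32 mismatch at offset {offset}: LE={le} BE={be}")
    return le  # non-strict mode falls back to the little-endian half


def read_both16(buf: bytes, offset: int, strict: bool = True) -> int:
    """Read a 4-byte both-endian u16 and cross-check its two halves."""
    (le,) = struct.unpack_from("<H", buf, offset)
    (be,) = struct.unpack_from(">H", buf, offset + 2)
    if strict and le != be:
        raise ValueError(f"both16 mismatch at offset {offset}: LE={le} BE={be}")
    return le
```

Whether a mismatch should be fatal is a design choice. Falling back to the little-endian half in non-strict mode is an assumption made here for simplicity, not something the standard prescribes.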
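Directory records always have an even length, so each one starts on an even byte boundary. The fixed part of a record is 33 bytes, which means a single padding byte follows the name whenever name_length is even.
ECMA-119 section 9.1.12 requires that padding byte to be 0x00. Implementations reportedly vary in practice, so a tolerant parser should skip the byte without checking its value.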
struct DirectoryRecord {
length : u8 -- total record length
ext_attr_length : u8 -- extended attribute length
extent_lba : both32 -- starting sector
data_length : both32 -- file size in bytes
recording_date : u8[7] -- years-since-1900, month, day, h, m, s, tz
flags : u8 -- bit 1 = directory, bit 0 = hidden
file_unit_size : u8 -- interleave (usually 0)
interleave_gap : u8
volume_seq : both16
name_length : u8
name : char[name_length]
_pad : u8 -- if name_length is even (align to word)
-- system use area follows (Rock Ridge lives here)
}
-- Next record starts at: offset + length
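-- length already counts the pad byte, so no extra byte needs skipping
-- A length of 0 means the rest of the sector is unused: records never
-- span sectors, so resume at the next sector boundary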
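The traversal rules can be sketched in Python as a generator over the raw bytes of a directory extent. The name `iter_directory_records` and the dictionary keys are illustrative only. The sketch reads the little-endian halves of both-endian fields and treats a zero length byte as "skip to the next sector".

```python
import struct


def iter_directory_records(extent: bytes, sector_size: int = 2048):
    """Yield basic fields of each DirectoryRecord in a directory extent."""
    offset = 0
    while offset < len(extent):
        length = extent[offset]
        if length == 0:
            # Records never span sectors: jump to the next sector boundary.
            offset = (offset // sector_size + 1) * sector_size
            continue
        if length < 34 or offset + length > len(extent):
            raise ValueError(f"bad record length {length} at offset {offset}")

        rec = extent[offset:offset + length]
        name_len = rec[32]
        if 33 + name_len > length:
            raise ValueError(f"name overruns record at offset {offset}")

        # One pad byte follows the name when name_len is even.
        sys_use_start = 33 + name_len + (1 if name_len % 2 == 0 else 0)
        yield {
            "extent_lba": struct.unpack_from("<I", rec, 2)[0],
            "data_length": struct.unpack_from("<I", rec, 10)[0],
            "is_directory": bool(rec[25] & 0x02),
            "is_hidden": bool(rec[25] & 0x01),
            "name": rec[33:33 + name_len],
            "system_use": rec[sys_use_start:],  # Rock Ridge lives here
        }
        offset += length
```

The first two records of a directory normally have the one-byte names 0x00 and 0x01, which stand for the directory itself and its parent. Callers that walk a tree would skip them to avoid trivial loops.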
Some PS1-era copy protection is reported to rely on intentionally malformed ISO 9660 structures: references to non-existent sectors, circular directory structures, or file entries pointing into the disc's lead-out area.
Commonly cited PS1 protection schemes include LibCrypt and APv1/APv2, although LibCrypt itself works through modified subchannel data rather than the filesystem. See the PSXDev forums for documented cases.
Joliet: Unicode Done Right
Joliet adds Unicode support to ISO 9660 using UCS-2 encoding (not UTF-16!). Strings are always big-endian and carry no byte order mark, so a parser should not expect or strip one.
-- Joliet uses a Supplementary VD (type 0x02) with special escape sequences
struct JolietVolumeDescriptor : PrimaryVolumeDescriptor {
type : u8 -- 0x02 = Supplementary VD
identifier : char[5] -- "CD001"
version : u8 -- 1
_flags : u8
-- ...
escape_sequences : u8[32] -- at offset 88
-- Joliet level detection:
-- bytes [0..2] = 0x25 0x2F 0x40 -> Level 1 (%/@)
-- bytes [0..2] = 0x25 0x2F 0x43 -> Level 2 (%/C)
-- bytes [0..2] = 0x25 0x2F 0x45 -> Level 3 (%/E)
}
-- All strings in Joliet records use UCS-2 Big Endian encoding
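-- Filename limit: 64 chars (128 bytes) at every level
-- The level selects the UCS-2 conformance level, not the name length
-- Key insight: some tools count bytes, others count characters
- Level 1: UCS-2 Level 1 (escape sequence %/@)
- Level 2: UCS-2 Level 2 (escape sequence %/C)
- Level 3: UCS-2 Level 3 (escape sequence %/E)
Key insight: The filename limit is 64 UNICODE characters, which means 128 bytes, regardless of level. Some implementations count bytes, others count characters, and some mastering tools offer a non-standard extension for longer names (mkisofs -joliet-long allows up to 103 characters).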
See also: OSDev Joliet page.
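A short Python sketch of Joliet detection and name decoding follows. The helper names `joliet_level` and `decode_joliet_name` are made up for this example. Python has no dedicated UCS-2 codec, so the sketch assumes `utf-16-be` is an acceptable stand-in; the two encodings agree for every character in the Basic Multilingual Plane.

```python
JOLIET_ESCAPES = {b"%/@": 1, b"%/C": 2, b"%/E": 3}


def joliet_level(descriptor: bytes):
    """Return the Joliet level (1-3) of a volume descriptor, or None."""
    if len(descriptor) < 120:
        return None
    if descriptor[0] != 0x02 or descriptor[1:6] != b"CD001":
        return None  # not a Supplementary Volume Descriptor
    # escape_sequences occupies offsets 88..119; only 3 bytes matter here.
    return JOLIET_ESCAPES.get(bytes(descriptor[88:91]))


def decode_joliet_name(raw: bytes) -> str:
    """Decode a UCS-2 big-endian Joliet identifier (no BOM expected)."""
    # utf-16-be is used as an approximation of UCS-2; unpaired surrogates
    # are replaced rather than raising.
    return raw.decode("utf-16-be", errors="replace")
```

A Supplementary Volume Descriptor without one of the three escape sequences is not Joliet, which is why the function returns None rather than guessing a level.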
Rock Ridge: POSIX Power
Rock Ridge adds POSIX attributes to ISO 9660 through System Use fields in directory entries. Implementation variations across systems create an interesting challenge for parser developers.
| Field | Signature | Purpose | Notes |
|---|---|---|---|
| SUSP Indicator | SP | Indicates SUSP in use | Only in the first record of the root directory |
| Rock Ridge ID | RR | Rock Ridge in use | Version field varies; dropped from later RRIP revisions |
| POSIX Name | NM | Real filename | Can span multiple NM entries (CONTINUE flag) |
| Symlink | SL | Symbolic link target | Target is stored as a list of component records |
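Every System Use entry shares the same four-byte header: a two-character signature, a length byte that counts the whole entry, and a version byte. The Python sketch below walks those entries and reassembles a name split across several NM entries. The function names are hypothetical, and the sketch deliberately ignores CE continuation areas and the skip length announced by SP.

```python
def iter_susp_entries(system_use: bytes):
    """Yield (signature, version, payload) for each System Use entry."""
    offset = 0
    while offset + 4 <= len(system_use):
        signature = system_use[offset:offset + 2]
        length = system_use[offset + 2]   # counts the 4-byte header too
        version = system_use[offset + 3]
        if length < 4 or offset + length > len(system_use):
            break  # trailing padding or a damaged entry: stop quietly
        payload = system_use[offset + 4:offset + length]
        yield signature.decode("ascii", "replace"), version, payload
        offset += length


def rock_ridge_name(system_use: bytes):
    """Join the NM entries of one record into a filename, or return None."""
    parts = []
    for signature, _version, payload in iter_susp_entries(system_use):
        if signature != "NM" or not payload:
            continue
        flags, content = payload[0], payload[1:]
        if flags & 0x06:
            continue  # CURRENT (0x02) / PARENT (0x04): no literal name
        parts.append(content)
        if not flags & 0x01:
            break  # CONTINUE bit clear: the name is complete
    if not parts:
        return None
    # Rock Ridge does not fix an encoding; UTF-8 is assumed here.
    return b"".join(parts).decode("utf-8", "replace")
```

Decoding as UTF-8 is an assumption: Rock Ridge names are byte strings in whatever encoding the mastering system used.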
UDF: Universal Disk Format
UDF aimed to replace ISO 9660 with a more flexible format. The result is a sophisticated system where different implementations support different feature subsets, creating compatibility challenges.
Version Compatibility
- 1.02: DVD-ROM (most compatible)
- 1.50: DVD-R/RW (adds VAT)
- 2.00: Added Named Streams
- 2.01: Fixed 2.00 bugs
- 2.50: Blu-ray (metadata partition)
- 2.60: BD-R (adds pseudo-overwrite)
Windows XP reads up to 2.01, macOS has best support for 1.50, Linux varies by kernel version.
See UDF compatibility matrix for details.
Bridge Discs: Dual Format
Bridge discs contain both ISO 9660 and UDF filesystems. While the spec suggests they should share data, real-world implementations take creative approaches.
-- UDF Anchor at sector 256, with further copies at N - 256 and/or N (N = last sector)
struct AnchorVolumeDescriptorPointer {
tag : DescriptorTag -- tag.id = 2
main_vds_extent : Extent -- location + length of main VDS
reserve_vds_extent: Extent -- backup VDS location
}
struct DescriptorTag {
id : u16le -- 1=PVD, 2=AVDP, 5=Partition, 6=LogicalVol
version : u16le
checksum : u8 -- sum of bytes 0..3 and 5..15, modulo 256
_reserved : u8 -- byte 5, brings the tag to 16 bytes
serial : u16le
crc : u16le
crc_length : u16le
location : u32le -- sector number of this descriptor
}
struct Extent {
length : u32le -- in bytes
location : u32le -- sector number
}
-- Bridge disc detection: check for both "CD001" at sector 16
-- AND a valid AVDP tag (id=2) at sector 256
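The detection rule can be sketched in Python as shown below. `parse_descriptor_tag` and `looks_like_bridge_disc` are hypothetical names. The sketch assumes the 16-byte tag layout with a reserved byte after the checksum, and it validates only the tag checksum and location, not the descriptor CRC.

```python
import struct


def parse_descriptor_tag(raw: bytes) -> dict:
    """Decode a 16-byte UDF descriptor tag and verify its checksum."""
    if len(raw) < 16:
        raise ValueError("descriptor tag needs 16 bytes")
    (tag_id, version, checksum, _reserved,
     serial, crc, crc_length, location) = struct.unpack_from("<HHBBHHHI", raw, 0)
    # Checksum covers bytes 0..3 and 5..15, i.e. everything except itself.
    expected = (sum(raw[0:4]) + sum(raw[5:16])) & 0xFF
    return {
        "id": tag_id,
        "version": version,
        "serial": serial,
        "crc": crc,
        "crc_length": crc_length,
        "location": location,
        "checksum_ok": checksum == expected,
    }


def looks_like_bridge_disc(image: bytes, sector_size: int = 2048) -> bool:
    """True if the image has 'CD001' at sector 16 and an AVDP at sector 256."""
    iso_magic = image[16 * sector_size + 1:16 * sector_size + 6]
    anchor = image[256 * sector_size:256 * sector_size + 16]
    if iso_magic != b"CD001" or len(anchor) < 16:
        return False
    tag = parse_descriptor_tag(anchor)
    # The tag records its own sector, which guards against stray matches.
    return tag["id"] == 2 and tag["checksum_ok"] and tag["location"] == 256
```

Checking that `location` equals the sector the tag was read from is an extra sanity test. It rejects sectors that merely happen to start with the bytes 02 00.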
Hybrid Formats: Multi-Platform
Mac/PC hybrid discs, HFS+/ISO hybrids, and other multi-format discs demonstrate creative engineering to support multiple platforms seamlessly.
Mac hybrid discs start with an Apple Partition Map, followed by HFS+, with ISO 9660 structures carefully positioned to avoid conflicts. The first 16 sectors contain partition data instead of the zero fill found on plain ISO images.
ISO 9660 leaves the contents of the System Area (sectors 0 to 15) unspecified, so placing the APM there does not conflict with the standard. It only breaks parsers that assume those sectors are empty.
El Torito: Bootable CD Engineering
El Torito enables bootable CDs by embedding floppy or hard disk images. Implementation requires careful attention to detail for compatibility.
-- Boot Record Volume Descriptor (type 0x00), which El Torito places at sector 17
struct ElToritoBootRecord {
type : u8 -- 0x00 = Boot Record
identifier : char[5] -- "CD001"
version : u8 -- 1
boot_system_id : char[32] -- "EL TORITO SPECIFICATION" (padded)
_unused : u8[32]
catalog_sector : u32le -- LBA of boot catalog
}
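-- The catalog opens with a 32-byte Validation Entry (header ID 0x01,
-- key bytes 0x55 0xAA at offsets 30..31); the Initial/Default Entry
-- described here follows it
struct BootCatalogEntry {
boot_indicator : u8 -- 0x88 = bootable, 0x00 = not
boot_media_type : u8 -- 0=no emulation, 1=1.2M floppy, 2=1.44M, 3=2.88M, 4=HDD
load_segment : u16le -- 0x0000 = default 0x07C0
system_type : u8
_unused : u8
sector_count : u16le -- sectors to load (512-byte)
load_lba : u32le -- start sector of boot image
_unused2 : u8[20] -- pads the entry to 32 bytes
}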
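Putting the two structures together, a parser first locates the boot catalog through the Boot Record and then decodes the entry that follows the 32-byte validation entry. The Python below is a sketch under those assumptions. `find_boot_catalog_lba` and `parse_initial_entry` are illustrative names, and the validation entry's 16-bit checksum is not verified.

```python
import struct

MEDIA_TYPES = {0: "no emulation", 1: "1.2M floppy", 2: "1.44M floppy",
               3: "2.88M floppy", 4: "hard disk"}


def find_boot_catalog_lba(image: bytes, sector_size: int = 2048):
    """Scan the volume descriptor set for an El Torito Boot Record."""
    lba = 16
    while True:
        vd = image[lba * sector_size:(lba + 1) * sector_size]
        if len(vd) < sector_size or vd[1:6] != b"CD001":
            return None  # ran off the descriptor set
        if vd[0] == 0xFF:
            return None  # set terminator reached, no boot record
        if vd[0] == 0x00 and vd[7:30] == b"EL TORITO SPECIFICATION":
            return struct.unpack_from("<I", vd, 71)[0]
        lba += 1


def parse_initial_entry(catalog: bytes) -> dict:
    """Decode the Initial/Default entry that follows the validation entry."""
    if len(catalog) < 64:
        raise ValueError("boot catalog needs at least two 32-byte entries")
    if catalog[0] != 0x01 or catalog[30:32] != b"\x55\xaa":
        raise ValueError("missing validation entry")
    (indicator, media, segment, system_type,
     _unused, sector_count, load_lba) = struct.unpack_from("<BBHBBHI", catalog, 32)
    return {
        "bootable": indicator == 0x88,
        "media": MEDIA_TYPES.get(media & 0x0F, "unknown"),
        "load_segment": segment or 0x07C0,  # 0 selects the default
        "system_type": system_type,
        "sector_count": sector_count,       # in 512-byte virtual sectors
        "load_lba": load_lba,
    }
```

The scan walks the whole descriptor set instead of reading sector 17 directly. That is a deliberate leniency, since the terminator check bounds the loop either way.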
UEFI bootable ISOs carry an EFI System Partition image inside the ISO, formatted as FAT (FAT12, FAT16, or FAT32 depending on its size). The result is a FAT filesystem nested within an ISO 9660 filesystem, referenced from the El Torito boot catalog by an entry with platform ID 0xEF.
See Rod Smith's guide or xorriso documentation.
Windows Imaging Format
WIM files showcase elegant design: deduplicated file data, XML metadata, and solid compression.
struct WIMHeader {
magic : char[8] -- "MSWIM\0\0\0"
header_size : u32le -- size of this header
version : u32le
flags : u32le -- see compression flags below
chunk_size : u32le -- compression chunk (usually 32768)
guid : u8[16]
part_number : u16le
total_parts : u16le
image_count : u32le
offset_table : ResourceEntry -- offset table location
xml_data : ResourceEntry -- XML metadata location
boot_metadata : ResourceEntry
boot_index : u32le -- 1-based index of the bootable image, 0 = none
integrity_table : ResourceEntry
}
-- Compression type is a set of single-bit flags (0x00000002 = compressed):
-- none set = None
-- 0x00020000 = XPRESS (fast, moderate ratio)
-- 0x00040000 = LZX (slow, best ratio)
-- 0x00080000 = LZMS (Win8+, solid compression)
struct ResourceEntry {
size_and_flags : u64le -- low 56 bits = compressed size, high 8 = flags
offset : u64le -- absolute file offset
original_size : u64le -- uncompressed size
}
The XML metadata in WIM files may contain inconsistencies. File counts, compression types, and timestamps should be verified against actual resource entries for robust parsing.
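As a starting point for that kind of cross-checking, the sketch below decodes the fixed header and its resource entries in Python. The function names are hypothetical. It assumes the on-disk layout places a 4-byte boot index between the boot metadata and integrity entries, and that the compression type is signalled by single-bit flags (0x00020000, 0x00040000, 0x00080000) as defined in wimlib's headers.

```python
import struct

WIM_MAGIC = b"MSWIM\x00\x00\x00"
COMPRESSION_BITS = {0x00020000: "XPRESS", 0x00040000: "LZX", 0x00080000: "LZMS"}


def parse_resource_entry(raw: bytes) -> dict:
    """Decode a 24-byte ResourceEntry."""
    size_and_flags, offset, original_size = struct.unpack("<QQQ", raw)
    return {
        "size": size_and_flags & ((1 << 56) - 1),  # low 56 bits
        "flags": size_and_flags >> 56,             # high 8 bits
        "offset": offset,
        "original_size": original_size,
    }


def parse_wim_header(data: bytes) -> dict:
    """Decode the fixed part of a WIM header (first 148 bytes)."""
    if len(data) < 148 or data[:8] != WIM_MAGIC:
        raise ValueError("not a WIM header")
    header_size, version, flags, chunk_size = struct.unpack_from("<IIII", data, 8)
    part_number, total_parts, image_count = struct.unpack_from("<HHI", data, 40)
    (boot_index,) = struct.unpack_from("<I", data, 120)

    compression = "none"
    for bit, name in COMPRESSION_BITS.items():
        if flags & bit:
            compression = name

    return {
        "header_size": header_size,
        "version": version,
        "compression": compression,
        "chunk_size": chunk_size,
        "guid": data[24:40],
        "part_number": part_number,
        "total_parts": total_parts,
        "image_count": image_count,
        "offset_table": parse_resource_entry(data[48:72]),
        "xml_data": parse_resource_entry(data[72:96]),
        "boot_metadata": parse_resource_entry(data[96:120]),
        "boot_index": boot_index,
        "integrity_table": parse_resource_entry(data[124:148]),
    }
```

With the header decoded, values such as `image_count` and `compression` can be compared against what the XML claims, and any disagreement can be resolved in favor of the binary structures.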
Further Reading
Specifications
- ISO 9660 / ECMA-119: Volume and File Structure of CDROM
- UDF Specification: OSTA Universal Disk Format
- El Torito: Bootable CD-ROM Format Specification v1.0
Implementations
- libcdio: GNU's CD-ROM I/O library
- 7-Zip source: Igor Pavlov's implementation handles many edge cases
- Linux kernel: fs/isofs/ and fs/udf/ for real-world implementations