Portable Executable (PE) Format
The Portable Executable (PE) format is the native executable format for Windows. This reference covers PE32 and PE32+ (64-bit) with complete struct layouts, verified against working parser and emulator implementations.
DOS Header: The Legacy Gate
Every PE file starts with a 64-byte DOS header, kept for backward compatibility. The key field is e_lfanew at offset 0x3C, which holds the file offset of the PE signature.
struct DOSHeader {
magic : u16le -- 0x5A4D "MZ"
last_page_bytes : u16le -- bytes on last page
page_count : u16le -- pages in file (512-byte)
relocations : u16le
header_paragraphs : u16le -- header size in 16-byte paragraphs
min_extra : u16le
max_extra : u16le
initial_ss : u16le
initial_sp : u16le
checksum : u16le
initial_ip : u16le
initial_cs : u16le
reloc_offset : u16le
overlay_number : u16le
_reserved : u16le[4]
oem_id : u16le
oem_info : u16le
_reserved2 : u16le[10]
e_lfanew : u32le -- offset 0x3C: pointer to PE signature
} -- total: 64 bytes
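The layout above maps directly onto a fixed-size unpack. The following is a minimal sketch in Python, assuming the whole file is already in memory as bytes; the names DOS_HEADER and parse_dos_header are illustrative and not part of any library.

```python
import struct

# 30 little-endian u16 values followed by the u32 e_lfanew: 64 bytes in total.
DOS_HEADER = struct.Struct("<30HI")

def parse_dos_header(data: bytes) -> dict:
    """Return a few DOS header fields, including e_lfanew (hypothetical helper)."""
    if len(data) < DOS_HEADER.size:
        raise ValueError("file too small to hold a DOS header")
    fields = DOS_HEADER.unpack_from(data, 0)
    if fields[0] != 0x5A4D:  # "MZ" read as a little-endian u16
        raise ValueError("missing MZ magic")
    return {
        "page_count": fields[2],
        "header_paragraphs": fields[4],
        "e_lfanew": fields[30],  # last field, stored at offset 0x3C
    }
```

Only e_lfanew matters for reaching the PE headers; the other fields are returned just to show how the indices line up with the struct.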
The DOS Stub
Between the DOS header and the PE signature lies the DOS stub, a small program that prints "This program cannot be run in DOS mode" when executed under 16-bit DOS.
The DOS stub is optional but almost always present. Some packers and protectors replace it with custom code. For emulation, skip directly to the offset stored in e_lfanew.
PE Signature
At the file offset stored in e_lfanew, the PE signature is the four bytes PE\0\0 (0x00004550 when read as a little-endian u32).
| Offset | Size | Field | Value |
|---|---|---|---|
| 0x00 | 4 | Signature | 0x00004550 ("PE\0\0") |
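Checking the signature takes one comparison once e_lfanew is known. This is a small sketch under the same assumption as before (file contents held in a bytes object); coff_header_offset is a hypothetical name.

```python
import struct

PE_SIGNATURE = b"PE\x00\x00"  # 0x00004550 as a little-endian u32

def coff_header_offset(data: bytes) -> int:
    """Validate the PE signature and return the offset of the COFF header."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != PE_SIGNATURE:
        raise ValueError("PE signature not found at e_lfanew")
    return e_lfanew + 4  # the COFF header immediately follows the signature
```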
COFF File Header
Immediately after the PE signature comes the COFF header (20 bytes). This contains the machine type, section count, and characteristics.
-- Immediately follows the "PE\0\0" signature
struct COFFHeader {
machine : u16le -- target architecture
number_of_sections: u16le
time_date_stamp : u32le -- seconds since 1970-01-01
symbol_table_ptr : u32le -- usually 0 for executables
number_of_symbols : u32le
optional_hdr_size : u16le -- PE32: 224, PE32+: 240 (with 16 data directories)
characteristics : u16le -- EXECUTABLE_IMAGE, DLL, etc.
} -- total: 20 bytes
Machine Types
| Value | Machine | Description |
|---|---|---|
| 0x014c | IMAGE_FILE_MACHINE_I386 | x86 (32-bit) |
| 0x8664 | IMAGE_FILE_MACHINE_AMD64 | x64 (64-bit) |
| 0x01c4 | IMAGE_FILE_MACHINE_ARMNT | ARM Thumb-2 LE |
| 0xaa64 | IMAGE_FILE_MACHINE_ARM64 | ARM64 |
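As an illustration, the COFF header can be decoded with a single 20-byte unpack. The sketch below assumes the caller already knows the header's offset (e_lfanew + 4); parse_coff_header and MACHINE_NAMES are illustrative names, and only the machine types from the table above are mapped.

```python
import struct

COFF_HEADER = struct.Struct("<HHIIIHH")  # 20 bytes

MACHINE_NAMES = {0x014C: "I386", 0x8664: "AMD64", 0x01C4: "ARMNT", 0xAA64: "ARM64"}

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000

def parse_coff_header(data: bytes, offset: int) -> dict:
    """Decode the COFF header located at `offset` (hypothetical helper)."""
    (machine, section_count, timestamp, _symtab, _nsyms,
     optional_hdr_size, characteristics) = COFF_HEADER.unpack_from(data, offset)
    return {
        # Unknown machine values are reported in hex rather than rejected.
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "number_of_sections": section_count,
        "time_date_stamp": timestamp,
        "optional_hdr_size": optional_hdr_size,
        "is_executable": bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }
```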
Optional Header
Despite its name, the Optional Header is required for executables. It contains the image base, entry point, and data directories. With the usual 16 data directories, the PE32 form is 224 bytes and the PE32+ form is 240.
struct OptionalHeader32 {
magic : u16le -- 0x10B = PE32, 0x20B = PE32+
linker_major : u8
linker_minor : u8
size_of_code : u32le
size_of_init_data : u32le
size_of_uninit : u32le
entry_point : u32le -- RVA of entry point
base_of_code : u32le
base_of_data : u32le -- PE32 only (absent in PE32+)
image_base : u32le -- preferred load address (PE32+: u64le)
section_alignment : u32le -- usually 0x1000 (4KB page)
file_alignment : u32le -- usually 0x200 (512 bytes)
os_version : u16le[2] -- major, minor
image_version : u16le[2]
subsystem_version : u16le[2]
_reserved : u32le -- Win32VersionValue, must be 0
size_of_image : u32le -- total virtual size
size_of_headers : u32le
checksum : u32le
subsystem : u16le -- 2=GUI, 3=Console
dll_characteristics: u16le -- DYNAMIC_BASE, NX_COMPAT, etc.
stack_reserve : u32le -- (PE32+: u64le)
stack_commit : u32le -- (PE32+: u64le)
heap_reserve : u32le -- (PE32+: u64le)
heap_commit : u32le -- (PE32+: u64le)
_loader_flags : u32le -- obsolete
data_dir_count : u32le -- usually 16
data_directories : DataDirectory[data_dir_count]
}
struct DataDirectory {
rva : u32le -- relative virtual address
size : u32le
}
-- Data directory indices:
-- 0 Export Table 1 Import Table
-- 2 Resource Table 3 Exception Table
-- 4 Certificate Table 5 Base Relocation Table
-- 6 Debug 7 Architecture (reserved)
-- 8 Global Pointer 9 TLS Table
-- 10 Load Config 11 Bound Import
-- 12 IAT 13 Delay Import
-- 14 CLR Runtime 15 Reserved
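Check the magic number first: 0x10B means PE32, 0x20B means PE32+. The layouts differ in field widths: PE32+ widens image_base and the four stack and heap size fields to 64 bits and drops base_of_data. RVAs and data directory entries stay 32-bit in both.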
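One way to handle both variants is to pick a layout from the magic value and then read the data directories that follow the fixed part. The sketch below assumes the fixed part is 96 bytes for PE32 and 112 bytes for PE32+ (which matches the 224 and 240 byte totals with 16 directories); the function and constant names are illustrative.

```python
import struct

# Fixed-size part of the Optional Header, up to and including data_dir_count.
PE32_LAYOUT = struct.Struct("<HBB9I6H4I2H6I")            # 96 bytes
PE32_PLUS_LAYOUT = struct.Struct("<HBB5IQ2I6H4I2H4Q2I")  # 112 bytes
MAX_DATA_DIRECTORIES = 16

def parse_optional_header(data: bytes, offset: int) -> dict:
    """Decode entry point, image base and data directories (hypothetical helper)."""
    (magic,) = struct.unpack_from("<H", data, offset)
    if magic == 0x10B:
        layout, image_base_index = PE32_LAYOUT, 9       # base_of_data is index 8
    elif magic == 0x20B:
        layout, image_base_index = PE32_PLUS_LAYOUT, 8  # no base_of_data field
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    fields = layout.unpack_from(data, offset)
    # Clamp the count so a corrupt header cannot request an oversized read.
    dir_count = min(fields[-1], MAX_DATA_DIRECTORIES)
    dirs_offset = offset + layout.size
    directories = [
        struct.unpack_from("<II", data, dirs_offset + 8 * i)  # (rva, size)
        for i in range(dir_count)
    ]
    return {
        "is_pe32_plus": magic == 0x20B,
        "entry_point": fields[6],
        "image_base": fields[image_base_index],
        "data_directories": directories,
    }
```

Clamping data_dir_count to 16 is a defensive choice for this sketch, not something the format requires.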
Section Table
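Following the Optional Header is the Section Table. It starts at the Optional Header's offset plus optional_hdr_size and holds number_of_sections entries. Each entry (40 bytes) describes a section: name, virtual address, size, and characteristics.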
struct SectionHeader {
name : char[8] -- null-padded (e.g. ".text\0\0\0")
virtual_size : u32le -- size in memory
virtual_address : u32le -- RVA of section start
raw_data_size : u32le -- size on disk (file-aligned)
raw_data_ptr : u32le -- file offset of section data
reloc_ptr : u32le -- 0 for executables
line_numbers_ptr : u32le -- deprecated
reloc_count : u16le
line_numbers_count: u16le
characteristics : u32le -- CODE, EXECUTE, READ, WRITE, etc.
} -- total: 40 bytes
| Section | Purpose | Typical Characteristics |
|---|---|---|
| .text | Executable code | CODE, EXECUTE, READ |
| .data | Initialized data | INITIALIZED_DATA, READ, WRITE |
| .rdata | Read-only data | INITIALIZED_DATA, READ |
| .bss | Uninitialized data | UNINITIALIZED_DATA, READ, WRITE |
| .rsrc | Resources | INITIALIZED_DATA, READ |
| .reloc | Relocations | INITIALIZED_DATA, READ, DISCARDABLE |
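Section headers are what make RVAs usable against a file on disk. The sketch below parses the table and translates an RVA to a file offset; it assumes the caller supplies the table offset and section count, and the names Section, parse_sections and rva_to_offset are illustrative.

```python
import struct
from typing import List, NamedTuple, Optional

SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes

class Section(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    raw_data_size: int
    raw_data_ptr: int
    characteristics: int

def parse_sections(data: bytes, table_offset: int, count: int) -> List[Section]:
    """Decode `count` section headers starting at `table_offset`."""
    sections = []
    for i in range(count):
        (name, vsize, vaddr, rsize, rptr,
         _reloc_ptr, _line_ptr, _reloc_count, _line_count,
         characteristics) = SECTION_HEADER.unpack_from(
            data, table_offset + i * SECTION_HEADER.size)
        text = name.rstrip(b"\x00").decode("ascii", "replace")
        sections.append(Section(text, vsize, vaddr, rsize, rptr, characteristics))
    return sections

def rva_to_offset(sections: List[Section], rva: int) -> Optional[int]:
    """Map an RVA to a file offset, or None if no file data backs it."""
    for s in sections:
        delta = rva - s.virtual_address
        # Only the first raw_data_size bytes of a section exist on disk;
        # anything beyond that is zero-filled memory at load time.
        if 0 <= delta < s.raw_data_size:
            return s.raw_data_ptr + delta
    return None
```

For example, with a .text section at RVA 0x1000 stored at file offset 0x400, RVA 0x1010 maps to file offset 0x410.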
Import Directory
The Import Directory lists DLLs and functions the executable needs. Each DLL has an Import Descriptor pointing to the Import Name Table (INT) and Import Address Table (IAT).
-- Array of descriptors, terminated by an all-zero entry
struct ImportDescriptor {
original_first_thunk : u32le -- RVA of Import Name Table (INT)
time_date_stamp : u32le -- 0 unless bound
forwarder_chain : u32le -- -1 if no forwarders
name : u32le -- RVA of DLL name (null-terminated ASCII)
first_thunk : u32le -- RVA of Import Address Table (IAT)
}
-- INT/IAT entries: array of u32le (PE32) or u64le (PE32+), terminated by a zero entry
-- High bit (31 or 63) set = ordinal import (ordinal in low 16 bits), otherwise RVA to ImportByName
struct ImportByName {
hint : u16le -- index into export name table (optimization)
name : char[] -- null-terminated function name
}
At load time, the loader resolves each import and writes the function's actual address into the IAT. For emulation, intercept these addresses and redirect them to high-level emulation (HLE) implementations.
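Walking the import descriptors might look like the sketch below. It assumes `image` is a buffer laid out as the loader would map it, so an RVA can be used directly as an index (a file-based parser would translate RVAs through the section table first). The names walk_imports and read_cstring are illustrative, and falling back to first_thunk when original_first_thunk is zero is a defensive choice in this sketch.

```python
import struct

IMPORT_DESCRIPTOR = struct.Struct("<5I")  # 20 bytes

def read_cstring(image, rva: int) -> str:
    """Read a null-terminated ASCII string starting at `rva`."""
    end = image.index(b"\x00", rva)
    return bytes(image[rva:end]).decode("ascii", "replace")

def walk_imports(image, import_dir_rva: int, is_pe32_plus: bool) -> dict:
    """Return {dll_name: [function name or ordinal, ...]} (hypothetical helper)."""
    if is_pe32_plus:
        thunk_fmt, thunk_size, ordinal_flag = "<Q", 8, 1 << 63
    else:
        thunk_fmt, thunk_size, ordinal_flag = "<I", 4, 1 << 31
    imports = {}
    rva = import_dir_rva
    while True:
        fields = IMPORT_DESCRIPTOR.unpack_from(image, rva)
        if not any(fields):  # all-zero descriptor terminates the array
            break
        original_first_thunk, _stamp, _chain, name_rva, first_thunk = fields
        # Prefer the INT; fall back to the IAT if the INT is absent.
        thunk_rva = original_first_thunk or first_thunk
        entries = []
        while True:
            (thunk,) = struct.unpack_from(thunk_fmt, image, thunk_rva)
            if thunk == 0:  # zero entry ends the thunk array
                break
            if thunk & ordinal_flag:
                entries.append(thunk & 0xFFFF)  # import by ordinal
            else:
                # ImportByName: skip the u16 hint, then read the name.
                entries.append(read_cstring(image, (thunk & 0x7FFFFFFF) + 2))
            thunk_rva += thunk_size
        imports.setdefault(read_cstring(image, name_rva), []).extend(entries)
        rva += IMPORT_DESCRIPTOR.size
    return imports
```

An emulator could use the resulting names to decide which HLE handler each IAT slot should point at.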
Export Directory
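DLLs export functions through the Export Directory. It contains three arrays: function addresses, name RVAs, and ordinal indices. The name and ordinal arrays are parallel, while the address array is indexed by ordinal. Exports can be looked up by name or by ordinal number.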
struct ExportDirectory {
_characteristics : u32le -- reserved, usually 0
time_date_stamp : u32le
version : u16le[2] -- major, minor
name : u32le -- RVA of DLL name
ordinal_base : u32le -- starting ordinal number
address_count : u32le -- entries in address table
name_count : u32le -- entries in name/ordinal tables
address_table_rva : u32le -- RVA of u32le[] function addresses
name_table_rva : u32le -- RVA of u32le[] name string RVAs
ordinal_table_rva : u32le -- RVA of u16le[] ordinal indices (unbiased: ordinal - ordinal_base)
}
-- Lookup by name: find name in name_table, get index,
-- read ordinal_table[index], use as index into address_table
-- Lookup by ordinal: address_table[ordinal - ordinal_base]
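The two lookup rules above can be sketched as follows. As an assumption, `image` is a buffer where RVAs index directly (a mapped image); load_exports and read_cstring are illustrative names. Forwarded exports, where the address points at a string inside the export directory, are not handled here.

```python
import struct

EXPORT_DIRECTORY = struct.Struct("<IIHHIIIIIII")  # 40 bytes

def read_cstring(image, rva: int) -> str:
    """Read a null-terminated ASCII string starting at `rva`."""
    end = image.index(b"\x00", rva)
    return bytes(image[rva:end]).decode("ascii", "replace")

def load_exports(image, export_dir_rva: int):
    """Return (by_name, by_ordinal) dicts mapping to function RVAs."""
    (_chars, _stamp, _major, _minor, _dll_name_rva, ordinal_base,
     address_count, name_count, address_table_rva, name_table_rva,
     ordinal_table_rva) = EXPORT_DIRECTORY.unpack_from(image, export_dir_rva)
    addresses = struct.unpack_from(f"<{address_count}I", image, address_table_rva)
    name_rvas = struct.unpack_from(f"<{name_count}I", image, name_table_rva)
    indices = struct.unpack_from(f"<{name_count}H", image, ordinal_table_rva)
    # Name lookup: name_table[i] pairs with ordinal_table[i], which indexes
    # the address table directly (no ordinal_base subtraction needed).
    by_name = {
        read_cstring(image, name_rva): addresses[index]
        for name_rva, index in zip(name_rvas, indices)
    }
    # Ordinal lookup: address_table[ordinal - ordinal_base]; zero slots are unused.
    by_ordinal = {
        ordinal_base + i: address
        for i, address in enumerate(addresses) if address != 0
    }
    return by_name, by_ordinal
```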
Base Relocations
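When a PE can't load at its preferred base address, the loader applies relocations: it adds the delta (actual load address minus the preferred image_base) to every address listed in the relocation table. Each relocation block covers a 4KB page and lists the offsets within that page that need patching.
-- Relocation table = sequence of blocks until the size given in data directory 5 is consumed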
struct BaseRelocationBlock {
page_rva : u32le -- base RVA for this 4KB page
block_size : u32le -- total size including header
entries : RelocEntry[(block_size - 8) / 2]
}
struct RelocEntry { -- packed u16le
type : u4 -- bits [15..12]
offset : u12 -- bits [11..0], added to page_rva
}
-- Relocation types:
-- 0 = ABSOLUTE (padding, skip)
-- 3 = HIGHLOW (add delta to u32 at offset) -- PE32
-- 10 = DIR64 (add delta to u64 at offset) -- PE32+
Modern Windows uses Address Space Layout Randomization (ASLR). Executables marked with DYNAMIC_BASE must include relocations to support loading at random addresses.
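Applying the relocation blocks described in this section could be sketched as below. The assumptions: `image` is a mutable bytearray of the mapped image (RVAs index directly), `delta` is the actual load address minus the preferred image base, and only the three relocation types listed above are supported. The function name is illustrative.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0
IMAGE_REL_BASED_HIGHLOW = 3
IMAGE_REL_BASED_DIR64 = 10

def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int,
                      delta: int) -> int:
    """Patch `image` in place and return the number of patched locations."""
    pos, end, patched = reloc_rva, reloc_rva + reloc_size, 0
    while pos + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:  # malformed block or trailing zero padding
            break
        for i in range((block_size - 8) // 2):
            (entry,) = struct.unpack_from("<H", image, pos + 8 + 2 * i)
            kind = entry >> 12                     # bits [15..12]
            target = page_rva + (entry & 0x0FFF)   # bits [11..0]
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
                patched += 1
            elif kind == IMAGE_REL_BASED_DIR64:
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
                patched += 1
            elif kind != IMAGE_REL_BASED_ABSOLUTE:  # ABSOLUTE is padding
                raise NotImplementedError(f"relocation type {kind}")
        pos += block_size
    return patched
```

Masking the sum keeps the result in range when delta is negative, that is, when the image loads below its preferred base.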
Resources
The Resource Directory stores icons, dialogs, version info, and more in a three-level tree structure: Type > Name > Language.
struct ResourceDirectory {
_characteristics : u32le -- reserved
time_date_stamp : u32le
version : u16le[2]
named_entry_count : u16le -- entries identified by name
id_entry_count : u16le -- entries identified by integer ID
-- followed by (named + id) DirectoryEntry structs
}
struct DirectoryEntry {
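name_or_id : u32le -- high bit: 1=name string offset (low 31 bits), 0=integer ID
data_or_subdir : u32le -- high bit: 1=subdirectory offset (low 31 bits), 0=data entry offset
-- offsets are relative to the start of the resource directory, not RVAs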
}
struct ResourceDataEntry {
data_rva : u32le -- RVA of actual resource data
size : u32le
code_page : u32le
_reserved : u32le
}
-- Tree: Level 1 = resource type (RT_ICON=3, RT_STRING=6, RT_VERSION=16)
-- Level 2 = resource name or ID
-- Level 3 = language ID (LANGID)
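A recursive walk over the three levels might look like the sketch below. It assumes `image` is a mapped image where RVAs index directly, that subdirectory, name, and data entry offsets are relative to the start of the resource directory, and that names are stored as a u16 character count followed by UTF-16LE text. The function names are illustrative.

```python
import struct

def read_resource_name(image, offset: int) -> str:
    """Read a length-prefixed UTF-16LE resource name."""
    (length,) = struct.unpack_from("<H", image, offset)
    return bytes(image[offset + 2:offset + 2 + 2 * length]).decode(
        "utf-16-le", "replace")

def walk_resources(image, rsrc_rva: int, dir_offset: int = 0, path: tuple = ()):
    """Yield (path, data_rva, size); path is (type, name, language) for leaves."""
    if len(path) >= 3:  # refuse to descend past the three standard levels
        return
    base = rsrc_rva + dir_offset
    # The two entry counts sit at offsets 12 and 14 of the 16-byte directory.
    named_count, id_count = struct.unpack_from("<HH", image, base + 12)
    for i in range(named_count + id_count):
        name_or_id, target = struct.unpack_from("<II", image, base + 16 + 8 * i)
        if name_or_id & 0x80000000:
            key = read_resource_name(image, rsrc_rva + (name_or_id & 0x7FFFFFFF))
        else:
            key = name_or_id
        if target & 0x80000000:
            yield from walk_resources(image, rsrc_rva,
                                      target & 0x7FFFFFFF, path + (key,))
        else:
            data_rva, size, _code_page, _reserved = struct.unpack_from(
                "<4I", image, rsrc_rva + target)
            yield path + (key,), data_rva, size
```

The depth check guards against malformed files whose subdirectory entries point back into the tree.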
Further Reading
- Microsoft PE Format Specification
- OSDev PE Wiki
- Matt Pietrek's classic articles on PE internals