Portable Executable (PE) Format

The Portable Executable (PE) format is the native executable format for Windows. This reference covers PE32 and PE32+ (64-bit) with complete struct layouts, verified against working parser and emulator implementations.

DOS Header: The Legacy Gate

Every PE file starts with a DOS header for backward compatibility. The key field is e_lfanew at offset 0x3C, which holds the file offset of the PE signature.

DOS Header
struct DOSHeader {
  magic             : u16le       -- 0x5A4D "MZ"
  last_page_bytes   : u16le       -- bytes on last page
  page_count        : u16le       -- pages in file (512-byte)
  relocations       : u16le
  header_paragraphs : u16le       -- header size in 16-byte paragraphs
  min_extra         : u16le
  max_extra         : u16le
  initial_ss        : u16le
  initial_sp        : u16le
  checksum          : u16le
  initial_ip        : u16le
  initial_cs        : u16le
  reloc_offset      : u16le
  overlay_number    : u16le
  _reserved         : u16le[4]
  oem_id            : u16le
  oem_info          : u16le
  _reserved2        : u16le[10]
  e_lfanew          : u32le       -- offset 0x3C: pointer to PE signature
}                                   -- total: 64 bytes
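
As a minimal sketch of the layout above, the following Python reads e_lfanew with the standard struct module. The function name read_e_lfanew and the synthetic sample header are illustrative assumptions, not part of any existing parser.

```python
import struct

# Field widths from the struct above: 14 u16, 4 reserved u16 (8 bytes),
# 2 u16, 10 reserved u16 (20 bytes), then the u32 e_lfanew.
DOS_HEADER_FORMAT = "<14H8x2H20xI"
assert struct.calcsize(DOS_HEADER_FORMAT) == 64

def read_e_lfanew(data: bytes) -> int:
    """Return the file offset of the PE signature, taken from the DOS header."""
    if len(data) < 64:
        raise ValueError("file is shorter than the 64-byte DOS header")
    # magic is the first u16le; 0x5A4D is "MZ" when read little-endian
    (magic,) = struct.unpack_from("<H", data, 0)
    if magic != 0x5A4D:
        raise ValueError("missing MZ magic")
    # e_lfanew is the u32le at offset 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew

# Synthetic 64-byte header: "MZ", zero padding, e_lfanew = 0x80
sample = b"MZ" + bytes(58) + struct.pack("<I", 0x80)
print(hex(read_e_lfanew(sample)))  # 0x80
```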

The DOS Stub

Between the DOS header and the PE signature lies the DOS stub, a small program that prints "This program cannot be run in DOS mode" when executed in 16-bit DOS.

> Implementation Detail

The DOS stub is optional but almost always present. Some packers and protectors replace it with custom code. For emulation, ignore the stub and seek directly to the offset stored in e_lfanew.

PE Signature

The four bytes at the offset stored in e_lfanew are the PE signature, PE\0\0, which reads as 0x00004550 when interpreted as a little-endian u32.

Offset  Size  Field      Value
0x00    4     Signature  0x00004550 ("PE\0\0")
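
The signature check can be sketched in a few lines of Python. The helper name has_pe_signature is hypothetical; comparing raw bytes avoids any endianness confusion.

```python
import struct

PE_SIGNATURE = b"PE\0\0"

def has_pe_signature(data: bytes, e_lfanew: int) -> bool:
    """True if the four bytes at e_lfanew are exactly 'P', 'E', 0, 0."""
    return data[e_lfanew:e_lfanew + 4] == PE_SIGNATURE

# The same four bytes read as a u32le give the value from the table above
print(hex(struct.unpack("<I", PE_SIGNATURE)[0]))  # 0x4550
```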

COFF File Header

Immediately after the PE signature comes the COFF header (20 bytes). This contains the machine type, section count, and characteristics.

COFF File Header
-- Immediately follows the "PE\0\0" signature

struct COFFHeader {
  machine           : u16le       -- target architecture
  number_of_sections: u16le
  time_date_stamp   : u32le       -- seconds since 1970-01-01
  symbol_table_ptr  : u32le       -- usually 0 for executables
  number_of_symbols : u32le
  optional_hdr_size : u16le       -- typically 224 (PE32) or 240 (PE32+)
  characteristics   : u16le       -- EXECUTABLE_IMAGE, DLL, etc.
}                                   -- total: 20 bytes
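
A possible way to decode this header in Python is shown below. The COFFHeader tuple and parse_coff_header function are illustrative names, and the sample bytes are synthetic.

```python
import struct
from typing import NamedTuple

class COFFHeader(NamedTuple):
    machine: int
    number_of_sections: int
    time_date_stamp: int
    symbol_table_ptr: int
    number_of_symbols: int
    optional_hdr_size: int
    characteristics: int

COFF_FORMAT = "<HHIIIHH"  # 2+2+4+4+4+2+2 = 20 bytes

def parse_coff_header(data: bytes, e_lfanew: int) -> COFFHeader:
    # The COFF header starts 4 bytes after e_lfanew, right behind "PE\0\0"
    return COFFHeader(*struct.unpack_from(COFF_FORMAT, data, e_lfanew + 4))

# Synthetic x64 header: 3 sections, 240-byte Optional Header
sample = b"PE\0\0" + struct.pack(COFF_FORMAT, 0x8664, 3, 0, 0, 0, 240, 0x0022)
hdr = parse_coff_header(sample, 0)
print(hex(hdr.machine), hdr.number_of_sections, hdr.optional_hdr_size)  # 0x8664 3 240
```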

Machine Types

Value   Machine                    Description
0x014c  IMAGE_FILE_MACHINE_I386    x86 (32-bit)
0x8664  IMAGE_FILE_MACHINE_AMD64   x64 (64-bit)
0x01c4  IMAGE_FILE_MACHINE_ARMNT   ARM Thumb-2 LE
0xaa64  IMAGE_FILE_MACHINE_ARM64   ARM64

Optional Header

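Despite its name, the Optional Header is required for executables. It contains the image base, entry point, and data directories. With the usual 16 data directories it is 224 bytes for PE32 and 240 bytes for PE32+; the actual size is given by optional_hdr_size in the COFF header.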

Optional Header (PE32)
struct OptionalHeader32 {
  magic             : u16le       -- 0x10B = PE32, 0x20B = PE32+
  linker_major      : u8
  linker_minor      : u8
  size_of_code      : u32le
  size_of_init_data : u32le
  size_of_uninit    : u32le
  entry_point       : u32le       -- RVA of entry point
  base_of_code      : u32le
  base_of_data      : u32le       -- PE32 only (absent in PE32+)
  image_base        : u32le       -- preferred load address (PE32+: u64le)
  section_alignment : u32le       -- usually 0x1000 (4KB page)
  file_alignment    : u32le       -- usually 0x200 (512 bytes)
  os_version        : u16le[2]    -- major, minor
  image_version     : u16le[2]
  subsystem_version : u16le[2]
  _reserved         : u32le
  size_of_image     : u32le       -- total virtual size
  size_of_headers   : u32le
  checksum          : u32le
  subsystem         : u16le       -- 2=GUI, 3=Console
  dll_characteristics: u16le      -- DYNAMIC_BASE, NX_COMPAT, etc.
  stack_reserve     : u32le       -- (PE32+: u64le)
  stack_commit      : u32le       -- (PE32+: u64le)
  heap_reserve      : u32le       -- (PE32+: u64le)
  heap_commit       : u32le       -- (PE32+: u64le)
  _loader_flags     : u32le       -- obsolete
  data_dir_count    : u32le       -- usually 16
  data_directories  : DataDirectory[data_dir_count]
}

struct DataDirectory {
  rva               : u32le       -- relative virtual address
  size              : u32le
}

-- Data directory indices:
--   0  Export Table        1  Import Table
--   2  Resource Table      3  Exception Table
--   4  Certificate Table   5  Base Relocation Table
--   6  Debug               7  Architecture (reserved)
--   8  Global Pointer      9  TLS Table
--  10  Load Config        11  Bound Import
--  12  IAT                13  Delay Import
--  14  CLR Runtime        15  Reserved

> PE32 vs PE32+

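Check the magic number: 0x10B for PE32, 0x20B for PE32+. The key difference is pointer sizes: PE32 uses 32-bit addresses, PE32+ uses 64-bit. Concretely, PE32+ drops base_of_data and widens image_base and the four stack and heap size fields to u64le, which accounts for the 16-byte size difference.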
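
The following Python sketch shows one way to branch on the magic number. The offsets (image_base at 28 or 24, data_dir_count at 92 or 108) are derived by summing the field widths in the struct above; parse_optional_header and the returned dictionary keys are illustrative names.

```python
import struct

def parse_optional_header(opt: bytes) -> dict:
    """Decode the fields whose position depends on PE32 vs PE32+."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    if magic == 0x10B:            # PE32: base_of_data present, u32 image_base
        (image_base,) = struct.unpack_from("<I", opt, 28)
        count_offset = 92
    elif magic == 0x20B:          # PE32+: no base_of_data, u64 image_base
        (image_base,) = struct.unpack_from("<Q", opt, 24)
        count_offset = 108
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    (entry_point,) = struct.unpack_from("<I", opt, 16)
    (count,) = struct.unpack_from("<I", opt, count_offset)
    first_dir = count_offset + 4
    # Never read more directories than the header actually has room for
    count = min(count, (len(opt) - first_dir) // 8)
    directories = [struct.unpack_from("<II", opt, first_dir + 8 * i)
                   for i in range(count)]
    return {"pe32_plus": magic == 0x20B, "entry_point": entry_point,
            "image_base": image_base, "directories": directories}

# Synthetic PE32+ header with the usual 16 directories (240 bytes)
opt = bytearray(240)
struct.pack_into("<H", opt, 0, 0x20B)
struct.pack_into("<I", opt, 16, 0x1000)                  # entry_point
struct.pack_into("<Q", opt, 24, 0x140000000)             # image_base
struct.pack_into("<I", opt, 108, 16)                     # data_dir_count
struct.pack_into("<II", opt, 112 + 8 * 1, 0x2000, 0x50)  # directory 1: Import Table
info = parse_optional_header(bytes(opt))
print(hex(info["image_base"]), info["directories"][1])  # 0x140000000 (8192, 80)
```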

Section Table

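The Section Table starts immediately after the Optional Header, at the Optional Header's file offset plus optional_hdr_size. It holds number_of_sections entries of 40 bytes each, and each entry describes a section: name, virtual address, size, and characteristics.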

Section Header
struct SectionHeader {
  name              : char[8]     -- null-padded (e.g. ".text\0\0\0")
  virtual_size      : u32le       -- size in memory
  virtual_address   : u32le       -- RVA of section start
  raw_data_size     : u32le       -- size on disk (file-aligned)
  raw_data_ptr      : u32le       -- file offset of section data
  reloc_ptr         : u32le       -- 0 for executables
  line_numbers_ptr  : u32le       -- deprecated
  reloc_count       : u16le
  line_numbers_count: u16le
  characteristics   : u32le       -- CODE, EXECUTE, READ, WRITE, etc.
}                                   -- total: 40 bytes

Section  Purpose             Typical Characteristics
.text    Executable code     CODE | EXECUTE | READ
.data    Initialized data    INITIALIZED_DATA | READ | WRITE
.rdata   Read-only data      INITIALIZED_DATA | READ
.bss     Uninitialized data  UNINITIALIZED_DATA | READ | WRITE
.rsrc    Resources           INITIALIZED_DATA | READ
.reloc   Relocations         INITIALIZED_DATA | READ | DISCARDABLE
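
Section headers are what make RVAs usable when working on the file on disk. The sketch below parses the table and translates an RVA to a file offset; the names Section, parse_sections, and rva_to_offset are assumptions for illustration, and the translation deliberately covers only bytes that are backed by raw data in the file.

```python
import struct
from typing import List, NamedTuple, Optional

class Section(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    raw_data_size: int
    raw_data_ptr: int
    characteristics: int

SECTION_FORMAT = "<8sIIIIIIHHI"  # 40 bytes per entry

def section_table_offset(e_lfanew: int, optional_hdr_size: int) -> int:
    # signature (4) + COFF header (20) + Optional Header
    return e_lfanew + 4 + 20 + optional_hdr_size

def parse_sections(data: bytes, table_offset: int, count: int) -> List[Section]:
    sections = []
    for i in range(count):
        (name, vsize, vaddr, rsize, rptr,
         _reloc_ptr, _line_ptr, _reloc_count, _line_count,
         chars) = struct.unpack_from(SECTION_FORMAT, data, table_offset + 40 * i)
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(Section(text, vsize, vaddr, rsize, rptr, chars))
    return sections

def rva_to_offset(sections: List[Section], rva: int) -> Optional[int]:
    """Map an RVA to a file offset, or None if no raw data backs it."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.raw_data_size:
            return s.raw_data_ptr + (rva - s.virtual_address)
    return None

# Synthetic one-entry table: .text at RVA 0x1000, stored at file offset 0x400
raw = struct.pack(SECTION_FORMAT, b".text", 0x500, 0x1000, 0x600, 0x400,
                  0, 0, 0, 0, 0x60000020)
sections = parse_sections(raw, 0, 1)
print(sections[0].name, hex(rva_to_offset(sections, 0x1010)))  # .text 0x410
```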

Import Directory

The Import Directory lists DLLs and functions the executable needs. Each DLL has an Import Descriptor pointing to the Import Name Table (INT) and Import Address Table (IAT).

Import Directory
-- Array of descriptors, terminated by an all-zero entry

struct ImportDescriptor {
  original_first_thunk : u32le  -- RVA of Import Name Table (INT)
  time_date_stamp      : u32le  -- 0 unless bound
  forwarder_chain      : u32le  -- -1 if no forwarders
  name                 : u32le  -- RVA of DLL name (null-terminated ASCII)
  first_thunk          : u32le  -- RVA of Import Address Table (IAT)
}

-- INT/IAT entries: array of u32le (PE32) or u64le (PE32+), zero-terminated
-- High bit set (bit 31 / bit 63) = ordinal import, ordinal in low 16 bits;
-- otherwise the entry is an RVA to ImportByName

struct ImportByName {
  hint                 : u16le  -- index into export name table (optimization)
  name                 : char[] -- null-terminated function name
}

> Binding and the IAT

At load time, the loader resolves each import and writes the actual function address to the IAT. An emulator can instead fill each IAT slot with the address of a stub that dispatches to a high-level emulation (HLE) implementation.
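
To make the descriptor and thunk walk concrete, here is a hedged Python sketch. It assumes the image is already laid out as in memory, so an RVA can be used directly as an index into the buffer; for a file on disk each RVA would first need translating through the section table. The names walk_imports and read_cstring are illustrative. The fallback to first_thunk when original_first_thunk is zero reflects how some linkers omit the INT.

```python
import struct

def read_cstring(image, offset):
    """Read a null-terminated ASCII string starting at offset."""
    end = image.index(b"\0", offset)
    return bytes(image[offset:end]).decode("ascii", "replace")

def walk_imports(image, import_rva, pe32_plus=False):
    """Return (dll, iat_slot_rva, symbol) tuples; symbol is a name or an ordinal."""
    fmt, size, ordinal_flag = ("<Q", 8, 1 << 63) if pe32_plus else ("<I", 4, 1 << 31)
    imports = []
    desc = import_rva
    while True:
        fields = struct.unpack_from("<IIIII", image, desc)
        if not any(fields):            # all-zero descriptor terminates the array
            break
        oft, _stamp, _chain, name_rva, ft = fields
        dll = read_cstring(image, name_rva)
        lookup = oft or ft             # fall back to the IAT if there is no INT
        index = 0
        while True:
            (entry,) = struct.unpack_from(fmt, image, lookup + index * size)
            if entry == 0:             # zero entry terminates the thunk array
                break
            if entry & ordinal_flag:
                symbol = entry & 0xFFFF                   # import by ordinal
            else:
                symbol = read_cstring(image, entry + 2)   # skip the u16 hint
            imports.append((dll, ft + index * size, symbol))
            index += 1
        desc += 20
    return imports

# Synthetic PE32 image: one DLL, one import by name and one by ordinal
image = bytearray(0x100)
struct.pack_into("<IIIII", image, 0x40, 0x80, 0, 0, 0xC0, 0xA0)
for table in (0x80, 0xA0):             # INT and IAT start out identical
    struct.pack_into("<III", image, table, 0xD0, 0x80000005, 0)
image[0xC0:0xCD] = b"KERNEL32.dll\0"
image[0xD2:0xDE] = b"ExitProcess\0"    # hint at 0xD0 is left as 0
for item in walk_imports(image, 0x40):
    print(item)
```

An emulator built along these lines would write its stub address into each iat_slot_rva instead of a real function pointer.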

Export Directory

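DLLs export functions through the Export Directory. It contains three tables: function addresses, name RVAs, and ordinal indices. The name and ordinal tables are parallel (entry i of one pairs with entry i of the other), while the address table is indexed by ordinal minus ordinal_base. Exports can be looked up by name or by ordinal number.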

Export Directory
struct ExportDirectory {
  _characteristics     : u32le  -- reserved, usually 0
  time_date_stamp      : u32le
  version              : u16le[2] -- major, minor
  name                 : u32le  -- RVA of DLL name
  ordinal_base         : u32le  -- starting ordinal number
  address_count        : u32le  -- entries in address table
  name_count           : u32le  -- entries in name/ordinal tables
  address_table_rva    : u32le  -- RVA of u32le[] function addresses
  name_table_rva       : u32le  -- RVA of u32le[] name string RVAs
  ordinal_table_rva    : u32le  -- RVA of u16le[] ordinal indices
}

-- Lookup by name: find name in name_table, get index,
-- read ordinal_table[index], use as index into address_table
-- Lookup by ordinal: address_table[ordinal - ordinal_base]
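
Both lookup paths can be sketched as follows. As with the other examples, this assumes a memory-layout image where RVAs index the buffer directly, and the function names are hypothetical. A linear scan is used for clarity; because the name table is sorted, a real loader can binary-search it.

```python
import struct

EXPORT_DIR_FORMAT = "<IIHHIIIIIII"  # 40 bytes

def read_cstring(image, offset):
    end = image.index(b"\0", offset)
    return bytes(image[offset:end]).decode("ascii", "replace")

def find_export_by_name(image, export_rva, wanted):
    """Return the function RVA for an exported name, or None."""
    (_chars, _stamp, _major, _minor, _dll_name, _ordinal_base, _address_count,
     name_count, addr_rva, names_rva, ords_rva) = struct.unpack_from(
        EXPORT_DIR_FORMAT, image, export_rva)
    for i in range(name_count):
        (name_rva,) = struct.unpack_from("<I", image, names_rva + 4 * i)
        if read_cstring(image, name_rva) == wanted:
            # ordinal_table[i] is already an index into address_table
            (index,) = struct.unpack_from("<H", image, ords_rva + 2 * i)
            (func_rva,) = struct.unpack_from("<I", image, addr_rva + 4 * index)
            return func_rva
    return None

def find_export_by_ordinal(image, export_rva, ordinal):
    """Return the function RVA for an ordinal, or None if out of range."""
    fields = struct.unpack_from(EXPORT_DIR_FORMAT, image, export_rva)
    ordinal_base, address_count, addr_rva = fields[5], fields[6], fields[8]
    index = ordinal - ordinal_base
    if not 0 <= index < address_count:
        return None
    (func_rva,) = struct.unpack_from("<I", image, addr_rva + 4 * index)
    return func_rva

# Synthetic directory: ordinal_base 1, two addresses, one named export "Beta"
image = bytearray(0x100)
struct.pack_into(EXPORT_DIR_FORMAT, image, 0x10,
                 0, 0, 0, 0, 0, 1, 2, 1, 0x40, 0x50, 0x60)
struct.pack_into("<II", image, 0x40, 0x1000, 0x2000)  # address table
struct.pack_into("<I", image, 0x50, 0x70)             # name table
struct.pack_into("<H", image, 0x60, 1)                # ordinal table
image[0x70:0x75] = b"Beta\0"
print(hex(find_export_by_name(image, 0x10, "Beta")))  # 0x2000
print(hex(find_export_by_ordinal(image, 0x10, 1)))    # 0x1000
```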

Base Relocations

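When a PE can't load at its preferred base address, the loader applies relocations: it adds the delta (actual base minus preferred base) to every address listed in the relocation table. Each relocation block covers a 4KB page and lists offsets that need patching.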

Base Relocation Block
-- Relocation table = sequence of blocks filling the size given by data directory 5

struct BaseRelocationBlock {
  page_rva             : u32le  -- base RVA for this 4KB page
  block_size           : u32le  -- total size including header
  entries              : RelocEntry[(block_size - 8) / 2]
}

struct RelocEntry {                -- packed u16le
  type                 : u4     -- bits [15..12]
  offset               : u12    -- bits [11..0], added to page_rva
}

-- Relocation types:
--   0 = ABSOLUTE  (padding, skip)
--   3 = HIGHLOW   (add delta to u32 at offset) -- PE32
--  10 = DIR64     (add delta to u64 at offset) -- PE32+

> ASLR and Relocations

Modern Windows uses Address Space Layout Randomization (ASLR). Executables marked with DYNAMIC_BASE must include relocations to support loading at random addresses.
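
The block format described earlier in this section can be applied with a short loop. This is a sketch under the same memory-layout assumption as the other examples; apply_relocations is a hypothetical name, and only the three relocation types listed above are handled.

```python
import struct

def apply_relocations(image: bytearray, reloc_rva: int, reloc_size: int, delta: int) -> int:
    """Patch the image in place; delta = actual base - preferred base. Returns patch count."""
    patched = 0
    pos = reloc_rva
    end = reloc_rva + reloc_size
    while pos + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break                       # malformed block; stop rather than loop forever
        for i in range((block_size - 8) // 2):
            (entry,) = struct.unpack_from("<H", image, pos + 8 + 2 * i)
            kind, offset = entry >> 12, entry & 0x0FFF
            target = page_rva + offset
            if kind == 0:               # ABSOLUTE: padding, nothing to patch
                continue
            if kind == 3:               # HIGHLOW: 32-bit address
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
            elif kind == 10:            # DIR64: 64-bit address
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
            else:
                raise NotImplementedError(f"relocation type {kind}")
            patched += 1
        pos += block_size
    return patched

# Synthetic image: one HIGHLOW fixup at RVA 0x1010, plus one padding entry
image = bytearray(0x3000)
struct.pack_into("<I", image, 0x1010, 0x00401234)
struct.pack_into("<IIHH", image, 0x2000, 0x1000, 12, 0x3010, 0x0000)
apply_relocations(image, 0x2000, 12, 0x10000)
print(hex(struct.unpack_from("<I", image, 0x1010)[0]))  # 0x411234
```

Masking the sum keeps the result within the field width, so a negative delta (loading below the preferred base) works as well.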

Resources

The Resource Directory stores icons, dialogs, version info, and more in a three-level tree structure: Type > Name > Language.

Resource Directory
struct ResourceDirectory {
  _characteristics     : u32le  -- reserved
  time_date_stamp      : u32le
  version              : u16le[2]
  named_entry_count    : u16le  -- entries identified by name
  id_entry_count       : u16le  -- entries identified by integer ID
  -- followed by (named + id) DirectoryEntry structs
}

struct DirectoryEntry {
  name_or_id           : u32le  -- high bit: 1=name offset, 0=integer ID
  data_or_subdir       : u32le  -- high bit: 1=subdirectory offset, 0=data entry
}

struct ResourceDataEntry {
  data_rva             : u32le  -- RVA of actual resource data
  size                 : u32le
  code_page            : u32le
  _reserved            : u32le
}

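-- Tree: Level 1 = resource type (RT_ICON=3, RT_STRING=6, RT_VERSION=16)
--        Level 2 = resource name or ID
--        Level 3 = language ID (LANGID)
-- Name and subdirectory offsets are relative to the start of the
-- resource directory; only data_rva is an RVA.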
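
A recursive walk over the tree might look like the sketch below. It assumes root is the position of the top-level ResourceDirectory in a memory-layout image, and that offsets inside directory entries are relative to that root. Named entries are decoded as length-prefixed UTF-16LE strings. The names walk_resources and entry_key are illustrative, and the depth limit is an added safeguard against malformed, cyclic trees.

```python
import struct

def entry_key(image, root, name_or_id):
    """Return the entry's string name or integer ID."""
    if name_or_id & 0x80000000:
        pos = root + (name_or_id & 0x7FFFFFFF)
        (length,) = struct.unpack_from("<H", image, pos)   # length in UTF-16 units
        return bytes(image[pos + 2:pos + 2 + 2 * length]).decode("utf-16-le")
    return name_or_id

def walk_resources(image, root, offset=0, path=()):
    """Yield (path, data_rva, size) for every leaf below the given directory."""
    if len(path) > 8:
        raise ValueError("resource tree nested too deeply")
    # named_entry_count and id_entry_count sit at offset 12 of the directory
    named, ids = struct.unpack_from("<HH", image, root + offset + 12)
    for i in range(named + ids):
        name_or_id, target = struct.unpack_from(
            "<II", image, root + offset + 16 + 8 * i)
        key = entry_key(image, root, name_or_id)
        if target & 0x80000000:         # high bit set: descend into subdirectory
            yield from walk_resources(image, root, target & 0x7FFFFFFF, path + (key,))
        else:                           # leaf: ResourceDataEntry
            data_rva, size, _code_page, _reserved = struct.unpack_from(
                "<IIII", image, root + target)
            yield path + (key,), data_rva, size

# Synthetic tree at 0x100: RT_VERSION (16) > ID 1 > language 0x409 > data
image = bytearray(0x200)
root = 0x100
struct.pack_into("<HH", image, root + 12, 0, 1)
struct.pack_into("<II", image, root + 16, 16, 0x80000018)      # type -> dir at 0x18
struct.pack_into("<HH", image, root + 0x18 + 12, 0, 1)
struct.pack_into("<II", image, root + 0x18 + 16, 1, 0x80000030)  # name -> dir at 0x30
struct.pack_into("<HH", image, root + 0x30 + 12, 0, 1)
struct.pack_into("<II", image, root + 0x30 + 16, 0x409, 0x48)  # language -> data entry
struct.pack_into("<IIII", image, root + 0x48, 0x500, 0x20, 0, 0)
print(list(walk_resources(image, root)))  # [((16, 1, 1033), 1280, 32)]
```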

Further Reading