PE Format
Basic Structure
1. PE File
PE stands for Portable Executable, a file format for executables on Windows operating systems. It is based on COFF (Common Object File Format).
Commonly used extensions for PE format
Name | Usage |
---|---|
.exe | executable files |
.dll | dynamic-link library files |
.sys | system files |
.ocx | ActiveX control files |
.scr | screensaver files |
.cpl | Control Panel extension files |
.mui | Multilingual User Interface (MUI) files |
In general, a PE file is a data structure that stores the basic information the OS needs to load the file. It also contains some metadata for non-standard uses.
2. MS-DOS stub
The MS-DOS stub region consists of the DOS Header and the DOS stub.
DOS Header
It is a small data structure (64 bytes) that appears at the very beginning of a PE file. It exists for historical and compatibility reasons. For PE parsing, the two fields that matter are e_magic (the "MZ" signature) and e_lfanew (the file offset of the NT Header).
DOS stub
The DOS stub is a small MS-DOS program. It might display a message such as "This program cannot be run in DOS mode" or perform other simple tasks. Its purpose is to inform users who try to run the executable under MS-DOS, and to keep the file compatible with older versions of Windows that boot with MS-DOS as the underlying platform.
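As a minimal sketch of how a parser might use the DOS Header, the snippet below checks the "MZ" magic and reads e_lfanew from a raw byte buffer. The helper names (read_u16le, read_u32le, dos_header_nt_offset) are hypothetical and not part of any Windows API; the offsets 0x00 and 0x3C come from the IMAGE_DOS_HEADER layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_u16le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: returns 1 and stores the NT Header offset in
 * *nt_offset when buf starts with a plausible DOS Header, else returns 0.
 */
int dos_header_nt_offset(const uint8_t *buf, size_t len, uint32_t *nt_offset) {
    assert(buf != NULL && nt_offset != NULL);
    if (len < 64) return 0;                    /* DOS Header is 64 bytes */
    if (read_u16le(buf) != 0x5A4D) return 0;   /* e_magic must be "MZ"   */
    *nt_offset = read_u32le(buf + 0x3C);       /* e_lfanew               */
    return 1;
}
```

Reading the fields byte by byte keeps the sketch independent of the host's endianness and struct padding.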
Rich Header
Everything above is documented by Microsoft, but the PE format also contains an undocumented structure called the Rich Header.
It is emitted by the Microsoft compilation toolchain and is located between the MS-DOS stub and the NT Header. Its content is obfuscated.
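The sketch below shows one way to locate the Rich Header. It rests on assumptions that come from public reverse-engineering write-ups rather than from Microsoft documentation: the structure ends with the ASCII marker "Rich" followed by a 32-bit XOR key, and it begins with the marker "DanS" XORed with that key. The function names are hypothetical, and the code assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: scans the bytes before the NT Header for "Rich".
 * Returns the marker's offset and stores the XOR key that follows it,
 * or returns -1 when no marker is found.
 */
long find_rich_marker(const uint8_t *buf, size_t nt_offset, uint32_t *key) {
    assert(buf != NULL && key != NULL);
    for (size_t off = 0; off + 8 <= nt_offset; off++) {
        if (memcmp(buf + off, "Rich", 4) == 0) {
            memcpy(key, buf + off + 4, sizeof *key);
            return (long)off;
        }
    }
    return -1;
}

/*
 * Hypothetical helper: walks backwards from the marker in 4-byte steps
 * until a DWORD decodes to "DanS" (0x536E6144 when read little-endian).
 */
long find_rich_start(const uint8_t *buf, size_t marker_off, uint32_t key) {
    size_t off = marker_off;
    while (off >= 4) {
        uint32_t value;
        off -= 4;
        memcpy(&value, buf + off, sizeof value);
        if ((value ^ key) == 0x536E6144u) return (long)off;
    }
    return -1;
}
```

The bytes between the two offsets would then be XORed with the key to recover the entries; their meaning is outside the scope of this sketch.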
3. NT Header
NT Signature
It is a 4-byte signature, the characters "PE\0\0" (0x50 0x45 0x00 0x00 in hexadecimal), indicating that the file is a PE image.
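A minimal sketch of the signature check, assuming the file has already been read into a byte buffer and that nt_offset was taken from e_lfanew. The function name has_pe_signature is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the 4 bytes at nt_offset are "PE\0\0". */
int has_pe_signature(const uint8_t *buf, size_t len, uint32_t nt_offset) {
    static const uint8_t signature[4] = { 'P', 'E', 0, 0 };
    assert(buf != NULL);
    if (len < 4 || nt_offset > len - 4) return 0;   /* stay inside the buffer */
    return memcmp(buf + nt_offset, signature, sizeof signature) == 0;
}
```

The bounds check matters in practice because e_lfanew comes from the file itself and cannot be trusted.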
File Header
It provides general information about the structure and properties of the PE file. This includes details such as the target machine architecture, the number of sections in the file, the timestamp of when the file was created, and other flags and characteristics.
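To make the File Header fields concrete, here is a sketch that mirrors IMAGE_FILE_HEADER with fixed-width types and interprets two of its fields. The type name FileHeader and the helpers machine_name and is_dll are hypothetical; the numeric constants are the IMAGE_FILE_MACHINE_* and IMAGE_FILE_DLL values from <winnt.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width mirror of IMAGE_FILE_HEADER (20 bytes, no padding). */
typedef struct {
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} FileHeader;

static_assert(sizeof(FileHeader) == 20, "unexpected padding in FileHeader");

/* Hypothetical helper: map the Machine field to a readable name. */
const char *machine_name(uint16_t machine) {
    switch (machine) {
    case 0x014C: return "x86";      /* IMAGE_FILE_MACHINE_I386  */
    case 0x8664: return "x64";      /* IMAGE_FILE_MACHINE_AMD64 */
    case 0xAA64: return "ARM64";    /* IMAGE_FILE_MACHINE_ARM64 */
    default:     return "unknown";
    }
}

/* Characteristics bit 0x2000 (IMAGE_FILE_DLL) marks the image as a DLL. */
int is_dll(const FileHeader *fh) {
    return (fh->Characteristics & 0x2000) != 0;
}
```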
Optional Header
It contains additional information specific to the PE format. This includes details such as the size of various sections, the entry point of the executable, the preferred base address for loading the executable into memory, the address of the import table, the size of the stack and heap, and more.
Inside the Optional Header there is a very important part called the Data Directory. It specifies the locations (as relative virtual addresses) and sizes of various data structures used by the executable. These data structures include the import table, export table, resource table, debug information, and more.
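The Data Directory is an array indexed by well-known constants. The following sketch looks up one entry; the DataDirectory type, the DIR_* names, and find_directory are hypothetical stand-ins for IMAGE_DATA_DIRECTORY and the IMAGE_DIRECTORY_ENTRY_* constants, whose index values are the ones used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of IMAGE_DATA_DIRECTORY: an RVA plus a size in bytes. */
typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} DataDirectory;

/* Index values of some IMAGE_DIRECTORY_ENTRY_* constants. */
enum {
    DIR_EXPORT = 0, DIR_IMPORT = 1, DIR_RESOURCE = 2,
    DIR_BASERELOC = 5, DIR_DEBUG = 6, DIR_TLS = 9, DIR_LOAD_CONFIG = 10
};

/*
 * Hypothetical helper: returns the entry at index, or NULL when the image
 * declares fewer entries (NumberOfRvaAndSizes) or the entry is empty.
 */
const DataDirectory *find_directory(const DataDirectory *dirs,
                                    uint32_t number_of_rva_and_sizes,
                                    uint32_t index) {
    assert(dirs != NULL);
    if (index >= number_of_rva_and_sizes) return NULL;
    if (dirs[index].VirtualAddress == 0 || dirs[index].Size == 0) return NULL;
    return &dirs[index];
}
```

Checking NumberOfRvaAndSizes first avoids reading past the entries the file actually provides.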
4. Section Header
It contains detailed information about each section present in the file. Each section header describes a specific segment of the PE file, such as code, data, resources, or other types of data.
5. Sections
Standard Sections
Not all of these sections appear in every PE file. Whether a section is present depends on whether the file needs it.
Text (or Code) Section: This section contains executable code that the program will run. It typically has read and execute permissions but not write permissions.
Data Section: This section contains initialized global and static variables used by the program. It typically has read and write permissions but not execute permissions.
Resource Section: This section contains resources such as icons, images, dialogs, strings, and other data used by the program. Resources can be accessed by the program at runtime.
Import Section: This section contains information about external functions and symbols that the program imports from DLLs or other executables. It includes import tables that specify the names and addresses of imported functions.
Export Section: This section contains information about functions and symbols that the program exports for use by other programs. It includes export tables that specify the names and addresses of exported functions.
Relocation Section: This section contains information about positions in the code and data sections that need to be adjusted if the executable is loaded at a different base address in memory. It facilitates the process of relocating the executable.
Debug Section: This section contains debugging information used by debuggers and other development tools. It includes symbols, line numbers, and other metadata for debugging purposes.
TLS (Thread Local Storage) Section: This section contains data structures used for thread-local storage, which allows variables to have unique values per thread in a multi-threaded program.
Load Configuration Section: This section contains the load configuration structure, which holds settings that do not fit in the File Header or Optional Header, such as the security cookie address, the table of safe exception handlers on 32-bit x86, and Control Flow Guard metadata.
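The read, write, and execute permissions mentioned in the list above are stored as flag bits in each section header's Characteristics field. A small sketch of decoding them, with a hypothetical helper name and macro names shortened from the IMAGE_SCN_MEM_* constants:

```c
#include <assert.h>
#include <stdint.h>

#define SCN_MEM_EXECUTE 0x20000000u   /* IMAGE_SCN_MEM_EXECUTE */
#define SCN_MEM_READ    0x40000000u   /* IMAGE_SCN_MEM_READ    */
#define SCN_MEM_WRITE   0x80000000u   /* IMAGE_SCN_MEM_WRITE   */

/* Hypothetical helper: writes "r-x"-style text into out (3 chars plus NUL). */
void section_permissions(uint32_t characteristics, char out[4]) {
    out[0] = (characteristics & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}
```

For example, a typical code section with Characteristics 0x60000020 decodes to "r-x", and a typical data section with 0xC0000040 decodes to "rw-".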
Self-defined Sections
Developers can define any custom sections they want and link them into the PE file. There are unwritten naming conventions for such sections; the best way to learn them is to look at real-world PE files.
6. Appended Content
Strictly speaking, this part is not part of the standard PE format. In many well-known products, however, the file does not end where the raw data of the last section ends: extra content is appended after the sections. This appended data is commonly called an overlay.
It is normally used to store information that is important to the file but too large to fit in any of the standard PE data structures.
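One common way to find appended content is to compute where the raw data of the last section ends and compare that with the file size. The sketch below assumes the section headers have already been parsed; SectionExtent and overlay_offset are hypothetical names, and the sketch deliberately ignores other data that can follow the sections, such as a signing certificate table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the two section-header fields needed here; names follow winnt.h. */
typedef struct {
    uint32_t PointerToRawData;
    uint32_t SizeOfRawData;
} SectionExtent;

/*
 * Hypothetical helper: returns the file offset where appended data would
 * start, i.e. the largest PointerToRawData + SizeOfRawData of all sections.
 */
uint64_t overlay_offset(const SectionExtent *sections, size_t count) {
    uint64_t end = 0;
    assert(sections != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        uint64_t section_end =
            (uint64_t)sections[i].PointerToRawData + sections[i].SizeOfRawData;
        if (section_end > end) end = section_end;
    }
    return end;
}
```

If the file size is larger than the returned offset, the difference is the size of the appended content under these assumptions.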
C++ code of the data structures above
You can find all of the following code in header <winnt.h>.
DOS Header
typedef struct _IMAGE_DOS_HEADER {
WORD e_magic; /* 00: MZ Header signature */
WORD e_cblp; /* 02: Bytes on last page of file */
WORD e_cp; /* 04: Pages in file */
WORD e_crlc; /* 06: Relocations */
WORD e_cparhdr; /* 08: Size of header in paragraphs */
WORD e_minalloc; /* 0a: Minimum extra paragraphs needed */
WORD e_maxalloc; /* 0c: Maximum extra paragraphs needed */
WORD e_ss; /* 0e: Initial (relative) SS value */
WORD e_sp; /* 10: Initial SP value */
WORD e_csum; /* 12: Checksum */
WORD e_ip; /* 14: Initial IP value */
WORD e_cs; /* 16: Initial (relative) CS value */
WORD e_lfarlc; /* 18: File address of relocation table */
WORD e_ovno; /* 1a: Overlay number */
WORD e_res[4]; /* 1c: Reserved words */
WORD e_oemid; /* 24: OEM identifier (for e_oeminfo) */
WORD e_oeminfo; /* 26: OEM information; e_oemid specific */
WORD e_res2[10]; /* 28: Reserved words */
DWORD e_lfanew; /* 3c: Offset to extended header */
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
NT Header
There are two versions of the NT Header: 32-bit and 64-bit. The only difference between them is the Optional Header structure they embed.
NT Header 64 bits
typedef struct _IMAGE_NT_HEADERS64 {
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
NT Header 32 bits
typedef struct _IMAGE_NT_HEADERS {
DWORD Signature; /* "PE"\0\0 */ /* 0x00 */
IMAGE_FILE_HEADER FileHeader; /* 0x04 */
IMAGE_OPTIONAL_HEADER32 OptionalHeader; /* 0x18 */
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
NT Signature
#define IMAGE_NT_SIGNATURE 0x00004550 /* "PE\0\0" */
File Header
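typedef struct _IMAGE_FILE_HEADER {
WORD Machine;
WORD NumberOfSections;
DWORD TimeDateStamp;
DWORD PointerToSymbolTable;
DWORD NumberOfSymbols;
WORD SizeOfOptionalHeader;
WORD Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;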
Optional Header
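There are 32-bit and 64-bit versions of the Optional Header. The 64-bit version widens ImageBase and the stack and heap size fields from DWORD to ULONGLONG, and the 32-bit version contains one extra field:
DWORD BaseOfData;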
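Because the two layouts differ, a parser has to look at the Magic field before interpreting the rest of the Optional Header. A minimal sketch, with hypothetical names PeKind, pe_kind_from_magic, and data_directory_offset; the magic values and the DataDirectory offsets follow the two structure layouts:

```c
#include <assert.h>
#include <stdint.h>

typedef enum { PE_UNKNOWN = 0, PE_32 = 32, PE_32_PLUS = 64 } PeKind;

/* Hypothetical helper: classify an image by the Optional Header Magic. */
PeKind pe_kind_from_magic(uint16_t magic) {
    switch (magic) {
    case 0x10B: return PE_32;        /* IMAGE_NT_OPTIONAL_HDR32_MAGIC */
    case 0x20B: return PE_32_PLUS;   /* IMAGE_NT_OPTIONAL_HDR64_MAGIC */
    default:    return PE_UNKNOWN;   /* anything else, e.g. ROM 0x107 */
    }
}

/* Offset of DataDirectory inside the Optional Header for each layout. */
uint32_t data_directory_offset(PeKind kind) {
    if (kind == PE_32) return 0x60;
    if (kind == PE_32_PLUS) return 0x70;
    return 0;   /* unknown layout: caller should stop parsing */
}
```

The 0x10 difference comes from four fields growing by 4 bytes each and ImageBase growing by 4 bytes, minus the 4 bytes of the removed BaseOfData field.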
32 bits Optional Header
typedef struct _IMAGE_OPTIONAL_HEADER {
/* Standard fields */
WORD Magic; /* 0x10b or 0x107 */ /* 0x00 */
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint; /* 0x10 */
DWORD BaseOfCode;
DWORD BaseOfData;
/* NT additional fields */
DWORD ImageBase;
DWORD SectionAlignment; /* 0x20 */
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion; /* 0x30 */
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum; /* 0x40 */
WORD Subsystem;
WORD DllCharacteristics;
DWORD SizeOfStackReserve;
DWORD SizeOfStackCommit;
DWORD SizeOfHeapReserve; /* 0x50 */
DWORD SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]; /* 0x60 */
/* 0xE0 */
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
64 bits Optional Header
typedef struct _IMAGE_OPTIONAL_HEADER64 {
WORD Magic; /* 0x20b */
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
ULONGLONG ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
ULONGLONG SizeOfStackReserve;
ULONGLONG SizeOfStackCommit;
ULONGLONG SizeOfHeapReserve;
ULONGLONG SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;
Section Header
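typedef struct _IMAGE_SECTION_HEADER {
BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
union {
DWORD PhysicalAddress;
DWORD VirtualSize;
} Misc;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
For descriptions of the fields in the structures above, refer to the official Microsoft documentation.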
Parse PE files
From image
- Open the file and obtain a file handle, for example with fopen() in binary mode.
- fread() the first 64 bytes (sizeof(IMAGE_DOS_HEADER)) into an instance of the IMAGE_DOS_HEADER structure.
- Use IMAGE_DOS_HEADER.e_lfanew to get the file offset of the first byte of the NT Header.
- Use fseek() to move the file position to that offset.
- fread() sizeof(IMAGE_NT_HEADERS) bytes into an instance of the IMAGE_NT_HEADERS structure.
- Access the File Header and Optional Header through the FileHeader and OptionalHeader members of that instance.
- Get the value of IMAGE_FILE_HEADER.NumberOfSections.
- Read the section table into an array of IMAGE_SECTION_HEADER. It starts right after the Optional Header, that is, FileHeader.SizeOfOptionalHeader bytes after the start of the Optional Header.
- Traverse the Section Headers through a PIMAGE_SECTION_HEADER pointer.
- Read the content of each section using its IMAGE_SECTION_HEADER.PointerToRawData and IMAGE_SECTION_HEADER.SizeOfRawData fields.
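The steps above can be sketched in portable C as follows. This is an illustration under stated assumptions, not a complete parser: it uses fixed-width mirrors of the winnt.h structures (FileHeader and SectionHeader are hypothetical names), assumes a little-endian host, skips the Optional Header instead of decoding it, and the function name read_section_headers is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {                     /* mirror of IMAGE_FILE_HEADER */
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} FileHeader;

typedef struct {                     /* mirror of IMAGE_SECTION_HEADER */
    uint8_t  Name[8];
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
    uint32_t PointerToRelocations;
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} SectionHeader;

static_assert(sizeof(FileHeader) == 20, "FileHeader must be 20 bytes");
static_assert(sizeof(SectionHeader) == 40, "SectionHeader must be 40 bytes");

/*
 * Hypothetical helper: reads up to max_sections section headers from the
 * PE file at path. Returns the number read, or -1 on any error.
 */
int read_section_headers(const char *path, SectionHeader *out, int max_sections) {
    uint8_t dos[64];
    uint32_t e_lfanew, signature;
    FileHeader fh;
    int count, result = -1;
    FILE *f;

    if (max_sections < 0) return -1;
    f = fopen(path, "rb");
    if (!f) return -1;

    /* Read the 64-byte DOS Header, check "MZ", and fetch e_lfanew. */
    if (fread(dos, 1, sizeof dos, f) != sizeof dos) goto done;
    if (dos[0] != 'M' || dos[1] != 'Z') goto done;
    memcpy(&e_lfanew, dos + 0x3C, sizeof e_lfanew);

    /* Seek to the NT Header, check "PE\0\0", then read the File Header. */
    if (fseek(f, (long)e_lfanew, SEEK_SET) != 0) goto done;
    if (fread(&signature, sizeof signature, 1, f) != 1) goto done;
    if (signature != 0x00004550u) goto done;
    if (fread(&fh, sizeof fh, 1, f) != 1) goto done;

    /* Skip the Optional Header; the section table follows it directly. */
    if (fseek(f, (long)fh.SizeOfOptionalHeader, SEEK_CUR) != 0) goto done;

    count = fh.NumberOfSections;
    if (count > max_sections) count = max_sections;
    if (fread(out, sizeof *out, (size_t)count, f) != (size_t)count) goto done;
    result = count;

done:
    fclose(f);
    return result;
}
```

A caller could then fseek() to each header's PointerToRawData and read SizeOfRawData bytes to obtain the section content.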
From Memory
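- Use LoadLibrary() to get a handle to a DLL file. The returned HMODULE is the base address at which the image is mapped.
- Initialize a pointer to IMAGE_DOS_HEADER from that handle, let's say dosHeader
dosHeader = (PIMAGE_DOS_HEADER)LoadLibrary("Your DLL Name");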
- Find the address of IMAGE_NT_HEADERS, let's say the pointer to it is ntHeader. Cast to BYTE * first, because e_lfanew is an offset in bytes
ntHeader = (PIMAGE_NT_HEADERS)((BYTE *)dosHeader + dosHeader->e_lfanew);
- Use IMAGE_FILE_HEADER (fileHeader) and IMAGE_OPTIONAL_HEADER (optionalHeader) pointers to point to the corresponding addresses
fileHeader = (PIMAGE_FILE_HEADER)((BYTE *)ntHeader + sizeof(DWORD));
optionalHeader = (PIMAGE_OPTIONAL_HEADER)((BYTE *)fileHeader + sizeof(IMAGE_FILE_HEADER));
- Get the address of the first Section Header (sectionHeader).
sectionHeader = (PIMAGE_SECTION_HEADER)((BYTE *)optionalHeader + fileHeader->SizeOfOptionalHeader);
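- Browse the content of each section through its Section Header. VirtualAddress is relative to the load address, so add it to the actual base of the mapped image (dosHeader), which may differ from the preferred ImageBase
sectionAddress = (BYTE *)dosHeader + sectionHeader->VirtualAddress;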
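The same walk can be written without <windows.h>, which makes it easy to try on any byte buffer that holds a mapped image. In this sketch base plays the role of the pointer returned by LoadLibrary(); the function name find_mapped_section is hypothetical, the numeric offsets are the field offsets from the structures shown earlier, a little-endian host is assumed, and no bounds checking is done because the image is assumed to be trusted and fully mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: given the base address of a mapped image, returns a
 * pointer to the start of the section called name, or NULL if not found.
 */
const uint8_t *find_mapped_section(const uint8_t *base, const char *name) {
    uint32_t e_lfanew;
    uint16_t number_of_sections, size_of_optional_header;
    const uint8_t *nt, *file_header, *optional_header, *section;

    assert(base != NULL && name != NULL);
    if (base[0] != 'M' || base[1] != 'Z') return NULL;
    memcpy(&e_lfanew, base + 0x3C, sizeof e_lfanew);

    nt = base + e_lfanew;                        /* NT Header              */
    if (memcmp(nt, "PE\0\0", 4) != 0) return NULL;

    file_header = nt + 4;                        /* skip DWORD Signature   */
    memcpy(&number_of_sections, file_header + 2, 2);
    memcpy(&size_of_optional_header, file_header + 16, 2);

    optional_header = file_header + 20;          /* sizeof File Header     */
    section = optional_header + size_of_optional_header;

    for (uint16_t i = 0; i < number_of_sections; i++, section += 40) {
        char padded[9] = {0};                    /* Name may lack a NUL    */
        uint32_t rva;
        memcpy(padded, section, 8);
        if (strcmp(padded, name) != 0) continue;
        memcpy(&rva, section + 12, sizeof rva);  /* VirtualAddress field   */
        return base + rva;                       /* RVA + actual base      */
    }
    return NULL;
}
```

On Windows, a call such as find_mapped_section((const uint8_t *)LoadLibrary("Your DLL Name"), ".text") would be the equivalent of the pointer arithmetic in the steps above.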