GraymanRE - PE Headers for Malware Analysts: From File Structure to Suspicious Indicators

Introduction

This blog post documents my understanding of the headers within a PE-COFF file: how they are structured, how they can be extracted, and why they matter for malware research and reverse engineering.

In preparation for this post, and to demonstrate software development proficiency as well as a decent understanding of the PE-COFF file headers, I developed a C++ project named PEDetect, which parses the relevant headers from a given executable.
PEDetect is a self-made attempt at mirroring some of the features of well-known industry tools. The aim is to understand how these tools work and, more importantly, how the PE-COFF and other headers in executables are structured, what information they contain, and how that information can be extracted and correlated.

During the description of the PE file structure in this first blog post, the HxDSetup executable will be used to guide the reader through the different headers and structures. However, as this project is approached from the perspective of a malware reverse engineer, we will also highlight the relevance of certain headers and fields from a malware research perspective.

1. Understanding the PE-COFF File Structure
2. DOS Header
3. DOS Stub
4. Rich Header
5. PE Signature
6. COFF Header
7. Optional Header
8. Data Directories
9. Sections
10. Conclusion

Understanding the PE-COFF File Structure

PEDetect reads the input file byte by byte, starting at offset 0. It first parses the DOS Header and then builds upon the information found there to locate and parse the subsequent structures: the DOS Stub, Rich Header, PE Signature, COFF Header, Optional Header, Data Directories and section headers.

DOS Header

The DOS Header is the first header in a PE file and is 64 bytes long. It starts with the well-known magic bytes MZ (4D 5A in hex). The header structure itself is defined in the winnt.h header file as the IMAGE_DOS_HEADER struct.


typedef struct _IMAGE_DOS_HEADER {      
    WORD   e_magic;                 // Magic number (MZ)
    WORD   e_cblp;                  // Bytes on last page of file
    WORD   e_cp;                    // Pages in file
    WORD   e_crlc;                  // Relocations
    WORD   e_cparhdr;               // Size of header in paragraphs
    WORD   e_minalloc;              // Minimum extra paragraphs needed
    WORD   e_maxalloc;              // Maximum extra paragraphs needed
    WORD   e_ss;                    // Initial (relative) SS value
    WORD   e_sp;                    // Initial SP value
    WORD   e_csum;                  // Checksum
    WORD   e_ip;                    // Initial IP value
    WORD   e_cs;                    // Initial (relative) CS value
    WORD   e_lfarlc;                // File address of relocation table
    WORD   e_ovno;                  // Overlay number
    WORD   e_res[4];                // Reserved words
    WORD   e_oemid;                 // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;               // OEM information; e_oemid specific
    WORD   e_res2[10];              // Reserved words
    LONG   e_lfanew;                // File offset of the PE signature (NT headers)
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

For PE parsing, the most important field in the DOS Header is e_lfanew. This field contains the file offset of the PE signature, also called the start of the NT Headers. At that offset, the loader expects PE\0\0, followed by the COFF/File Header and the Optional Header.
The example below shows the first 64 bytes of the file, followed by the PEDetect output with the size of the header and the offset of the PE header.

Example:
  4D 5A 50 00 02 00 00 00  04 00 0F 00 FF FF 00 00  MZP.........ÿÿ..
  B8 00 00 00 00 00 00 00  40 00 1A 00 00 00 00 00  ¸.......@.......
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
  00 00 00 00 00 00 00 00  00 00 00 00 00 01 00 00  ................
  ...

Output from PEDetect:
  ---- Parsing the DOS Header ----
  [+] Size of header in paragraphs: 64
  [+] PE Header is located at offset: 0x000000B0
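
The lookup described above can be sketched in a few lines of C++. This is only an illustration under the assumption that the whole file has already been read into a byte vector; parseDosHeader, DosInfo and the read helpers are hypothetical names and are not taken from PEDetect.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Little-endian readers; callers are expected to check bounds first.
static std::uint16_t readU16(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
}
static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// Hypothetical result type: only the two fields discussed in the text.
struct DosInfo {
    std::uint16_t headerParagraphs;  // e_cparhdr, counted in 16-byte paragraphs
    std::uint32_t peOffset;          // e_lfanew, file offset of "PE\0\0"
};

// Returns no value if the buffer is shorter than 64 bytes or lacks the MZ magic.
std::optional<DosInfo> parseDosHeader(const std::vector<std::uint8_t>& file) {
    constexpr std::size_t kDosHeaderSize = 64;
    if (file.size() < kDosHeaderSize) return std::nullopt;
    if (readU16(file, 0x00) != 0x5A4D) return std::nullopt;  // "MZ" read little-endian
    return DosInfo{readU16(file, 0x08), readU32(file, 0x3C)};
}
```

Applied to the 64 bytes of the hex dump above, this reads e_cparhdr (offset 0x08) as 4 paragraphs, which is 64 bytes, and e_lfanew (offset 0x3C) as 0x100. The returned offset still has to be checked against the file size before it is used.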

DOS Stub

The DOS Stub is located directly after the DOS Header. It is a small 16-bit MS-DOS program, native to Intel 8086 processors, that prints an error message along the lines of This program cannot be run in DOS mode or This program must be run under Win32 (Win64 in the example below) when the executable is started under MS-DOS.
Because the stub is ordinary machine code, you can extract its bytes and load them into a disassembler such as Ghidra or IDA as 16-bit code. Once loaded, you will notice that it contains only a few simple instructions: it loads the address of the error message, prints it through DOS interrupt 21h (function 09h, which prints a '$'-terminated string), and then terminates through interrupt 21h function 4Ch with a non-zero exit code.

Example:
  BA 10 00 0E 1F B4 09 CD  21 B8 01 4C CD 21 90 90  º....´.Í!¸.LÍ!..
  54 68 69 73 20 70 72 6F  67 72 61 6D 20 6D 75 73  This program mus
  74 20 62 65 20 72 75 6E  20 75 6E 64 65 72 20 57  t be run under W
  69 6E 36 34 0D 0A 24 37  00 00 00 00 00 00 00 00  in64..$7........
  ...
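
As a small illustration of how the stub refers to its message, the sketch below pulls the text out of the stub bytes without a disassembler. It assumes the stub starts with the mov dx, imm16 instruction (opcode 0xBA) seen in the example and that the immediate is an offset relative to the start of the stub; extractStubMessage is a hypothetical helper, not part of PEDetect.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the '$'-terminated message referenced by a leading "mov dx, imm16",
// or an empty string if the stub does not follow that pattern.
std::string extractStubMessage(const std::vector<std::uint8_t>& stub) {
    if (stub.size() < 3 || stub[0] != 0xBA) return {};  // 0xBA = mov dx, imm16
    const std::size_t msgOffset =
        static_cast<std::size_t>(stub[1]) | (static_cast<std::size_t>(stub[2]) << 8);
    std::string message;
    for (std::size_t i = msgOffset; i < stub.size(); ++i) {
        if (stub[i] == '$') return message;  // DOS function 09h stops at '$'
        message.push_back(static_cast<char>(stub[i]));
    }
    return {};  // offset out of range or no terminator inside the stub
}
```

For the example bytes above, the immediate is 0x0010, so the message starts 16 bytes into the stub and runs up to the 0x24 ('$') byte. A stub that does not match this pattern, or whose message differs from the usual compiler defaults, is worth a closer look.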

Relevance for Malware Research

Relatively recent research on adversarial attacks against machine-learning malware detectors demonstrated how PE-format manipulations in this region can be abused. These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by, respectively, manipulating the DOS header, extending it, and shifting the content of the first section. The research reports that these attacks outperform earlier ones in both white-box and black-box scenarios, achieving a better trade-off between evasion rate and size of the injected payload, while also evading models that had been shown to be robust to previous attacks.

RICH Header

Executables built with the Microsoft Visual Studio toolset will have a populated RICH header. The RICH header is officially undocumented; however, over the years researchers have been able to decode the individual items in this header. Note that in the executable we have used so far, the RICH header is fully nulled. A missing, zeroed, or invalid Rich Header may suggest the binary was not produced by the Microsoft linker, but it can also be stripped, altered, packed, or intentionally corrupted. Treat it as a clue, not proof.

The RICH header consists of a block of XOR-encoded data, followed by a signature and a 32-bit checksum that doubles as the XOR key. Let's start with the most straightforward part, the signature: a 4-byte value containing the ASCII string Rich (bytes 52 69 63 68).

The last part of the RICH header is the 32-bit checksum, which is also the key needed to decode the data that precedes the signature. Once decoded, the data starts with the marker string DanS, usually followed by padding in the form of zeroed DWORD values. The remainder consists of 8-byte entries made up of two DWORDs: the first combines a tool (product) identifier with the build number of that tool, and the second holds the number of times the tool was used.

PEDetect reads the XOR-ed data blob into a fixed 88-byte array and checks the decoded marker to determine whether a proper RICH header blob has been read. It then skips over the padding and parses the data 8 bytes at a time, corresponding to the DWORD pairs, to decode the build number, tool number and count.

Example:
  52 81 77 D6 16 E0 19 85  16 E0 19 85 16 E0 19 85  R.wÖ.à.….à.….à.…
  06 64 1A 84 12 E0 19 85  06 64 1D 84 1D E0 19 85  .d.„.à.….d.„.à.…
  06 64 1C 84 37 E0 19 85  06 64 18 84 10 E0 19 85  .d.„7à.….d.„.à.…
  5D 98 18 84 13 E0 19 85  16 E0 18 85 D1 E0 19 85  ]˜.„.à.….à.…Ñà.…
  5D 65 1C 84 18 E0 19 85  5D 65 E6 85 17 E0 19 85  ]e.„.à.…]eæ….à.…
  5D 65 1B 84 17 E0 19 85  52 69 63 68 16 E0 19 85  ]e.„.à.…Rich.à.…
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
  00 00 00 00 00 00 00 00  ...                      ........
  ...

Output from PEDetect:
  ---- Parsing the RICH Header ----
  [+] The XOR key value is: 16 E0 19 85 
  [+] The decoded block looks like:
      44 61 6e 53 00 00 00 00 00 00 00 00 00 00 00 00  |DanS............|
      10 84 03 01 04 00 00 00 10 84 04 01 0b 00 00 00  |................|
      10 84 05 01 21 00 00 00 10 84 01 01 06 00 00 00  |....!...........|
      4b 78 01 01 05 00 00 00 00 00 01 00 c7 00 00 00  |Kx..........Ç...|
      4b 85 05 01 0e 00 00 00 4b 85 ff 00 01 00 00 00  |K.......K.ÿ.....|
      4b 85 02 01 01 00 00 00                          |K.......|
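
The decoding steps can be sketched as follows. This is an illustration, not PEDetect's implementation: decodeRich and RichEntry are hypothetical names, the blob is assumed to start at the encoded DanS marker and end just before the Rich signature, and exactly three padding DWORDs are assumed, as in the example above.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct RichEntry {
    std::uint16_t productId;  // high 16 bits of the first DWORD (tool identifier)
    std::uint16_t build;      // low 16 bits of the first DWORD (build number)
    std::uint32_t count;      // second DWORD (number of uses)
};

static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// Returns no value if the blob is malformed or the key does not yield "DanS".
std::optional<std::vector<RichEntry>> decodeRich(const std::vector<std::uint8_t>& blob,
                                                 std::uint32_t key) {
    constexpr std::uint32_t kDanS = 0x536E6144;  // "DanS" read little-endian
    if (blob.size() < 16 || blob.size() % 8 != 0) return std::nullopt;
    if ((readU32(blob, 0) ^ key) != kDanS) return std::nullopt;

    std::vector<RichEntry> entries;
    // Skip the marker plus three padding DWORDs (assumption), then read 8-byte pairs.
    for (std::size_t off = 16; off + 8 <= blob.size(); off += 8) {
        const std::uint32_t id = readU32(blob, off) ^ key;
        const std::uint32_t count = readU32(blob, off + 4) ^ key;
        entries.push_back({static_cast<std::uint16_t>(id >> 16),
                           static_cast<std::uint16_t>(id & 0xFFFF), count});
    }
    return entries;
}
```

With the key bytes 16 E0 19 85 from the example (0x8519E016 as a little-endian DWORD), the first entry decodes to product ID 0x0103, build 0x8410 and a count of 4, matching the decoded block shown in the PEDetect output.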

Relevance for Malware Research

ESET has published Rich Header research that explicitly discusses malware families such as Dridex and Industroyer and shows how Rich Header features can support clustering, hunting, and anomaly detection. It also makes the important point that Rich Headers are useful but not absolute proof of authorship.
A Rich Header can support "this was built in a similar environment" or "this resembles a known cluster", but it should not be treated as direct attribution proof.

PE Signature

Next is the PE Signature, which is located directly before the COFF header. We previously obtained its file offset by reading the e_lfanew field of the DOS Header. The PE Signature marks the start of the NT headers. Its value is always \x50\x45\x00\x00, which represents the ASCII string PE\0\0.

Example:
  50 45 00 00  PE.. 
  ...
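
Because e_lfanew comes straight from the file, it should be validated before the signature is read. The following sketch shows one way to do that; hasPeSignature is a hypothetical helper and the file is assumed to be fully loaded into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True only if peOffset lies inside the file and the four bytes there are "PE\0\0".
bool hasPeSignature(const std::vector<std::uint8_t>& file, std::uint32_t peOffset) {
    // Compare against size - 4 instead of computing peOffset + 4, which could overflow.
    if (file.size() < 4 || peOffset > file.size() - 4) return false;
    return file[peOffset] == 'P' && file[peOffset + 1] == 'E' &&
           file[peOffset + 2] == 0 && file[peOffset + 3] == 0;
}
```

An e_lfanew value that points outside the file, or at bytes other than PE\0\0, means the file is not a loadable PE image, regardless of its extension.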

COFF Header

The COFF Header, also commonly referred to as the PE File Header or IMAGE_FILE_HEADER, contains seven fields.

Offset	Size	Field	Description
0	2	Machine	The target architecture for which the executable was built. Common predefined values include: 0x014C -> Intel 386, 0x01C0 -> ARM, 0xAA64 -> ARM64, 0x8664 -> AMD64
2	2	NumberOfSections	The number of sections that are present in the PE file. The sections, which we will dive into later, represent different parts of the file that contains code, data or other resources the executable requires.
4	4	TimeDateStamp	The low 32 bits of the number of seconds since 00:00 on January 1, 1970 (UTC), indicating when the file was created. This can be used in forensics and malware reverse engineering cases. For example, it allows for speculation regarding when certain campaigns were developed and can be used to differentiate between samples and progressions in development of specific adversarial campaigns. Note that although it can support timeline analysis, it should not be treated as reliable on its own. Malware authors and packers can tamper with it.
8	4	PointerToSymbolTable	The file offset of the COFF symbol table. Note that this value, especially in executables and DLL files is often set to 0x00000000. Therefore, we will not discuss this any further at this point.
12	4	NumberOfSymbols	The number of entries in the symbol table. As the symbol table is often not present, this value is likely to be 0x00000000 as well.
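16	2	SizeOfOptionalHeader	The size of the Optional Header in bytes. With the usual 16 data directories, the two values you will most commonly see are 0xE0 (224) for PE32 and 0xF0 (240) for PE32+. The PE32 versus PE32+ distinction itself is made with the Optional Header Magic value (0x10B or 0x20B), not with this field.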
18	2	Characteristics	The Characteristics field is a combination of one or multiple flags that indicate the attributes and characteristics of the executable. Microsoft has disclosed the full list of Flags and their corresponding values and descriptions.

Using PEDetect, we extract all of these values and display them accordingly.

Example (includes PE Signature for clarity):
  50 45 00 00 64 86 09 00  F9 4D 24 60 00 00 00 00  PE..d†..ùM$`....
  00 00 00 00 F0 00 23 00                           ....ð.# 
  ...

Output from PEDetect:
  ---- Parsing the COFF File Header ----
  [+] CPU Type: Intel 386 or later
  [+] We are dealing with a total of 2 sections
  [+] Creation/Last modification timestamp: 2008-01-06 14:51:31
  [+] In executables and DLL files, the Pointer to the Symbol Table is set to 0x00000000
      [+] We are dealing with a total of 0 symbols
  [+] The Optional Header is ffe0 bytes in size
      [+] Based on the Optional Header, the file is likely 32-bit.
  [+] Characteristics value: 0x10F. This value indicates the attributes and characteristics of the file:
      [+] Matched Flags:
      > IMAGE_FILE_RELOCS_STRIPPED
      > IMAGE_FILE_EXECUTABLE_IMAGE
      > IMAGE_FILE_LINE_NUMS_STRIPPED
      > IMAGE_FILE_LOCAL_SYMS_STRIPPED
      > IMAGE_FILE_32BIT_MACHINE
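
To make the field layout from the table concrete, here is a minimal sketch that reads the seven COFF fields and names the Characteristics bits. parseCoffHeader, CoffHeader and characteristicNames are hypothetical names chosen for illustration; the flag values are the ones Microsoft documents for IMAGE_FILE_HEADER, and only a subset is listed.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct CoffHeader {
    std::uint16_t machine;
    std::uint16_t numberOfSections;
    std::uint32_t timeDateStamp;
    std::uint32_t pointerToSymbolTable;
    std::uint32_t numberOfSymbols;
    std::uint16_t sizeOfOptionalHeader;
    std::uint16_t characteristics;
};

// Reads `width` bytes (at most 4) as a little-endian value.
static std::uint32_t readLe(const std::vector<std::uint8_t>& b, std::size_t off,
                            std::size_t width) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i)
        value |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return value;
}

// `off` is the file offset of the COFF header, i.e. e_lfanew + 4.
std::optional<CoffHeader> parseCoffHeader(const std::vector<std::uint8_t>& b,
                                          std::size_t off) {
    constexpr std::size_t kCoffSize = 20;
    if (b.size() < kCoffSize || off > b.size() - kCoffSize) return std::nullopt;
    CoffHeader h{};
    h.machine = static_cast<std::uint16_t>(readLe(b, off + 0, 2));
    h.numberOfSections = static_cast<std::uint16_t>(readLe(b, off + 2, 2));
    h.timeDateStamp = readLe(b, off + 4, 4);
    h.pointerToSymbolTable = readLe(b, off + 8, 4);
    h.numberOfSymbols = readLe(b, off + 12, 4);
    h.sizeOfOptionalHeader = static_cast<std::uint16_t>(readLe(b, off + 16, 2));
    h.characteristics = static_cast<std::uint16_t>(readLe(b, off + 18, 2));
    return h;
}

// Maps set bits to their documented names (subset only).
std::vector<std::string> characteristicNames(std::uint16_t value) {
    static const struct { std::uint16_t bit; const char* name; } kFlags[] = {
        {0x0001, "IMAGE_FILE_RELOCS_STRIPPED"},
        {0x0002, "IMAGE_FILE_EXECUTABLE_IMAGE"},
        {0x0004, "IMAGE_FILE_LINE_NUMS_STRIPPED"},
        {0x0008, "IMAGE_FILE_LOCAL_SYMS_STRIPPED"},
        {0x0020, "IMAGE_FILE_LARGE_ADDRESS_AWARE"},
        {0x0100, "IMAGE_FILE_32BIT_MACHINE"},
        {0x0200, "IMAGE_FILE_DEBUG_STRIPPED"},
        {0x2000, "IMAGE_FILE_DLL"},
    };
    std::vector<std::string> names;
    for (const auto& f : kFlags)
        if (value & f.bit) names.emplace_back(f.name);
    return names;
}
```

For the 24 example bytes shown above (signature included, so the COFF header starts at offset 4), the fields decode to Machine 0x8664, 9 sections, SizeOfOptionalHeader 0xF0 and Characteristics 0x0023.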

Relevance for Malware Research

Timestamps are one of the artifacts used to support attribution. However, a timestamp should be treated as a clue, not truth. Compare the COFF timestamp against the Rich Header, debug directory, certificate timestamp, resource timestamps, and campaign context. The COFF timestamp is one of several PE locations that may carry time information, and malware authors and packers can easily tamper with it.
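
Comparing timestamps is easier once the raw value is rendered as a UTC date. The sketch below does this with the C++ standard library; formatTimeDateStamp is a hypothetical helper, and std::gmtime is used for brevity even though it is not thread-safe.

```cpp
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <string>

// Renders a COFF TimeDateStamp (seconds since 1970-01-01 UTC) as "YYYY-MM-DD HH:MM:SS".
std::string formatTimeDateStamp(std::uint32_t stamp) {
    const std::time_t t = static_cast<std::time_t>(stamp);
    const std::tm* utc = std::gmtime(&t);  // shared static buffer: not thread-safe
    if (utc == nullptr) return {};
    char buf[32];
    const std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", utc);
    return std::string(buf, n);
}
```

The TimeDateStamp bytes F9 4D 24 60 from the COFF example correspond to the value 0x60244DF9, which this function renders as 2021-02-10 21:19:53 (UTC). A value of zero, a date far in the past, or a date in the future is a reason to cross-check the other time sources mentioned above.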

Optional Header

Despite its name, the Optional Header is present in every image file and provides information to the loader. As defined by Microsoft, the Optional Header is only optional in object files. The first step is to validate the Optional Header magic number and confirm our previous assumption about whether the executable is 32-bit or 64-bit.

The layout depends on the format. PE32 (32-bit) files contain the BaseOfData field, which does not exist in PE32+ (64-bit) files. Furthermore, in PE32+ files ImageBase and the four stack and heap size fields are 8 bytes in size instead of 4.

Within the code of PEDetect, we account for this with separate PE32 and PE32+ parsers. Using PEDetect, the output will display all significant values that can be found in the header.

Offset (PE32/PE32+)	Size (PE32/PE32+)	Field	Description
0	2	Magic	Specifies file format. If the value is 0x10B it represents a 32-bit executable, if the value is 0x20B it represents a 64-bit executable.
2	1	MajorLinkerVersion	The linker major version number.
3	1	MinorLinkerVersion	The linker minor version number.
4	4	SizeOfCode	The size of the code (text) section(s).
8	4	SizeOfInitializedData	The size of the initialized data section(s) (.data, .rdata, resources, etc.).
12	4	SizeOfUninitializedData	The size of the uninitialized data section(s) (BSS).
16	4	AddressOfEntryPoint	The value in this field represents the Relative Virtual Address where the Windows loader transfers control after mapping the image and performing loader-managed initialization. It is not necessarily main, WinMain, or the first code that runs in the process. TLS callbacks and runtime startup code may execute before developer-controlled logic.
20	4	BaseOfCode	The Relative Virtual Address of the beginning-of-code section.
24	4	BaseOfData	The Relative Virtual Address of the beginning-of-data section. Note that this field does not exist in PE32+ executables.
28/24	4/8	ImageBase	The preferred address of the first byte of image when loaded into memory
32/32	4	SectionAlignment	The alignment in bytes of sections when they are loaded into memory.
36/36	4	FileAlignment	The alignment in bytes that is used to align the raw data of sections.
40/40	2	MajorOperatingSystemVersion	The major version number of the required operating system.
42/42	2	MinorOperatingSystemVersion	The minor version number of the required operating system.
44/44	2	MajorImageVersion	The major version number of the image.
46/46	2	MinorImageVersion	The minor version number of the image.
48/48	2	MajorSubsystemVersion	The major version number of the subsystem.
50/50	2	MinorSubsystemVersion	The minor version number of the subsystem.
52/52	4	Win32VersionValue	By default this value must be zero.
56/56	4	SizeOfImage	The size (in bytes) of the image, including all headers, as the image is loaded in memory.
60/60	4	SizeOfHeaders	The combined size of an MS-DOS stub, PE header and section headers.
64/64	4	CheckSum	The image file checksum. Important for drivers and some system images; often zero or ignored for ordinary user-mode executables.
68/68	2	Subsystem	The subsystem that is required to run this image. The full list of values and corresponding descriptions can be found on the Microsoft website. Examples: 0x02 -> Windows GUI, 0x03 -> Windows Console
70/70	2	DllCharacteristics	The DllCharacteristics field defines security and execution characteristics for a binary, such as whether it supports Address Space Layout Randomization (ASLR) or Data Execution Prevention (DEP). From a malware perspective, this is a valuable field because it shows which mitigations the binary opts into or avoids. Common flags include: HighEntropyVirtualAddressSpace (0x0020): used for 64-bit images to support high-entropy ASLR; DynamicBase (0x0040): enables ASLR, allowing the image to be relocated at load time; ForceIntegrity (0x0080): enforces code integrity checks; NxCompat (0x0100): indicates the image is compatible with Non-eXecutable (NX) memory protection; NoSeh (0x0400): specifies that the image does not use Structured Exception Handling (SEH); AppContainer (0x1000): requires the image to run inside an AppContainer; ControlFlowGuard (0x4000): indicates support for the Microsoft Control Flow Guard security mitigation.
72/72	4/8	SizeOfStackReserve	The size of the stack to reserve. The Microsoft linker defaults to a stack reserve of 1 MB.
76/80	4/8	SizeOfStackCommit	The size of the stack to commit.
80/88	4/8	SizeOfHeapReserve	The size of the heap to reserve.
84/96	4/8	SizeOfHeapCommit	The size of the heap to commit.
88/104	4	LoaderFlags	Reserved, must be zero.
92/108	4	NumberOfRvaAndSizes	The number of data-directory entries in the remainder of the optional header. By using the value in this field, we can determine the size in bytes we have to read to capture all the fields of the Data Directory. Most often, the value of this field is 0x10 or 16 which covers the standard PE directories like the Import and Export Tables and Import Address Table.

Example:
  0B 02 08 00 00 6E 53 00  00 EE 15 00 00 00 00 00  .....nS..î......
  30 7B 53 00 00 10 00 00  00 00 40 00 00 00 00 00  0{S.......@.....
  00 10 00 00 00 02 00 00  05 00 01 00 05 00 02 00  ................
  05 00 01 00 00 00 00 00  00 80 6A 00 00 04 00 00  .........€j.....
  CE 95 69 00 02 00 40 01  00 00 10 00 00 00 00 00  Î•i...@.........
  00 40 00 00 00 00 00 00  00 00 10 00 00 00 00 00  .@..............
  00 20 00 00 00 00 00 00  00 00 00 00 10 00 00 00  . ..............
  ...

Output from PEDetect:
  ---- Parsing the Optional Header ----
  [+] File is 32-bit executable format - determined based on the Magic of the Optional Header
  [+] Major Linker Version number is: 0x5
      [+] Microsoft Linker 5.x - Visual Studio 97
  [+] Minor Linker Version number is: 0xC
      [+] Microsoft Linker Version 7.0 - corresponding to Visual Studio .NET 2002
  [+] The total size of all sections that contain executable code is: 200 bytes
  [+] The total size of all sections that contain initialized data is: 1800 bytes
  [+] The total size of all sections that contain uninitialized data is: 0 bytes
  [+] The entry point is located at address: 0x00000208
  [+] The base of code (in memory) is located at address: 0x00000200
  [+] The base of data (in memory) is located at address: 0x00000400
  [+] The preferred memory address at which the image should be loaded is 0x00400000
  [+] The alignment of sections in memory is: 512 bytes
  [+] The alignment of sections in the file is: 512 bytes
  [+] The alignment of sections in memory is greater than or equal to the sections in the file on disk
  [+] The OS version number (major.minor) is: 4.0
  [+] The Image version number (major.minor) is: 4.0
  [+] Win32VersionValue field is reserved and as such is set to 0
  [+] The total size of the image is: 7168 bytes
  [+] The size of all headers is: 512 bytes
  [+] The checksum value is 0x84e6
  [+] The subsystem required to run the image is: Windows GUI
  [+] The size of memory reserved for the stack is: 0
  [+] The size of memory initially committed for the stack is: 4096 bytes
  [+] The size of memory reserved for the heap is: 0
  [+] The size of memory initially committed for the heap is: 4096 bytes
  [+] Loader Flags are reserved for system use and is set to 0, as expected
  [+] The number of data directories following this field is: 16.
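
The PE32 versus PE32+ difference mostly comes down to a handful of offsets. The sketch below reads a few of the fields from the table above for either format and names the DllCharacteristics bits; parseOptionalHeader, OptionalInfo and dllCharacteristicNames are hypothetical names, and the buffer is assumed to start at the first byte of the Optional Header.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct OptionalInfo {
    bool isPe32Plus;
    std::uint32_t entryPointRva;
    std::uint64_t imageBase;  // 4 bytes in PE32, 8 bytes in PE32+
    std::uint32_t sectionAlignment;
    std::uint32_t fileAlignment;
    std::uint16_t subsystem;
    std::uint16_t dllCharacteristics;
    std::uint32_t numberOfRvaAndSizes;
};

// Reads `width` bytes (at most 8) as a little-endian value.
static std::uint64_t readLe(const std::vector<std::uint8_t>& b, std::size_t off,
                            std::size_t width) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i)
        value |= static_cast<std::uint64_t>(b[off + i]) << (8 * i);
    return value;
}

std::optional<OptionalInfo> parseOptionalHeader(const std::vector<std::uint8_t>& b) {
    if (b.size() < 2) return std::nullopt;
    const std::uint64_t magic = readLe(b, 0, 2);
    if (magic != 0x10B && magic != 0x20B) return std::nullopt;
    const bool plus = (magic == 0x20B);
    const std::size_t fixedSize = plus ? 112 : 96;  // bytes before the Data Directory
    if (b.size() < fixedSize) return std::nullopt;

    OptionalInfo info{};
    info.isPe32Plus = plus;
    info.entryPointRva = static_cast<std::uint32_t>(readLe(b, 16, 4));
    info.imageBase = plus ? readLe(b, 24, 8) : readLe(b, 28, 4);
    info.sectionAlignment = static_cast<std::uint32_t>(readLe(b, 32, 4));
    info.fileAlignment = static_cast<std::uint32_t>(readLe(b, 36, 4));
    info.subsystem = static_cast<std::uint16_t>(readLe(b, 68, 2));
    info.dllCharacteristics = static_cast<std::uint16_t>(readLe(b, 70, 2));
    info.numberOfRvaAndSizes = static_cast<std::uint32_t>(readLe(b, plus ? 108 : 92, 4));
    return info;
}

// Names the mitigation-related bits listed in the table above.
std::vector<std::string> dllCharacteristicNames(std::uint16_t value) {
    static const struct { std::uint16_t bit; const char* name; } kFlags[] = {
        {0x0020, "HighEntropyVirtualAddressSpace"}, {0x0040, "DynamicBase"},
        {0x0080, "ForceIntegrity"},                 {0x0100, "NxCompat"},
        {0x0400, "NoSeh"},                          {0x1000, "AppContainer"},
        {0x4000, "ControlFlowGuard"},
    };
    std::vector<std::string> names;
    for (const auto& f : kFlags)
        if (value & f.bit) names.emplace_back(f.name);
    return names;
}
```

Run against the 112 example bytes above, this yields a PE32+ header with AddressOfEntryPoint 0x537B30, ImageBase 0x400000, SectionAlignment 0x1000, FileAlignment 0x200, the Windows GUI subsystem, DllCharacteristics 0x0140 (DynamicBase and NxCompat) and 16 data directories.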

Relevance for Malware Research

The Optional Header tells you how the loader will map and start the image. For malware analysis, AddressOfEntryPoint, ImageBase, SectionAlignment, FileAlignment, Subsystem, and DllCharacteristics deserve special attention.

Data Directories

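The Data Directory is an array of entries at the end of the Optional Header. Each entry is 8 bytes long and points to a table or structure that the loader or other tools may need, together with its size.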

Offset (PE32/PE32+)	Size	Field name	Description
96/112	8	Export Table	The export table address and size (.edata section)
104/120	8	Import Table	The import table address and size (.idata section)
112/128	8	Resource Table	The resource table address and size (.rsrc section)
120/136	8	Exception Table	The exception table address and size (.pdata section)
128/144	8	Certificate Table	The certificate table address and size
136/152	8	Base Relocation Table	The base relocation table address and size (.reloc section)
144/160	8	Debug	The debug data starting address and size (.debug section)
152/168	8	Architecture	Reserved, must be zero
160/176	8	Global Ptr	The RVA of the value to be stored in the global pointer register
168/184	8	TLS Table	The thread local storage table address and size (.tls section)
176/192	8	Load Config Table	The load configuration table address and size
184/200	8	Bound Import	The bound import table address and size
192/208	8	Import Address Table	The import address table address and size
200/216	8	Delay Import Descriptor	The delay import descriptor address and size
208/224	8	CLR Runtime Header	The CLR runtime header address and size (.cormeta section)
216/232	8	Not specified	Reserved, must be zero

For most Data Directory entries, the first four bytes represent an RVA and the second four bytes represent the size. The Certificate Table is an important exception: its address is a file offset because certificate data is not mapped into memory like normal image data.
Each binary can and likely will have a different Data Directory layout because not every directory needs to be present in every binary.
PEDetect reads each directory entry and checks whether the RVA and size are present. If a directory is present, the tool can attempt to map the RVA to a section and parse the corresponding structure.

Example:
  00 00 00 00 00 00 00 00  00 D0 60 00 F0 59 00 00  .........Ð`.ðY..
  00 30 65 00 00 6E 05 00  00 60 61 00 20 C1 03 00  .0e..n...`a. Á..
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................ 
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
  00 00 00 00 00 00 00 00  00 50 61 00 28 00 00 00  .........Pa.(...
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
  88 E6 60 00 20 15 00 00  00 30 61 00 8A 0D 00 00  ˆæ`. ....0a.Š...
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
  ...

Output from PEDetect:
  ---- Parsing the Data Directory Fields ----
  [+] The Import Table is located at: 0x60d000, and has a size of: 0x59f0 bytes.
  [+] The Resource Table is located at: 0x653000, and has a size of: 0x56e00 bytes.
  [+] The Exception Table is located at: 0x616000, and has a size of: 0x3c120 bytes.
  [+] The TLS Table is located at: 0x615000, and has a size of: 0x28 bytes.
  [+] The Import Address Table is located at: 0x60e688, and has a size of: 0x1520 bytes.
  [+] The Delay Import is located at: 0x613000, and has a size of: 0xd8a bytes.
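
Reading the directory entries can be sketched as below. The names parseDataDirectories, DataDirectory and isPresent are hypothetical; the buffer is assumed to start at the first directory entry, and the count taken from NumberOfRvaAndSizes is deliberately clamped because a malformed sample may claim more entries than the file contains.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DataDirectory {
    std::uint32_t address;  // RVA, except for the Certificate Table (file offset)
    std::uint32_t size;
};

static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

std::vector<DataDirectory> parseDataDirectories(const std::vector<std::uint8_t>& b,
                                                std::uint32_t numberOfRvaAndSizes) {
    constexpr std::size_t kEntrySize = 8;
    constexpr std::size_t kMaxEntries = 16;
    // Clamp the header value to the standard maximum and to the bytes available.
    const std::size_t count = std::min({static_cast<std::size_t>(numberOfRvaAndSizes),
                                        kMaxEntries, b.size() / kEntrySize});
    std::vector<DataDirectory> dirs;
    dirs.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        dirs.push_back({readU32(b, i * kEntrySize), readU32(b, i * kEntrySize + 4)});
    return dirs;
}

// A directory is treated as present only if both address and size are non-zero.
bool isPresent(const DataDirectory& d) { return d.address != 0 && d.size != 0; }
```

In the example bytes above, index 0 (Export Table) is empty, index 1 (Import Table) is 0x60D000 with size 0x59F0, and index 9 (TLS Table) is 0x615000 with size 0x28, which matches the PEDetect output.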

Relevance for Malware Research

Mandiant documented a Ursnif/Gozi-ISFB sample that manipulated TLS callbacks while injecting into a child process. Their report also explains the key teaching point: TLS callbacks can execute before the normal AddressOfEntryPoint, meaning analysts and automated tools can miss the real first malicious code if they only break at the entry point.

Furthermore, data directories are where PE structure starts to become behaviorally meaningful. Imports suggest capability, TLS changes execution order, resources may hide payload/configuration, relocations affect mapping, and the certificate table affects trust decisions.

Sections

We saw in the Data Directories that most tables correspond to a specific section. A section in a PE file contains code or data that linkers and Microsoft Win32 loaders process without special knowledge of the section contents.
With PEDetect, a best-effort attempt has been made at reading and parsing the sections. Mainly the most common sections, such as .text, .rdata, .data and a few others were prioritized. A sample output of PEDetect is displayed below for a few of the parsed sections.

Each section header is 40 bytes long and contains the 10 fields outlined below:

Offset	Size	Field name	Description
0	8	Name	An ASCII string representing the section name
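8	4	VirtualSize	The total size of the section when loaded into memory. If this value is greater than SizeOfRawData, the section is zero-padded in memory; SizeOfRawData can in turn be larger than VirtualSize because it is rounded up to the File Alignment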
12	4	VirtualAddress	The RVA of the section, relative to the image base
16	4	SizeOfRawData	The size of the section data in the file, aligned to the File Alignment
20	4	PointerToRawData	The file offset where the section's data starts
24	4	PointerToRelocations	The file offset of the relocation entries for the section
28	4	PointerToLinenumbers	The file offset of the line number entries for the section
32	2	NumberOfRelocations	The number of relocation entries for the section
34	2	NumberOfLinenumbers	The number of line number entries for the section
36	4	Characteristics	Flags indicating attributes for the section

Example:
  2E 74 65 78 74 00 00 00 C0 6C 53 00 00 10 00 00  .text...ÀlS.....
  00 6E 53 00 00 04 00 00 00 00 00 00 00 00 00 00  .nS.............
  00 00 00 00 20 00 00 60 2E 64 61 74 61 00 00 00  .... ..`.data...
  30 77 0C 00 00 80 53 00 00 78 0C 00 00 72 53 00  0w...€S..x...rS.
  00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 C0  ............@..À
  2E 62 73 73 00 00 00 00 7C C3 00 00 00 00 60 00  .bss....|Ã....`.
  00 00 00 00 00 EA 5F 00 00 00 00 00 00 00 00 00  .....ê_.........
  00 00 00 00 00 00 00 C0 2E 69 64 61 74 61 00 00  .......À.idata..
  F0 59 00 00 00 D0 60 00 00 5A 00 00 00 EA 5F 00  ðY...Ð`..Z...ê_.
  00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 C0  ............@..À
  2E 64 69 64 61 74 61 00 8A 0D 00 00 00 30 61 00  .didata.Š....0a.
  00 0E 00 00 00 44 60 00 00 00 00 00 00 00 00 00  .....D`.........
  00 00 00 00 40 00 00 C0 2E 74 6C 73 00 00 00 00  ....@..À.tls....
  CC 02 00 00 00 40 61 00 00 00 00 00 00 52 60 00  Ì....@a......R`.
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 C0  ...............À
  2E 72 64 61 74 61 00 00 28 00 00 00 00 50 61 00  .rdata..(....Pa.
  00 02 00 00 00 52 60 00 00 00 00 00 00 00 00 00  .....R`.........
  00 00 00 00 40 00 00 40 2E 70 64 61 74 61 00 00  ....@..@.pdata..
  20 C1 03 00 00 60 61 00 00 C2 03 00 00 54 60 00   Á...`a..Â...T`.
  00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 40  ............@..@
  2E 72 73 72 63 00 00 00 94 48 05 00 00 30 65 00  .rsrc...”H...0e.
  00 4A 05 00 00 16 64 00 00 00 00 00 00 00 00 00  .J....d.........
  00 00 00 00 40 00 00 40                          ....@..@
  ...

Output from PEDetect:
  ---- Processing the section information ----
  [+] Parsing section: .text
  [+] The total size of the section in memory is: 0x536cc0
  [+] The relative virtual address of the section in memory is: 0x1000
  [+] The size of the section data in the file is: 0x536e00
  [+] The section's data starts at offset: 0x400
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 200000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_CODE - Contains executable code

  [+] Parsing section: .data
  [+] The total size of the section in memory is: 0xc7730
  [+] The relative virtual address of the section in memory is: 0x538000
  [+] The size of the section data in the file is: 0xc7800
  [+] The section's data starts at offset: 0x537200
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data

  [+] Parsing section: .bss
  [+] The total size of the section in memory is: 0xc37c
  [+] The relative virtual address of the section in memory is: 0x600000
  [+] The size of the section data in the file is: 0x0
  [+] The section's data starts at offset: 0x5fea00
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 0 line-number entries for the section.
  [+] This section has the following attributes:
      [-] No flags have been found

  [+] Parsing section: .idata
  [+] The total size of the section in memory is: 0x59f0
  [+] The relative virtual address of the section in memory is: 0x60d000
  [+] The size of the section data in the file is: 0x5a00
  [+] The section's data starts at offset: 0x5fea00
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data

  [+] Parsing section: .didata
  [+] The total size of the section in memory is: 0xd8a
  [+] The relative virtual address of the section in memory is: 0x613000
  [+] The size of the section data in the file is: 0xe00
  [+] The section's data starts at offset: 0x604400
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data

  [+] Parsing section: .tls
  [+] The total size of the section in memory is: 0x2cc
  [+] The relative virtual address of the section in memory is: 0x614000
  [+] The size of the section data in the file is: 0x0
  [+] The section's data starts at offset: 0x605200
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 0 line-number entries for the section.
  [+] This section has the following attributes:
      [-] No flags have been found

  [+] Parsing section: .rdata
  [+] The total size of the section in memory is: 0x28
  [+] The relative virtual address of the section in memory is: 0x615000
  [+] The size of the section data in the file is: 0x200
  [+] The section's data starts at offset: 0x605200
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data

  [+] Parsing section: .pdata
  [+] The total size of the section in memory is: 0x3c120
  [+] The relative virtual address of the section in memory is: 0x616000
  [+] The size of the section data in the file is: 0x3c200
  [+] The section's data starts at offset: 0x605400
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data

  [+] Parsing section: .rsrc
  [+] The total size of the section in memory is: 0x54894
  [+] The relative virtual address of the section in memory is: 0x653000
  [+] The size of the section data in the file is: 0x54a00
  [+] The section's data starts at offset: 0x641600
  [+] The relocation entries start at offset: 0x0
  [+] The line-number entries for the section start at offset: 0x0
  [+] There are a total of 400000 line-number entries for the section.
  [+] This section has the following attributes:
      [+] IMAGE_SCN_CNT_INITIALIZED - Contains initialized data
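
Section headers are also what makes it possible to turn an RVA from the Data Directory into a file offset. The sketch below parses one 40-byte header and performs that mapping; parseSectionHeader, SectionHeader and rvaToFileOffset are hypothetical names, and the mapping is simplified in that it ignores alignment rounding and RVAs that fall inside the headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct SectionHeader {
    std::string name;
    std::uint32_t virtualSize;
    std::uint32_t virtualAddress;
    std::uint32_t sizeOfRawData;
    std::uint32_t pointerToRawData;
    std::uint16_t numberOfLinenumbers;
    std::uint32_t characteristics;
};

static std::uint32_t readU32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// Parses the 40-byte section header that starts at `off`.
std::optional<SectionHeader> parseSectionHeader(const std::vector<std::uint8_t>& b,
                                                std::size_t off) {
    constexpr std::size_t kSize = 40;
    if (b.size() < kSize || off > b.size() - kSize) return std::nullopt;
    SectionHeader s{};
    // The name is NUL-padded to 8 bytes and not necessarily NUL-terminated.
    for (std::size_t i = 0; i < 8 && b[off + i] != 0; ++i)
        s.name.push_back(static_cast<char>(b[off + i]));
    s.virtualSize = readU32(b, off + 8);
    s.virtualAddress = readU32(b, off + 12);
    s.sizeOfRawData = readU32(b, off + 16);
    s.pointerToRawData = readU32(b, off + 20);
    s.numberOfLinenumbers = static_cast<std::uint16_t>(b[off + 34] | (b[off + 35] << 8));
    s.characteristics = readU32(b, off + 36);
    return s;
}

// Maps an RVA to a file offset if it falls inside the file-backed part of a section.
std::optional<std::uint32_t> rvaToFileOffset(const std::vector<SectionHeader>& sections,
                                             std::uint32_t rva) {
    for (const auto& s : sections) {
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.sizeOfRawData)
            return s.pointerToRawData + (rva - s.virtualAddress);
    }
    return std::nullopt;
}
```

Decoding the first 40 bytes of the example gives the .text header with VirtualAddress 0x1000, PointerToRawData 0x400, NumberOfLinenumbers 0 and Characteristics 0x60000020 (code, executable, readable). With only that section loaded, the entry point RVA 0x537B30 from the Optional Header example maps to file offset 0x536F30.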

Relevance for Malware Research

Packed files commonly show symptoms such as few imports, high-entropy regions, unusual section names, and entry points in unexpected places. Entropy is especially useful because compressed or encrypted regions often create visible entropy shifts, although entropy alone is not a verdict.
During triage, analysts should look for overlapping or misaligned sections, invalid entry-point mappings, corrupted data directories, malformed imports, fake UPX names, and packed-lookalike layouts.
When analyzing Section headers we can ask: "Does the file layout look like a normal compiler produced it, or does it look transformed by a packer, protector, loader, or adversarial manipulation?"
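
Two of the checks mentioned above are easy to sketch: the Shannon entropy of a section's raw bytes and a test for sections that are both writable and executable. shannonEntropy and isWritableAndExecutable are hypothetical helper names; the two flag values are the documented IMAGE_SCN_MEM_EXECUTE and IMAGE_SCN_MEM_WRITE bits, and any threshold applied to the entropy value is left to the analyst.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniformly random bytes.
double shannonEntropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t byte : data) ++counts[byte];
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / static_cast<double>(data.size());
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Flags sections that are mapped both writable and executable.
bool isWritableAndExecutable(std::uint32_t characteristics) {
    constexpr std::uint32_t kMemExecute = 0x20000000;  // IMAGE_SCN_MEM_EXECUTE
    constexpr std::uint32_t kMemWrite = 0x80000000;    // IMAGE_SCN_MEM_WRITE
    return (characteristics & kMemExecute) != 0 && (characteristics & kMemWrite) != 0;
}
```

Compressed or encrypted data tends to score close to 8 bits per byte, while ordinary code and data usually score noticeably lower. As with every indicator in this post, a high value or a writable and executable section is a reason to look closer, not a verdict.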

Conclusion

After parsing the individual headers, PEDetect can be used as a triage aid rather than simply as a PE structure viewer. A clean baseline executable should show coherent header offsets, expected section names, reasonable alignment values, a plausible import table, and section permissions that match their purpose.

For suspicious samples, the same fields can reveal weak signals: an unusual entry point, a missing or tiny import table, high-entropy sections, suspicious section permissions, malformed data directories, a stripped or inconsistent Rich Header, or timestamps that do not align with the rest of the file.

None of these indicators proves maliciousness on its own. The value of PE-header analysis is that it tells the analyst where to look next.
