Reverse Engineering Deep Dive

The 13-Stage Scan Pipeline

Inside Windows Defender's mpengine.dll

13 Pipeline Stages
14.3 MB Binary Size
90 Exports

Note: If it wasn't obvious, this knowledge base was created as an LLM-driven experiment — the documentation, slide decks, and structural analysis were generated with significant AI assistance. The content will be iterated on and improved over time as findings are validated and refined through hands-on reverse engineering.

What Happens When You Scan a File?

When Windows Defender scans a file, the file passes through 13 distinct stages inside a single monolithic DLL, mpengine.dll. Each stage can detect threats, collect attributes, or recursively invoke the entire pipeline on extracted content.

The pipeline is sequential but has three recursive feedback loops: PE emulation unpacking, container extraction, and script deobfuscation. A single ZIP containing a macro-enabled Word doc could trigger hundreds of recursive scans.
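To make the amplification concrete, here is a minimal counting sketch. It assumes every extracted or deobfuscated object costs exactly one scan, and the fan-out numbers (2 documents in the ZIP, 20 embedded parts per document, 4 deobfuscated layers per part) are hypothetical, chosen only for illustration. `Node`, `uniform`, and `total_scans` are illustrative names, not engine symbols.

```rust
/// One scanned object plus the objects extracted from it.
/// Hypothetical model type, not an engine structure.
pub struct Node {
    pub children: Vec<Node>,
}

/// Build a tree in which every node at level `i` has `fanouts[i]` children.
pub fn uniform(fanouts: &[usize]) -> Node {
    match fanouts.split_first() {
        None => Node { children: Vec::new() },
        Some((&n, rest)) => Node {
            children: (0..n).map(|_| uniform(rest)).collect(),
        },
    }
}

/// Count one scan for the node itself plus one per descendant.
pub fn total_scans(node: &Node) -> usize {
    1 + node.children.iter().map(total_scans).sum::<usize>()
}
```

With the assumed fan-outs, `total_scans(&uniform(&[2, 20, 4]))` counts 1 + 2 + 40 + 160 = 203 scans for a single top-level file.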

Every address and string cited comes from reverse engineering the actual mpengine.dll binary.

16 Sig Engines
70+ Container Formats
4 Script Languages
59K Lua Scripts
9.3M TLV Entries

Entry Points

How scan requests enter mpengine.dll — 90 exports, 2 primary dispatchers

Export Table

// Primary exports
__rsignal           @ 0x10133CD0  // Command router
rsignal             @ 0x102BF000  // Newer dispatch
MpBootStrap         @ 0x102BD660  // Init engine
MpContainerOpen     @ 0x102BCF00  // Container API
MpContainerAnalyze  @ 0x102BCB80  // Container scan
GetSigFiles         @ 0x102BEE10  // VDM enumeration

Command Codes

0x4003  BOOT_ENGINE    // Load signatures
0x400B  SCAN_BUFFER    // Primary scan
0x4019  SCAN_AMSI      // Script content
0x4036  SCAN_FILE      // File scan (new)
0x4052  SCAN_DISPATCH  // Direct dispatch

__rsignal Dispatch

; @ 0x10133CD0
push ebp
mov  ebp, esp
and  esp, 0xFFFFFFF8
mov  eax, [ebp+0xc]    ; cmd code
cmp  eax, 0x4003       ; BOOT
je   handler
cmp  eax, 0x400B       ; SCAN
je   handler
cmp  eax, 0x4019       ; AMSI
je   handler
; sub-dispatch @ 0x10133D35

0x10C707B0  Global engine context pointer
0x10CA5654  Engine initialized flag byte
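The compare-and-jump chain in the dispatch listing can be read as a plain match on the command code. The sketch below models only the three comparisons visible in the listing and sends everything else to the sub-dispatch. Where 0x4036 and 0x4052 are actually handled is not shown in the listing, so routing them to the sub-dispatch here is an assumption. `Route` and `route_command` are hypothetical names.

```rust
// Command codes compared directly in the __rsignal listing.
const BOOT_ENGINE: u32 = 0x4003;
const SCAN_BUFFER: u32 = 0x400B;
const SCAN_AMSI: u32 = 0x4019;

/// Hypothetical label for where a command ends up.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Route {
    BootEngine,
    ScanBuffer,
    ScanAmsi,
    /// Any other code falls through to the sub-dispatch at 0x10133D35.
    SubDispatch,
}

/// Mirror the cmp/je chain: test each known code, else fall through.
pub fn route_command(cmd: u32) -> Route {
    match cmd {
        BOOT_ENGINE => Route::BootEngine,
        SCAN_BUFFER => Route::ScanBuffer,
        SCAN_AMSI => Route::ScanAmsi,
        _ => Route::SubDispatch,
    }
}
```

For example, `route_command(0x400B)` yields `Route::ScanBuffer`, while an unlisted code such as `0x1234` yields `Route::SubDispatch`.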

The Complete Pipeline

Every stage a file passes through, from entry to verdict

Entry → FRIENDLY → Static x9 → Attr Collect → PE Emulation → Unpack → Containers → Script Deob → BRUTE → Lua → AAGG Eval → MAPS → Verdict

SigTree ML (cross-cutting): 33,428 decision trees evaluate over all accumulated sigattr

Stage 2: SHA-256 whitelist
Stages 3-4: 9 engines + attrs
Stages 5-8: Emu + Unpack + Deob
Stages 9-13: BRUTE + Lua + Cloud

Recursive Feedback Loops

Three stages can trigger re-scanning through the entire pipeline

PE Unpacking

After emulation, modified PE sections and VFS-dropped files are fed back through the full pipeline.

Stage 6 → Stage 2

Container Extraction

Each child file from ZIP, OLE2, PDF, etc. gets a full recursive scan. Depth-limited by DBVAR.

Stage 7 → Stage 2

Script Deobfuscation

Each deobfuscated layer is scanned through Stages 3-10. Up to 32 passes per script.

Stage 8 → Stage 3
// Simplified recursive scan pseudocode
fn scan_recursive(data: &[u8], depth: u32) {
    if depth > MAX_DEPTH {
        return;
    }
    run_static_engines(data);                  // Stage 3
    if is_pe(data) {
        let unpacked = emulate_pe(data);       // Stage 5
        scan_recursive(&unpacked, depth + 1);  // Stage 6
    }
    for child in extract_container(data) {     // Stage 7
        scan_recursive(&child, depth + 1);
    }
    deobfuscate_script(data);                  // Stage 8
    run_brute(data);                           // Stage 9
    run_lua_scripts(data);                     // Stage 10
}
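The script deobfuscation loop described above (repeat until nothing changes, at most 32 passes) can be sketched as a bounded fixed-point iteration. The two transforms below are toy stand-ins invented for illustration: they are not the engine's actual transforms. Only the 32-pass cap comes from the text. Each returned layer is what would be handed back to the scanner.

```rust
/// Cap taken from the text: up to 32 passes per script.
const MAX_DEOB_PASSES: usize = 32;

/// Toy transform (hypothetical): fold "a"+"b" into "ab".
fn fold_concat(script: &str) -> String {
    script.replace("\"+\"", "")
}

/// Toy transform (hypothetical): drop batch-style caret escapes.
fn strip_carets(script: &str) -> String {
    script.replace('^', "")
}

/// Apply all transforms repeatedly and collect every new layer.
/// Stops at a fixed point or after MAX_DEOB_PASSES passes.
pub fn deobfuscate_layers(script: &str) -> Vec<String> {
    let transforms: [fn(&str) -> String; 2] = [fold_concat, strip_carets];
    let mut layers = Vec::new();
    let mut current = script.to_string();
    for _ in 0..MAX_DEOB_PASSES {
        let next = transforms
            .iter()
            .fold(current.clone(), |acc, transform| transform(&acc));
        if next == current {
            break; // fixed point: no transform changed anything
        }
        layers.push(next.clone());
        current = next;
    }
    layers
}
```

The input `"a"^+"b"` needs two passes: the first removes the caret, which exposes a concatenation that the second pass folds. An input with nothing to undo produces no layers, so it triggers no extra scans.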

All 13 Stages at a Glance

01  Entry Point                 __rsignal dispatch
02  FRIENDLY_FILE               SHA-256 whitelist
03  Static Cascade              9 static sig engines
04  Attr Collection             HashSet<String>
05  PE Emulation                x86/x64 CPU emu
06  Unpack Scan                 Post-emu rescan
07  Containers                  70+ formats
08  Script Deob                 PS/VBS/JS/BAT
09  BRUTE                       Polymorphic match
10  Lua Scripts                 59K scripts
11  AAGG Eval                   Boolean expressions
12  MAPS Cloud                  Bond + FASTPATH
13  Verdict Resolution          Merge detections → final threat name
14  SigTree ML (cross-cutting)  33,428 ML decision trees; !MTB / !ml detections

Key Decision Points

What triggers each major pipeline branch?

FRIENDLY_FILE: SHA-256 in whitelist → skip entire pipeline → return Clean
Static Detection: High-confidence match → may skip emulation (configurable via attributes)
PE Emulation: is_pe(data) → trigger emulation. Controlled by pea_force_unpacking, pea_disable_static_unpacking
Container Extract: Magic byte detection → format-specific extractor. Depth limited by DBVAR
Script Deob: Language auto-detect (extension + content) → apply transforms until fixed-point
MAPS Cloud: Lowfi detection + MAPS enabled → Bond serialize → POST to MAPS endpoint (*.wdcp.microsoft.com, e.g. fastpath)
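As an illustration of the Container Extract decision, the sketch below classifies a buffer by its leading magic bytes and applies a depth limit before extraction. The signatures are the standard published ones for each format. Real PE detection needs more than the two-byte `MZ` check, so treat that arm as a simplification. `ContentKind`, `sniff`, `should_extract`, and the `max_depth` parameter are hypothetical names, and the actual DBVAR that holds the limit is not named in the source.

```rust
/// Hypothetical content classes for routing to an extractor.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ContentKind {
    Pe,
    Zip,
    Ole2,
    Pdf,
    Unknown,
}

/// Classify a buffer by its leading magic bytes.
pub fn sniff(data: &[u8]) -> ContentKind {
    // OLE2 compound file signature.
    const OLE2: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
    if data.starts_with(b"MZ") {
        ContentKind::Pe // simplified: a real check also validates the PE header
    } else if data.starts_with(b"PK\x03\x04") {
        ContentKind::Zip // ZIP local file header
    } else if data.starts_with(&OLE2) {
        ContentKind::Ole2
    } else if data.starts_with(b"%PDF-") {
        ContentKind::Pdf
    } else {
        ContentKind::Unknown
    }
}

/// Extract only known container formats, and only within the depth limit.
pub fn should_extract(kind: ContentKind, depth: u32, max_depth: u32) -> bool {
    let is_container = matches!(
        kind,
        ContentKind::Zip | ContentKind::Ole2 | ContentKind::Pdf
    );
    is_container && depth < max_depth
}
```

A buffer starting with `PK\x03\x04` is classified as `ContentKind::Zip`, and `should_extract` refuses it once `depth` reaches `max_depth`, which is what keeps nested archives from recursing without bound.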

Data Flow Between Stages

What passes between pipeline stages

ScanContext (per-file)

use std::collections::HashSet;

struct ScanContext<'a> {
    file_data: &'a [u8],
    file_path: Option<&'a str>,
    file_size: u64,
    content_type: ContentType,
    scan_depth: u32,
    // Accumulated across all stages:
    attributes: HashSet<String>,
    threat_list: Vec<ThreatRecord>,
    // PE-specific:
    pe_info: Option<PeMetadata>,
    emulator_ctx: Option<EmuContext>,
    scan_flags: u32,
}

Attribute Flow

Static Engines → deposit HSTR: / SIGATTR: attributes
PE Emulation → deposit FOP: / TUNNEL: / THREAD: attributes
Lua Scripts → deposit custom attributes via mp.setattribute()
AAGGREGATOR evaluates boolean expressions over all collected attributes
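The attribute flow above ends in boolean evaluation over a shared set of strings. A minimal sketch of that idea follows, assuming expressions are built from attribute tests combined with AND, OR, and NOT. The `Expr` type and `attr` helper are hypothetical, the real expression encoding is not described here, and the attribute names used in the usage note are invented (only the `HSTR:`, `FOP:`, `TUNNEL:`, and `SIGATTR:` prefixes come from the text).

```rust
use std::collections::HashSet;

/// Hypothetical boolean expression over collected attributes.
pub enum Expr {
    /// True when the named attribute has been deposited.
    Attr(String),
    Not(Box<Expr>),
    And(Vec<Expr>),
    Or(Vec<Expr>),
}

impl Expr {
    /// Evaluate against everything the earlier stages deposited.
    pub fn eval(&self, attrs: &HashSet<String>) -> bool {
        match self {
            Expr::Attr(name) => attrs.contains(name),
            Expr::Not(inner) => !inner.eval(attrs),
            Expr::And(terms) => terms.iter().all(|t| t.eval(attrs)),
            Expr::Or(terms) => terms.iter().any(|t| t.eval(attrs)),
        }
    }
}

/// Convenience constructor for an attribute test.
pub fn attr(name: &str) -> Expr {
    Expr::Attr(name.to_string())
}
```

With the set `{"HSTR:Packed", "FOP:Decrypt"}`, the rule `HSTR:Packed AND (FOP:Decrypt OR TUNNEL:X) AND NOT SIGATTR:Signed` evaluates to true. Adding `SIGATTR:Signed` to the set flips it to false. Because every stage writes into the same set, a rule can combine evidence from static engines, emulation, and Lua scripts without those stages knowing about each other.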

Pipeline By The Numbers

59K Lua Scripts
70+ Container Formats
33,428 SIG_TREE ML Trees
9.3M TLV Entries
14.3 MB Binary Size
Also tracked: Threats Defined, MD5 Hashes, PEHSTR Rules, KCRCE Entries, FOP Rules, WinAPI Handlers, Virtual DLLs, Deob Transforms

All data from reverse engineering mpengine.dll v1.1.24120.x

Reverse engineering of mpengine.dll — Windows Defender scan pipeline internals