Reverse Engineering Deep Dive

The 13-Stage Scan Pipeline

Inside Windows Defender's mpengine.dll

13 Pipeline Stages

14.3 MB Binary Size

90 Exports

0Threats

Note: If it wasn't obvious, this knowledge base was created as an LLM-driven experiment — the documentation, slide decks, and structural analysis were generated with significant AI assistance. The content will be iterated on and improved over time as findings are validated and refined through hands-on reverse engineering.

What Happens When You Scan a File?

When Windows Defender scans a file, the file passes through 13 distinct stages inside a single monolithic DLL, mpengine.dll. Each stage can detect threats, collect attributes, or recursively invoke the entire pipeline on extracted content.

The pipeline is sequential but has three recursive feedback loops: PE emulation unpacking, container extraction, and script deobfuscation. A single ZIP containing a macro-enabled Word doc could trigger hundreds of recursive scans.

Every address and string cited comes from reverse engineering the actual mpengine.dll binary.

16 Sig Engines

70+ Container Formats

4 Script Languages

0WinAPI Handlers

59K Lua Scripts

9.3M TLV Entries

Entry Points

How scan requests enter mpengine.dll — 90 exports, 2 primary dispatchers

Export Table

// Primary exports
__rsignal           @ 0x10133CD0  // Command router
rsignal             @ 0x102BF000  // Newer dispatch
MpBootStrap         @ 0x102BD660  // Init engine
MpContainerOpen     @ 0x102BCF00  // Container API
MpContainerAnalyze  @ 0x102BCB80  // Container scan
GetSigFiles         @ 0x102BEE10  // VDM enumeration

Command Codes

0x4003  BOOT_ENGINE    // Load signatures
0x400B  SCAN_BUFFER    // Primary scan
0x4019  SCAN_AMSI      // Script content
0x4036  SCAN_FILE      // File scan (new)
0x4052  SCAN_DISPATCH  // Direct dispatch

__rsignal Dispatch

; @ 0x10133CD0
push  ebp
mov   ebp, esp
and   esp, 0xFFFFFFF8
mov   eax, [ebp+0xc]  ; cmd code
cmp   eax, 0x4003     ; BOOT
je    handler
cmp   eax, 0x400B     ; SCAN
je    handler
cmp   eax, 0x4019     ; AMSI
je    handler
; sub-dispatch @ 0x10133D35

0x10C707B0: Global engine context pointer

0x10CA5654: Engine initialized flag byte
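The command-code table and the compare-and-jump chain above amount to a lookup from a numeric code to a handler. A minimal Rust sketch of that idea follows. The five codes and their meanings come from the table; the `Command` enum, `from_code`, and `route` are hypothetical names introduced for illustration and do not correspond to symbols in the binary.

```rust
/// Command codes from the table above. Variant names mirror the labels
/// there; the type itself is an illustrative assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    BootEngine,   // 0x4003
    ScanBuffer,   // 0x400B
    ScanAmsi,     // 0x4019
    ScanFile,     // 0x4036
    ScanDispatch, // 0x4052
}

impl Command {
    /// Decode a raw command code; codes not in the table yield `None`.
    pub fn from_code(code: u32) -> Option<Command> {
        match code {
            0x4003 => Some(Command::BootEngine),
            0x400B => Some(Command::ScanBuffer),
            0x4019 => Some(Command::ScanAmsi),
            0x4036 => Some(Command::ScanFile),
            0x4052 => Some(Command::ScanDispatch),
            _ => None,
        }
    }
}

/// Hypothetical router: returns a description of the handler that would
/// run, standing in for the `je handler` branches in the disassembly.
pub fn route(code: u32) -> &'static str {
    match Command::from_code(code) {
        Some(Command::BootEngine) => "load signatures",
        Some(Command::ScanBuffer) => "primary scan",
        Some(Command::ScanAmsi) => "script content scan",
        Some(Command::ScanFile) => "file scan",
        Some(Command::ScanDispatch) => "direct dispatch",
        None => "unhandled",
    }
}
```

Calling `route(0x400B)` selects the primary scan branch, and a code outside the table falls through to the catch-all arm. The real dispatcher hands unmatched codes to a sub-dispatch routine rather than rejecting them, which this sketch does not model.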

The Complete Pipeline

Every stage a file passes through, from entry to verdict

Entry → FRIENDLY → Static x9 → Attr Collect → PE Emulation → Unpack → Containers → Script Deob → BRUTE → Lua → AAGG Eval → MAPS → Verdict

SigTree ML (cross-cutting): 33,428 decision trees are evaluated over all accumulated sigattr attributes

Stage 2: SHA-256 whitelist

Stages 3-4: 9 engines + attrs

Stages 5-8: Emu + Unpack + Deob

Stages 9-13: BRUTE + Lua + Cloud

Recursive Feedback Loops

Three stages can trigger re-scanning through the entire pipeline

PE Unpacking

After emulation, modified PE sections and VFS-dropped files are fed back through the full pipeline.

Stage 6 → Stage 2

Container Extraction

Each child file from ZIP, OLE2, PDF, etc. gets a full recursive scan. Depth-limited by DBVAR.

Stage 7 → Stage 2

Script Deobfuscation

Each deobfuscated layer is scanned through Stages 3-10. Up to 32 passes per script.

Stage 8 → Stage 3

// Simplified recursive scan pseudocode
fn scan_recursive(data: &[u8], depth: u32) {
    if depth > MAX_DEPTH { return; }          // depth limit (DBVAR-controlled)
    run_static_engines(data);                 // Stage 3
    if is_pe(data) {
        let unpacked = emulate_pe(data);      // Stage 5
        scan_recursive(&unpacked, depth + 1); // Stage 6: rescan unpacked image
    }
    for child in extract_container(data) {    // Stage 7
        scan_recursive(&child, depth + 1);    // each child gets a full scan
    }
    deobfuscate_script(data);                 // Stage 8 (each layer is rescanned)
    run_brute(data);                          // Stage 9
    run_lua_scripts(data);                    // Stage 10
}
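The script deobfuscation loop described above (apply transforms until a fixed point, capped at 32 passes) can be sketched as follows. Only the 32-pass cap and the fixed-point idea come from the text. The `Transform` type, the two toy transforms, and the `deobfuscate` function are assumptions made for illustration, not the engine's actual transforms.

```rust
/// Upper bound on passes, taken from the "up to 32 passes" figure above.
const MAX_PASSES: usize = 32;

/// A transform rewrites a script, or returns `None` if it does not apply.
/// (Hypothetical signature.)
type Transform = fn(&str) -> Option<String>;

/// Toy transform: join string literals that were split with `'+'`.
fn join_concat(src: &str) -> Option<String> {
    if src.contains("'+'") { Some(src.replace("'+'", "")) } else { None }
}

/// Toy transform: strip PowerShell-style backtick escapes.
fn strip_backticks(src: &str) -> Option<String> {
    if src.contains('`') { Some(src.replace('`', "")) } else { None }
}

/// Apply every transform repeatedly until a pass changes nothing or the
/// pass budget is spent. Returns the script as it stood after each
/// productive pass, so each layer could be handed back to the scanner.
fn deobfuscate(src: &str, transforms: &[Transform]) -> Vec<String> {
    let mut layers = Vec::new();
    let mut current = src.to_string();
    for _ in 0..MAX_PASSES {
        let mut changed = false;
        for &t in transforms {
            if let Some(next) = t(&current) {
                if next != current {
                    current = next;
                    changed = true;
                }
            }
        }
        if !changed {
            break; // fixed point reached
        }
        layers.push(current.clone());
    }
    layers
}
```

Returning every intermediate layer, rather than only the final text, matches the description that each deobfuscated layer is scanned in its own right. The pass cap keeps a transform pair that keeps undoing each other from looping forever.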

All 13 Stages at a Glance

01 Entry Point: __rsignal dispatch

02 FRIENDLY_FILE: SHA-256 whitelist

03 Static Cascade: 9 static sig engines

04 Attr Collection: HashSet<String>

05 PE Emulation: x86/x64 CPU emu

06 Unpack Scan: Post-emu rescan

07 Containers: 70+ formats

08 Script Deob: PS/VBS/JS/BAT

09 BRUTE: Polymorphic match

10 Lua Scripts: 59K scripts

11 AAGG Eval: Boolean expressions

12 MAPS Cloud: Bond + FASTPATH

13 Verdict Resolution: Merge detections → final threat name

14 SigTree ML (cross-cutting): 33,428 ML decision trees (!MTB / !ml detections)

Key Decision Points

What triggers each major pipeline branch?

FRIENDLY_FILE: SHA-256 in whitelist → skip entire pipeline → return Clean

Static Detection: High-confidence match → may skip emulation (configurable via attributes)

PE Emulation: is_pe(data) → trigger emulation. Controlled by pea_force_unpacking, pea_disable_static_unpacking

Container Extract: Magic byte detection → format-specific extractor. Depth limited by DBVAR

Script Deob: Language auto-detect (extension + content) → apply transforms until fixed-point

MAPS Cloud: Lowfi detection + MAPS enabled → Bond serialize → POST to MAPS endpoint (*.wdcp.microsoft.com, e.g. fastpath)
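Several of the decision points above hinge on recognizing a format from its leading bytes. The sketch below shows the general magic-byte technique for four well-known formats. The signatures are the standard public ones (DOS `MZ` header, ZIP local file header, OLE2 compound file, PDF header); the `Format` enum and `sniff` function are hypothetical and say nothing about how mpengine.dll orders or implements its own checks.

```rust
/// Formats this sketch can recognize (a small, illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Pe,
    Zip,
    Ole2,
    Pdf,
    Unknown,
}

/// Classify a buffer by its leading magic bytes.
pub fn sniff(data: &[u8]) -> Format {
    // OLE2 compound file signature (legacy Office documents).
    const OLE2: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];

    if data.starts_with(b"MZ") {
        // Only the DOS header is checked here; a stricter test would also
        // follow e_lfanew and verify the "PE\0\0" signature.
        Format::Pe
    } else if data.starts_with(b"PK\x03\x04") {
        Format::Zip
    } else if data.starts_with(&OLE2) {
        Format::Ole2
    } else if data.starts_with(b"%PDF-") {
        Format::Pdf
    } else {
        Format::Unknown
    }
}
```

A result of `Format::Pe` would correspond to the `is_pe(data)` branch, while `Zip`, `Ole2`, and `Pdf` would select a format-specific extractor. Modern Office documents are ZIP archives, so they would be classified as `Zip` here and identified further after extraction.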

Data Flow Between Stages

What passes between pipeline stages

ScanContext (per-file)

struct ScanContext<'a> {
  // Borrowed input; the lifetime ties the context to the scanned buffer.
  file_data:    &'a [u8],
  file_path:    Option<&'a str>,
  file_size:    u64,
  content_type: ContentType,
  scan_depth:   u32,
  // Accumulated across all stages:
  attributes:   HashSet<String>,
  threat_list:  Vec<ThreatRecord>,
  // PE-specific:
  pe_info:      Option<PeMetadata>,
  emulator_ctx: Option<EmuContext>,
  scan_flags:   u32,
}

Attribute Flow

Static Engines → deposit HSTR: / SIGATTR: attributes

↓

PE Emulation → deposit FOP: / TUNNEL: / THREAD: attributes

↓

Lua Scripts → deposit custom attributes via mp.setattribute()

↓

AAGGREGATOR evaluates boolean expressions over all collected attributes
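The final step of the attribute flow, evaluating boolean expressions over the accumulated attribute set, can be illustrated with a small expression tree. This is a sketch under stated assumptions: the source says only that the attributes live in a `HashSet<String>` and that AAGGREGATOR evaluates boolean expressions over them. The `Expr` type, its operators, and the attribute names used in the usage note are hypothetical and do not reflect the real rule encoding.

```rust
use std::collections::HashSet;

/// Hypothetical boolean expression over attribute names.
pub enum Expr {
    /// True when the named attribute has been deposited.
    Attr(String),
    Not(Box<Expr>),
    /// True when every sub-expression is true.
    And(Vec<Expr>),
    /// True when at least one sub-expression is true.
    Or(Vec<Expr>),
}

/// Evaluate an expression against the attributes collected so far.
pub fn eval(expr: &Expr, attrs: &HashSet<String>) -> bool {
    match expr {
        Expr::Attr(name) => attrs.contains(name),
        Expr::Not(inner) => !eval(inner, attrs),
        Expr::And(parts) => parts.iter().all(|p| eval(p, attrs)),
        Expr::Or(parts) => parts.iter().any(|p| eval(p, attrs)),
    }
}
```

For example, a rule of the form `HSTR:Example.A AND (TUNNEL:Example.C OR FOP:Example.B)` would be built as `Expr::And(vec![...])` and would fire once a static engine had deposited the first attribute and emulation had deposited either of the other two. Because evaluation only reads the set, the same rules can be rechecked cheaply whenever a later stage adds attributes.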

Pipeline By The Numbers

0Threats Defined

0MD5 Hashes

0PEHSTR Rules

0KCRCE Entries

59K Lua Scripts

0FOP Rules

0WinAPI Handlers

0Virtual DLLs

0Deob Transforms

70+ Container Formats

33,428 SIG_TREE ML Trees

9.3M TLV Entries

14.3 MB Binary Size

All data from reverse engineering mpengine.dll v1.1.24120.x

Reverse engineering of mpengine.dll — Windows Defender scan pipeline internals