ThinkPad firmware analysis · open toolchain · long-form writeup

ThinkPads From the Inside: A Reproducible Path
From Archived BIOS to Named SoC Pads

A reverse-engineering writeup on turning Lenovo's published BIOS archive into something structured enough to drive coreboot ports, Hackintosh skeletons, GPIO security audits, and CVE-level firmware diffs — without owning the hardware, and without folklore.

Estimated read: ~35 minutes · 18 sections · covers Intel, AMD and (stub) Qualcomm

00 At a glance

Where the project stands before you read the long version.

11 / 11

ThinkPad BIOSes through the pipeline

22 / 22

Intel pipeline steps

19 / 19

AMD pipeline steps

9 / 9

BIOS-payload filter

SoC vendors supported

CVEs mapped

Lenovo advisories

Models in CVE→model index

100%

8051 + RL78 EC opcode coverage

0 / 8

OEM BIOSes w/ embedded SPD

FSP-S/M/T blobs carved from CNL+TGL

01 Prologue: why archived ThinkPad firmware is a treasure trove

The peculiar fact that a manufacturer hands you, for free, the exact bits that drive every machine it has ever sold.

The ThinkPad is one of the few mass-market PC product lines that has accumulated a parallel open-source culture around it: long-running coreboot ports for a handful of Sandy Bridge and Ivy Bridge classics, Linux quirks tables for every model since the T20, Hackintosh ports for the GoBook generation, a quietly enormous body of forum knowledge about exactly which kernel option calms which Lenovo BIOS bug. The reasons are not mysterious — ThinkPads ship with field-replaceable parts, decent keyboards, and BIOSes that the firmware team actually maintains for years — but the consequence is that a lot of independent engineering value lives downstream of Lenovo's "Drivers & Software" pages.

That archive is the part most people forget about. Lenovo Support publishes every BIOS update for every supported model, every driver bundle, every flash utility, organized by machine type and in chronological order. For a model still under support you get the complete historical chain (every microcode bump, every Intel ME revision, every SMM hardening, every CVE fix), downloadable as a string of .exe packages. For models that have rolled out of support the binaries remain mirrored and discoverable. The same site indexes tens of thousands of binaries that, taken together, are a near-comprehensive record of how Lenovo built each board.

Inside each BIOS package is the part that matters: the device tree the OS sees (ACPI), the firmware volumes (PEI, DXE, SMM), the embedded controller firmware, the Intel FSP remnants, the microcode files, the VBT, the flash descriptor on some models. Everything a coreboot porter needs to bring a new board up; everything a security auditor needs to map a CVE to a binary fix; everything a Hackintosh ports project needs to drive a sensible config.plist. The bits are public. They have been public the whole time.

What is not public is a reproducible pipeline that says: "give me a model number and a BIOS version, and you get back structured JSON describing every device, every GPIO it touches by name, every firmware-internal change since the previous revision, every CVE the changelog mentions, every privacy-sensitive GPIO and its lock posture, every SMI-routable input, every UEFI module that grew or shrank between the last two updates, every signal of where a fix landed in the binary." That is the gap this project sets out to close.

The output of the pipeline is not a beauty-contest reverse engineering of one board. It is a coverage statement: across the archive, the toolchain produces structured data on every model whose firmware can be carved — and as the archive grows, the model count grows with it. Bug-for-bug, the per-model output may still need a human pass before it can drive a real port; but the cost of preparing that human pass drops by an order of magnitude when you start from a clean auto-generated baseline.

What this writeup is not. It is not a beginner's tour of UEFI internals (there are excellent ones — Beyond BIOS from Intel Press, the EDK2 docs, and the Phrack 66 PixieDust series cover the basics far better). It is a focused field report on the specific problems you hit when you try to do this at fleet scale, what worked, what surprised, and what the firmware-only path cannot answer no matter how clever you get.

02 Background: three primers in twelve paragraphs

ACPI / DSDT, Intel's GPIO architecture, and the GpioLib indirection — just enough to read the rest.

A. ACPI, DSDT, SSDTs

The Differentiated System Description Table (DSDT) is the central ACPI table. Firmware compiles it from ASL (ACPI Source Language) into AML (ACPI Machine Language) at build time, and the OS interprets the AML at runtime to discover devices, power states, GPIO connections, EC commands, and thermal zones. Every ACPI-capable OS (Linux, Windows, macOS, the BSDs) reads the same DSDT. The DSDT is therefore the most reliable single description of what hardware is present on the board, more reliable than PCI enumeration (because it includes non-PCI devices like the EC, GPIO-attached buttons, and sensors), and more reliable than the SMBIOS/DMI tables (which only carry strings).

Secondary System Description Tables (SSDTs) compile separately and load at runtime to extend or override pieces of the DSDT. Boards with multiple variants typically ship a generic DSDT plus per-variant SSDTs that patch in the right chunks; AcpiPlatform decides which SSDTs to load based on hardware identifiers it reads at boot. A modern ThinkPad commonly ships ten SSDTs alongside the DSDT.

Every device with a hardware GPIO connection has a _CRS (Current Resource Settings) method that returns a serialized list of resource descriptors. For GPIO that descriptor is either a GpioIo (the device can drive or read the pin) or a GpioInt (the pin is wired to the device as an interrupt source). Each carries a controller path (e.g. \_SB.PCI0.GPI0), a pin number, an edge/level/polarity selector, and a pull configuration.

On AMD and Qualcomm ThinkPads, the pin number in GpioIo/GpioInt is a direct integer that maps straight onto an AGPIO or GPIO pad. On Intel it is often a method call — and that is where the trouble starts.

B. The Intel GPIO architecture

Each Intel PCH (from Skylake / Sunrise Point onwards) carries one or more GPIO controllers (community controllers), each owning a number of groups (banks of about 24 pins each). Pins inside a group are referred to as pads: GPP_A22 means "General Purpose Programmable, group A, pad 22". Each pad has two 32-bit configuration registers stored in fixed PCH-memory-mapped space: PADCFG_DW0 and PADCFG_DW1.

PADCFG_DW0 (per pad, 32 bits)
  31      PADRSTCFG[1]    (pad reset config, hi bit)
  30      RXEVCFG[1]      (edge / level select, hi bit)
  29      RXRAW1          (force raw 1 on RX)
  28      RXEVCFG[0]
  27      PREGFRXSEL      (route raw into glitch filter)
  26      RXINV           (invert RX)
  25      GPIROUTIOXAPIC  (route to IOxAPIC)
  24      GPIROUTSCI      (route to SCI)
  23      GPIROUTSMI      (route to SMI)
  22      GPIROUTNMI      (route to NMI)
  21..20  PMODE           (pad mode: 0 = GPIO, 1..3 = native funcs)
  19..18  RXTXENCFG
  17      RXDIS           (RX disable)
  16      TXDIS           (TX disable)
  15..8   reserved
  7..1    reserved
  0       GPIORXSTATE / GPIOTXSTATE

PADCFG_DW1
  31..16  TERM            (termination / pull strength)
  ...
  PADCFG[lock] is held in a separate per-group LOCK register, NOT in DW0/DW1.

The decode of every field above comes from Intel 100-Series PCH Datasheet Volume 2 (document 332691), and from the per-generation datasheets that follow it. Pinning the decode to the official document matters: intelp2m in coreboot is downstream and occasionally lags the register layout, and the security report needs an authoritative source for what "locked" means before it can classify a pad as genuinely safe or attacker-controllable.
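A per-field decode of PADCFG_DW0 can be sketched in a few lines. The bit positions below are copied from the listing in this section and should be checked against the datasheet for the PCH generation at hand; the names DW0_FIELDS and decode_dw0 are illustrative, not the toolkit's own.

```python
# Field positions as given in the PADCFG_DW0 listing above:
# name -> (least significant bit, width in bits).
DW0_FIELDS = {
    "GPIROUTIOXAPIC": (25, 1),
    "GPIROUTSCI": (24, 1),
    "GPIROUTSMI": (23, 1),
    "GPIROUTNMI": (22, 1),
    "PMODE": (20, 2),
    "RXDIS": (17, 1),
    "TXDIS": (16, 1),
}


def decode_dw0(dw0, fields=DW0_FIELDS):
    """Split a 32-bit PADCFG_DW0 value into named integer fields."""
    if not 0 <= dw0 <= 0xFFFFFFFF:
        raise ValueError("PADCFG_DW0 must fit in 32 bits")
    return {
        name: (dw0 >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in fields.items()
    }


# Illustrative value built from the table, not read from a real image:
# SMI routing set, pad mode 1, RX and TX enabled.
example = (1 << 23) | (1 << 20)
decoded = decode_dw0(example)
```

Keeping the layout in a data table rather than in code makes it straightforward to swap in a different per-generation layout without touching the decoder.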

C. The GpioLib indirection problem

Inside the DSDT, an Intel ThinkPad does not bake the pad name into the _CRS. It writes something like:

// DSDT excerpt (lightly redacted for readability)
Device (FPNT) {
  Name (_HID, "VFS5011")
  Method (_CRS, 0, NotSerialized) {
    Return (ResourceTemplate () {
      SpiSerialBusV2 (..., 0x00000000, ...) { ... }
      GpioInt (Level, ActiveLow, Shared, PullDefault, 0,
               "\\_SB.PCI0.GPI0", 0, ResourceConsumer)
        { GNUM(GFPI) }
    })
  }
}

GNUM is a method elsewhere in the DSDT that takes a GNVS field name (here GFPI, the fingerprint-interrupt field) and returns an integer. GNVS is the Global NVS region — an ACPI-NVS memory range whose layout is declared by the firmware and whose values are populated at boot. It is RAM, not flash; nothing in the firmware image carries the runtime contents of GNVS directly.

The values written into GNVS come from a UEFI driver called AcpiPlatform that runs in DXE. AcpiPlatform reads the board identity from hardware (typically EC straps or an EEPROM), picks the right pad set for the current board variant, and stores each pad as an Intel GpioLib GPIO_PAD immediate at the GNVS offset corresponding to its field name. The encoded form of that immediate is:

GPIO_PAD = 0xCC_GG_NNNN
  CC      chipset id (per-SoC, e.g. 0x01 = SPT-LP, 0x07 = CNL-LP, 0x0B = TGL-LP)
  GG      group / bank (A = 0x00, B = 0x01, … per-SoC mapping)
  NNNN    pad index within the group (0..N-1)

So the static information needed to resolve a single device to a named pad is distributed across three pieces of the firmware: the DSDT (which GNVS field feeds which device), AcpiPlatform (which GPIO_PAD immediate gets written into that field), and PlatformInit's PADCFG table (which mode, direction and lock state apply to that pad). Recovering the answer means joining all three.

03 The Lenovo BIOS corpus, in shape

What is actually on those Lenovo download pages, organized for the resolver.

Every supported ThinkPad model on Lenovo's support site has a "Drivers & Software" page with a downloads tree organized by category (BIOS, Audio, Chipset, LAN, etc.). A BIOS update is distributed as an .exe file — an InnoSetup-packaged WinFlash installer of roughly 6–12 MB, occasionally larger when carrying microcode or ME firmware updates. The same site indexes the installer under a stable URL of the form:

by_mt/<MachineType>/drivers/<DocId>/files/<id>w.exe

The MachineType is the 4-character Lenovo MT (e.g. 20Q5); the DocId is the Lenovo support article id; the id encodes the BIOS revision (e.g. r0buj26ww = T-series, build 26ww). The naming convention is stable enough that an aggressive scraper can mirror every BIOS package for every published model with a few thousand HTTP requests, and a surprising amount of metadata (release date, supported OS, change summary) sits on the article pages adjacent to each download.

Inside a BIOS .exe the InnoSetup payload contains:

The Windows flash utility (WinFlash.exe or a vendor variant).
The firmware image, usually as one of .fl1, .fl2, .cap (signed capsule), or rarely .rom. The .fl1 is the raw UEFI flash image — what the SPI chip would hold (sans descriptor on some models).
A readme (readme.txt or similar) with the change-log: BIOS, EC, ME version bumps, plus a Security Updates block listing the CVEs and Lenovo advisories that the new version addresses.
One or more .PAT files containing microcode updates, named by CPUID (e.g. BDFA8000.PAT).
Occasionally a logo image, a license blob, and the flasher's own DLLs.

Why driver bundles are the first problem

Lenovo ships driver bundles — audio, GPU, wifi, fingerprint — using the same <id>w.exe naming convention as BIOS updates. They are not BIOS updates; they don't carry an .fl1. A naive coverage harness that tries to acpi_extract every <id>w.exe in a model's download tree will fail noisily on every driver bundle, which is most of them.

The cheap fix is to look at the payload first, without extracting it. Each .exe can be probed with innoextract --list (or with 7-zip in list mode for non-InnoSetup wrappers), and the listing alone tells you whether a .FL1/.FL2/.CAP file is present. Packages that pass the check are extracted normally; packages that fail are classified as a non-BIOS payload and reported separately, so that coverage numbers reflect what was actually attempted.
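The listing-based check described above can be sketched as follows. The classification works on the text of the listing, so it does not depend on the exact line format a lister prints; the helper names and the sample file names are hypothetical, and list_package simply shells out to innoextract --list as the text describes.

```python
import re
import subprocess

# A firmware image suffix followed by a quote, whitespace, or end of text.
FIRMWARE_RE = re.compile(r'\.(fl1|fl2|cap)(?=["\s]|$)', re.IGNORECASE)


def classify_listing(listing_text):
    """Return 'bios' if the listing names a .FL1/.FL2/.CAP file, else 'non-bios'."""
    return "bios" if FIRMWARE_RE.search(listing_text) else "non-bios"


def list_package(exe_path):
    """Return the payload listing of an installer without extracting it."""
    result = subprocess.run(
        ["innoextract", "--list", exe_path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Hypothetical listings, for illustration only.
bios_listing = ' - "app\\WinFlash.exe" (1 MiB)\n - "app\\IMAGE.FL1" (8 MiB)\n'
driver_listing = ' - "app\\setup.exe"\n - "app\\driver.capture.dll"\n'
```

Separating the pure classifier from the subprocess call keeps the coverage harness testable on saved listings.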

Sidenote on file extension heuristics. The naive heuristic "a BIOS payload is whatever is bigger than 4 MB" works most of the time and fails in three cases worth knowing about: ME firmware updates (look like a BIOS but the .fl1 is much smaller), GPU VBIOS hotfixes (occasionally large enough to trip a size-only test), and combo packages that wrap a BIOS + driver bundle in one installer. The listing-based check handles all three correctly because it looks at the actual payload structure.

04 Extraction: from `.exe` to ACPI tables

Each step is reversible and well-understood; the trouble is in handling all of them at once.

Once a BIOS-class package has been identified, the work is to walk it down to ACPI tables. The path looks like this:

<id>w.exe                                        (InnoSetup WinFlash installer)
   |
   |  innoextract                                (Inno format, falls back to 7z then binwalk)
   v
WinFlash.exe + image.fl1 + readme.txt + *.PAT    (extracted payload)
   |
   |  carve at first _FVH                        (skips installer header; lands at firmware-volume start)
   v
raw UEFI flash image, FFSv1/v2/v3 fileset
   |
   |  uefiextract                                (primary; UEFITool CLI, LongSoft/UEFITool)
   |  uefi-firmware-parser                       (fallback for cleaner ACPI carve on known-good Intel images)
   |  nested-FFSv2 carve                         (fallback for AMD/Phoenix wrapper-FV layouts)
   v
file tree by GUID, including AcpiTableStorage FFS (GUID 7E374E25-...)
   |
   |  parse FFS sections; pull out raw ACPI tables; dedup by SHA-256
   v
*.aml                                            (DSDT, SSDT1..N, FACP, RSDP, ...)
   |
   |  iasl -d                                    (ACPICA disassembler)
   v
*.dsl                                            (human-readable ASL, the analysis-ready form)

No part of that path is novel in isolation. The interesting work is in the cases where the path breaks — and in the diversity of those cases across the archive.
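As one concrete instance, the "carve at first _FVH" step can be sketched like this. The `_FVH` signature sits 0x28 bytes into the firmware volume header (after the 16-byte zero vector, the 16-byte file-system GUID and the 8-byte FvLength), so the volume starts that far before the first match. Function names are illustrative, and a real carver would also validate the header checksum and length.

```python
FVH_SIGNATURE = b"_FVH"
# ZeroVector (16) + FileSystemGuid (16) + FvLength (8) precede the signature.
FVH_SIGNATURE_OFFSET = 0x28


def first_fv_offset(blob):
    """Return the offset of the first firmware volume header, or None."""
    pos = blob.find(FVH_SIGNATURE)
    while pos != -1:
        if pos >= FVH_SIGNATURE_OFFSET:
            return pos - FVH_SIGNATURE_OFFSET
        # Too close to the start to be a real header; keep looking.
        pos = blob.find(FVH_SIGNATURE, pos + 1)
    return None


def carve_flash_image(blob):
    """Drop everything before the first firmware volume."""
    start = first_fv_offset(blob)
    if start is None:
        raise ValueError("no _FVH signature found")
    return blob[start:]


# Synthetic input: 100 bytes of installer padding, then a fake FV header.
sample = b"\xAA" * 100 + b"\x00" * FVH_SIGNATURE_OFFSET + b"_FVH" + b"body"
```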

Why `uefiextract` is the primary FV extractor

The Lenovo archive spans Phoenix, AMI and Insyde flavored BIOSes, sometimes with FFSv1, FFSv2 and FFSv3 files within the same image; with LZMA, Tiano-EFI compression, LZMA-F86 variants; with Phoenix's wrapper-FV layout where the inner FVs are themselves compressed inside an outer FV. UEFITool's uefiextract is the union of all of those handlers in one tool. Promoting it from "fallback when uefi-firmware-parser fails" to primary turned a 7-of-12 extraction rate on the diverse sample into 12 of 12, with uefi-firmware-parser kept as a fallback (it sometimes produces a cleaner ACPI carve on known-good Intel images), and a nested-FFSv2 carve handling the AMD / Phoenix wrapper-FV layout as a last resort.

The AcpiTableStorage FFS

EDK2-derived firmware keeps its ACPI table store in an FFS file identified by a well-known GUID: EFI_ACPI_TABLE_STORAGE_FILE_GUID = 7E374E25-8E01-4FEE-87F2-390C23C606CD. A BIOS that follows the convention carries the compiled DSDT, SSDTs and supporting tables inside a single FFS file of that GUID, with each table as a raw section. Dedup-by-hash is necessary because some BIOSes carry a duplicate copy of the AcpiTableStorage FFS as a fallback (for example, when a recovery FV is allowed to override the main one).
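The dedup step is small enough to show in full. This is a minimal sketch that assumes the raw tables have already been pulled out of the FFS sections as byte strings; the function names are illustrative.

```python
import hashlib


def dedup_tables(tables):
    """Keep the first copy of each distinct table, preserving order."""
    seen = set()
    unique = []
    for raw in tables:
        digest = hashlib.sha256(raw).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(raw)
    return unique


def table_signature(raw):
    """The first four bytes of an ACPI table are its ASCII signature."""
    return raw[:4].decode("ascii", errors="replace")


# Synthetic stand-ins for carved tables (real ones carry a full ACPI header).
carved = [b"DSDT-main", b"SSDT-one", b"DSDT-main"]
unique_tables = dedup_tables(carved)
```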

The output of this step is a directory of .aml files plus the ASL decompilation; nothing else in the pipeline needs to touch the raw firmware image, and downstream tools work entirely off the ASL plus the per-FV file tree that uefiextract left behind.

05 The Intel GPIO resolution problem

Why the DSDT, on its own, is necessary but not sufficient on Intel.

On AMD ThinkPads (and on the small number of Qualcomm-based ones) the DSDT is the whole answer. A GpioInt resource for the fingerprint reader carries the AGPIO pin number directly:

// AMD-style: pin number is a literal
GpioInt (Edge, ActiveLow, Exclusive, PullUp, ...,
         "\\_SB.GPIO", 0, ResourceConsumer) { 0x002B }   // AGPIO 43

On Intel ThinkPads, the same descriptor passes through GNUM(FIELDNAME):

// Intel-style: pin number is a method call
GpioInt (Level, ActiveLow, Shared, PullDefault, ...,
         "\\_SB.PCI0.GPI0", 0, ResourceConsumer) { GNUM(GFPI) }

GNUM reads a field out of GNVS (the ACPI-NVS region) and decodes it into a pin number relative to the named controller. The decoded value is an Intel GpioLib GPIO_PAD immediate that AcpiPlatform wrote at boot. Until you find that write, the DSDT alone tells you only that the fingerprint interrupt is "the pin GNVS field GFPI says it is" — which is not a pin number.

The static information required to recover the answer lives in three places:

The DSDT tells you which device's _CRS references which GNVS field. From this alone you get the (field, device) join — e.g. field GFPI → device FPNT.
AcpiPlatform, the DXE driver, contains the runtime writes that populate GNVS. Statically extracting those writes from the PE32+ image gives you a list of (field, GPIO_PAD immediate). The same field is sometimes written multiple times under different conditional branches (board-variant switches), which is the hard half of the problem.
PlatformInit's PADCFG table tells you, for each pad on the board, the configured mode, direction, pull, and lock state. Once you have a candidate pad for a device, you cross-check PADCFG to confirm the pad is configured for the device's role — for example, an SPI-CS pin must be in native function 1 with TX enabled.

Joining the three sources is mechanically straightforward but practically fiddly. AcpiPlatform is compiled x86 (or x86-64); the writes to GNVS are typically mov [GnvsBase + offset], imm32 sequences, sometimes preceded by a conditional that selects between multiple imm32 candidates. gpio_resolve.py walks the disassembly, collects every candidate immediate per field, and emits them with their guard conditions. The PADCFG cross-check then resolves the candidate set to a single pad per device.

For board-invariant pins — a fingerprint sensor wired the same way across every variant of a model — the candidate set is a single immediate and the resolution is exact. For board-variant pins — a GNSS or BT antenna routed differently depending on factory-installed radio — the candidate set carries several, and only the live board can tell you which one is yours. The toolkit emits the per-variant table directly so that follow-up work (or a collect_gpio.sh capture on hardware) can resolve it.

06 A walkthrough: fingerprint reader on ThinkPad 13 Skylake

End-to-end resolution on a real BIOS, from the carved DSDT to the named GPP pad.

The ThinkPad 13 (1st gen, Skylake-U) is a useful walkthrough target: it is single-SoC (Sunrise Point-LP, INT344B), its DSDT is small enough to read end-to-end, and its fingerprint reader is board-invariant. The BIOS used here is r0buj26ww, the November 2022 update; the input was an 8.1 MB .exe downloaded from Lenovo Support.

Step 1: identify the device in the DSDT

Carve ACPI tables, decompile, grep for the fingerprint device. The relevant fragment looks like:

Device (FPNT) {
  Name (_HID, "VFS5011")
  Method (_STA, 0, NotSerialized) { Return (0x0F) }
  Method (_CRS, 0, NotSerialized) {
    Return (ResourceTemplate () {
      SpiSerialBusV2 (0x0001, PolarityLow, FourWireMode, 8,
                     ControllerInitiated, 0x007A1200, ClockPolarityLow,
                     ClockPhaseFirst, "\\_SB.PCI0.SPI1",
                     0, ResourceConsumer, , Exclusive, )
      GpioInt (Level, ActiveLow, Shared, PullDefault, 0,
               "\\_SB.PCI0.GPI0", 0, ResourceConsumer, , )
        { GNUM(GFPI) }
    })
  }
}

Two GPIOs touch this device: an SPI chip-select that is implicit in the SpiSerialBusV2 resource (pin owned by the SPI controller, not declared here), and the interrupt line that arrives as GpioInt with pin number GNUM(GFPI). The chip-select pin is declared elsewhere; the interrupt pin's identity depends on the GNVS field GFPI.
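The (field, device) join from Step 1 can be approximated with a line scan over the decompiled ASL. This is a rough sketch: it attributes each GNUM reference to the most recently opened Device rather than tracking brace scope, which is an assumption that holds for simple fragments like the one above but not for arbitrary nesting. The function name is hypothetical.

```python
import re

DEVICE_RE = re.compile(r"\bDevice\s*\(\s*(\w{1,4})\s*\)")
GNUM_RE = re.compile(r"\bGNUM\s*\(\s*(\w{4})\s*\)")


def field_device_pairs(asl_text):
    """Return (GNVS field, device name) pairs found in decompiled ASL."""
    pairs = []
    current_device = None
    for line in asl_text.splitlines():
        device = DEVICE_RE.search(line)
        if device:
            current_device = device.group(1)
        for field in GNUM_RE.findall(line):
            pairs.append((field, current_device))
    return pairs


fragment = """
Device (FPNT) {
  Name (_HID, "VFS5011")
  Method (_CRS, 0, NotSerialized) {
    Return (ResourceTemplate () {
      GpioInt (Level, ActiveLow, Shared, PullDefault, 0,
               "\\\\_SB.PCI0.GPI0", 0, ResourceConsumer, , )
        { GNUM(GFPI) }
    })
  }
}
"""
pairs = field_device_pairs(fragment)
```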

Step 2: find the `AcpiPlatform` writes to GNVS

AcpiPlatform.efi sits in the firmware volume tree under its own GUID. The disassembly contains, near the end of InstallAcpiPlatform, a sequence of stores to the GNVS region:

; ... earlier code computes GnvsBase into rdi
mov     dword [rdi + 0x10A], 0x01000016   ; GFPI = SPT-LP, group A, pad 0x16 (22)
mov     dword [rdi + 0x10E], 0x01000017   ; GFPS = SPT-LP, group A, pad 0x17 (23, same group)
mov     dword [rdi + 0x112], 0x01000040   ; GPLI = SPT-LP, group A, pad 0x40 (64)
...

The offsets 0x10A, 0x10E, 0x112 correspond to the GNVS field declarations the DSDT emits in its OperationRegion (GNVS, ...) Field blocks. Joining offsets to field names yields:

GFPI -> 0x01000016
GFPS -> 0x01000017
GPLI -> 0x01000040

Step 3: decode the immediate

Decoding 0x01000016 as CC=0x01, GG=0x00, NNNN=0x0016 gives chipset SPT-LP, group A, pad index 22 — or GPP_A22 in coreboot nomenclature. The interrupt sibling, GFPS = 0x01000017, is GPP_A23.
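The same decode, written out. This sketch assumes the 0xCC_GG_NNNN layout given earlier and a contiguous alphabetical group mapping (A = 0x00, B = 0x01, and so on); the real mapping is per-SoC, so the letter lookup is a simplifying assumption, and the function names are illustrative.

```python
def decode_gpio_pad(value):
    """Split a GPIO_PAD immediate into (chipset id, group, pad index)."""
    chipset_id = (value >> 24) & 0xFF
    group = (value >> 16) & 0xFF
    pad = value & 0xFFFF
    return chipset_id, group, pad


def pad_name(value):
    """Render a GPIO_PAD immediate as a GPP_<bank><n> name."""
    _, group, pad = decode_gpio_pad(value)
    if group > 25:
        raise ValueError(f"group 0x{group:02X} has no letter in this sketch")
    return f"GPP_{chr(ord('A') + group)}{pad}"


gfpi = pad_name(0x01000016)
gfps = pad_name(0x01000017)
```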

Step 4: PADCFG cross-check

PlatformInit's PADCFG table is a sequence of 12-byte triples, { GpioPad, PADCFG_DW0, PADCFG_DW1 }. Pulling out the entry for GPP_A22:

GPP_A22:
  PADCFG_DW0 = 0x44000300
  PADCFG_DW1 = 0x00003000
  Decoded:
    PMODE   = 1  (native function 1: SPI1_CS#)
    RXDIS   = 0  TXDIS = 0
    GPIROUT* = none
    TERM    = NoPullPad
    LOCKED  = no (held in group LOCK register; checked separately)

The native function is SPI1_CS, which matches the device's role as the chip-select of an SPI fingerprint reader; the role check passes. The interrupt sibling GPP_A23 is configured as a GPIO input with IOxAPIC routing, which matches its role as the fingerprint interrupt line.
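Reading the table itself is a fixed-stride unpack. The sketch below assumes the 12-byte { GpioPad, PADCFG_DW0, PADCFG_DW1 } layout described in this step, stored little-endian, and that the table bytes have already been located inside PlatformInit (which is the harder part and is not shown). Names are illustrative.

```python
import struct

# GpioPad, PADCFG_DW0, PADCFG_DW1 as three little-endian 32-bit words.
ENTRY = struct.Struct("<III")


def parse_padcfg_table(raw):
    """Return {GpioPad: (PADCFG_DW0, PADCFG_DW1)} from a packed table."""
    if len(raw) % ENTRY.size:
        raise ValueError("table length is not a multiple of 12 bytes")
    return {pad: (dw0, dw1) for pad, dw0, dw1 in ENTRY.iter_unpack(raw)}


# Two entries packed by hand; the first carries the values quoted above.
raw_table = ENTRY.pack(0x01000016, 0x44000300, 0x00003000) + ENTRY.pack(
    0x01000017, 0x00000000, 0x00000000
)
table = parse_padcfg_table(raw_table)
```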

ThinkPad 13 Skylake: fingerprint reader resolves to GPP_A22 (CS) / GPP_A23 (INT)

Resolution exact (board-invariant, no candidate ambiguity); PADCFG mode and direction consistent with role; cross-verified against the DSDT SpiSerialBusV2+GpioInt pair.

sample: r0buj26ww.exe · chipset: SPT-LP (INT344B) · tool chain: acpi_extract → gpio_resolve → gpio_padmap

07 PADCFG decode: not coreboot folklore

Why the toolkit reads Intel 332691 rather than chasing community pad-config tables.

intelp2m (in coreboot) is the most popular community tool for decoding PADCFG into PAD_CFG_* macros. It is good, well-maintained, and sufficient for most coreboot work. It is also downstream — lagging the Intel datasheets by some number of months whenever a new PCH generation ships, and occasionally simplifying the decode (collapsing rare flag combinations to the closest-fit macro). For a coreboot port that is fine. For a security report that needs to be precise about what "locked" means and what the interrupt routing actually targets, it is not.

The toolkit decodes PADCFG directly from Intel 100-Series PCH Datasheet Volume 2 (332691, the Volume 2 that covers the GPIO controller architecture), with per-SoC supplements for Cannon Lake (332687), Tiger Lake (633331), Alder Lake (645549), and the Wildcat/Sunrise Point/Kaby Lake siblings where they diverge. The decode is structured per-field rather than pattern-matched-to-macros, so when the security check needs to ask "is bit 23 set on this PADCFG, and is the group LOCK register for group A also clear?" it can do that directly.

The fields that matter for security analysis

Field	Bits	What it tells you
`GPIROUTSMI`	DW0[23]	If set, this pin can trigger an SMI when its edge condition fires. Combined with RXEVCFG, this is the SMI attack surface for the pad.
`GPIROUTNMI`	DW0[22]	NMI routing. Less common; when present, often used for chassis-intrusion or watchdog inputs.
`GPIROUTSCI`	DW0[24]	SCI (System Control Interrupt) routing. Used for wake events, lid switches, hot-plug detection.
`GPIROUTIOXAPIC`	DW0[25]	IOxAPIC routing: this pin shows up as a normal device interrupt to the OS.
`PMODE`	DW0[21:20]	0 = GPIO, 1..3 = native functions. SPI, I2C, UART pins live here at native modes.
`RXDIS` / `TXDIS`	DW0[17:16]	Input/output disable. An RXDIS=1 pad cannot be sampled by software no matter the mode.
Group LOCK / LOCKTX	per-group register	Once set, prevents further writes to PADCFG (or its TX state) until reset.

The combination of GPIROUTSMI=1 with the group LOCK clear is the canonical "real attack surface" signal: the pin will fire an SMI on its configured edge, and the firmware did not lock the configuration, so privileged software (or a malicious DXE driver in a future boot) could rewrite the edge condition and re-route it. The GPIO security report sorts pads by exactly this combination, with privacy device GPIOs (camera kill, mic mute, fingerprint power, TPM provisioning) called out separately because they have their own spoofability story regardless of SMI routing.
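The sorting rule can be expressed over already-decoded pad records. In this sketch the record type, its field names and the sample pads are all hypothetical; it only illustrates the two lists the report is described as producing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PadRecord:
    name: str
    smi_routed: bool      # GPIROUTSMI bit from PADCFG_DW0
    group_locked: bool    # bit for this pad in the group LOCK register
    privacy: bool = False  # camera kill, mic mute, fingerprint power, ...


def split_report(pads):
    """Return (SMI-routed and unlocked pads, privacy pads), each sorted by name."""
    exposed = sorted(
        (p for p in pads if p.smi_routed and not p.group_locked),
        key=lambda p: p.name,
    )
    privacy = sorted((p for p in pads if p.privacy), key=lambda p: p.name)
    return exposed, privacy


# Made-up pads, for illustration only.
pads = [
    PadRecord("GPP_C5", smi_routed=True, group_locked=False),
    PadRecord("GPP_A3", smi_routed=True, group_locked=True),
    PadRecord("GPP_B7", smi_routed=False, group_locked=False, privacy=True),
    PadRecord("GPP_A9", smi_routed=True, group_locked=False, privacy=True),
]
exposed, privacy = split_report(pads)
```

A privacy pad that is also SMI-routed and unlocked appears in both lists, which matches the text's point that privacy pins are called out regardless of SMI routing.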

08 Native-mode elimination

Solving for board-variant pins without the live board, by ruling out everything else.

When AcpiPlatform writes multiple candidate immediates into the same GNVS field, the writes are guarded by a switch on a board identifier read at boot. The disassembly looks like:

cmp     al, 0x01           ; board_info->variant == 1?
jne     .v2
mov     dword [rdi + 0x118], 0x01000045   ; GBTI = GPP_A69 on variant 1
jmp     .end
.v2:
cmp     al, 0x02
jne     .v3
mov     dword [rdi + 0x118], 0x01000047   ; GBTI = GPP_A71 on variant 2
jmp     .end
.v3:
mov     dword [rdi + 0x118], 0x0102000C   ; GBTI = GPP_C12 on variant 3
.end:

Without the live board's variant byte, all three candidates remain. The static elimination trick is to look at the PADCFG table for each candidate and check which ones could possibly serve the device's role. If GPP_A71 is configured as native function 2 in the PADCFG table (let's say I2C2_SDA), it cannot also be the Bluetooth host-wake input that GBTI is feeding: it is reserved for I2C. Variant 2 is eliminated. The candidate set can only shrink: sometimes to one, sometimes no further than two or three, but it never grows beyond the union of all variants.

The same trick narrows board ID itself when the per-variant tables are disjoint: if exactly one variant's pad set is consistent with the PADCFG table carved from the image, that's the variant. gpio_resolve emits per-variant candidate sets and a confidence per variant, and the top-level report says "this device is GPP_A69 on board variant 1, GPP_C12 on variant 3, and undefined on variant 2". That is the firmware's contribution; the rest needs the live board.
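The elimination rule reduces to a filter over per-variant candidates. This sketch assumes the device's role requires plain GPIO mode (PMODE 0), as in the Bluetooth host-wake example, and keeps any candidate whose PADCFG entry is missing, since absence of evidence cannot eliminate a pad. The function name and the mode values are illustrative.

```python
def eliminate_by_mode(candidates, pad_modes, required_mode=0):
    """Drop variants whose candidate pad is configured for another function.

    candidates: {variant id: pad name}
    pad_modes:  {pad name: PMODE value from the PADCFG table}
    """
    surviving = {}
    for variant, pad in candidates.items():
        mode = pad_modes.get(pad)
        if mode is None or mode == required_mode:
            surviving[variant] = pad
    return surviving


# The GBTI example from this section; PMODE values are assumed.
gbti_candidates = {1: "GPP_A69", 2: "GPP_A71", 3: "GPP_C12"}
padcfg_modes = {"GPP_A69": 0, "GPP_A71": 2, "GPP_C12": 0}
remaining = eliminate_by_mode(gbti_candidates, padcfg_modes)
```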

The EC firmware reads board ID from hardware straps. That is why the EC firmware is not, by itself, enough to pin down the variant: the bytes that decide it never enter the image at all. They live in physical pull-up/pull-down resistors on specific GPIO straps, sampled at first power-on by the EC and stored in EC SRAM. Only a live machine knows.

09 Vendor matrix: Intel, AMD, Qualcomm

The same pipeline, three quite different GPIO models.

The vendor of the GPIO controller in the DSDT is the dispatch key. The toolkit's vendor.py looks at the _HID of the GPIO controller device and picks the resolution path; the rest of the pipeline runs identically regardless of vendor, except that the Intel-only resolver is skipped on AMD and Qualcomm images (their DSDTs already contain literal pad numbers).

Vendor	Controller `_HID`	GPIO model	Resolution path	Pad naming	Status
Intel	INT34xx / INT33FF / INT54xx	GNVS-indirected, board-variant gated	full pipeline: AcpiPlatform immediates + PADCFG native-mode elimination	GPP_<bank><n>	22/22 on SKL/KBL/CFL/WHL/CML; 21/22 on TGL until AcpiPlatform rename is identified
AMD	AMD0030 / AMDI003x	AGPIO index direct in DSDT (no GNVS, no switch)	`gpio_report.py` alone resolves it	AGPIO<n>	19/19 on A275 (Bristol Ridge)
Qualcomm	QCOM…	direct in DSDT (like AMD)	`gpio_report.py`	GPIO<n>	stub — validate on X13s sample
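The dispatch on controller _HID can be sketched as a small pattern table. The patterns below are read off the table above, with the "xx" and "x" wildcards assumed to be hexadecimal digits; the function name and return values are hypothetical, not the interface of vendor.py.

```python
import re

# Ordered (vendor, pattern) pairs; wildcards are assumed to be hex digits.
VENDOR_PATTERNS = [
    ("intel", re.compile(r"^INT(34[0-9A-F]{2}|33FF|54[0-9A-F]{2})$")),
    ("amd", re.compile(r"^(AMD0030|AMDI003[0-9A-F])$")),
    ("qualcomm", re.compile(r"^QCOM")),
]


def vendor_for_hid(hid):
    """Return the vendor key for a GPIO controller _HID, or None if unknown."""
    normalized = hid.strip().upper()
    for vendor, pattern in VENDOR_PATTERNS:
        if pattern.match(normalized):
            return vendor
    return None


def needs_gnvs_resolver(hid):
    """Only the Intel path goes through the GNVS-indirection resolver."""
    return vendor_for_hid(hid) == "intel"
```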

acpi_extract.py handles both the Intel layout (a single FV with a standard AcpiTableStorage FFS at the top level) and the AMD / Phoenix nested-FFSv2 layout (a wrapper FV holding compressed inner FVs, with the real AcpiTableStorage one level deeper). The nested case took several iterations to get right because Phoenix's compression markers vary across versions; the toolkit handles the three variants observed so far in the archive.

An optional AMD extension decodes the FCH (Fusion Controller Hub) GPIO control register per pin from the AMD Platform Programming Reference, which gives full pad-config documentation comparable to the Intel PADCFG decode. It is not needed for device→pad resolution (because the DSDT already carries the pin number on AMD), but it is needed for the security report's pad-lock analysis on AMD boards.

10 The Embedded Controller: an ITE 8051 with a custom SFR map

A second CPU, hidden in plain sight, with its own firmware and its own attack surface.

Almost every ThinkPad ships with an embedded controller (EC), a small 8-bit microcontroller on the LPC (or eSPI) bus that handles power button press, battery gas-gauging, fan PWM, charge control, lid switch, keyboard hotkeys, thermal sensor readout, and the keyboard backlight. The EC's firmware is independent of the BIOS proper and runs continuously from S5 onwards; the host CPU talks to it through a small region of LPC-mapped RAM (the EC-RAM) and a command/status register pair.

On ThinkPads, the EC is overwhelmingly an ITE part: an 8051-family MCU (MCS-51 instruction set), executing a vendor firmware image of roughly 64–128 KB. On the ThinkPad 13 Skylake the EC is an ITE 8051 v14.4, ~111 KB, reset vector at 0x0070, carved cleanly from a known region of the .fl1. AMD ThinkPads sometimes ship a Renesas RL78 part instead, with a different instruction set; the MCS-51 disassembler does not apply to those, and they go through the separate RL78 decoder covered at the end of this section.

What the EC firmware contains

Working from the disassembly, the EC firmware breaks down into:

A reset stub that initializes RAM, sets up the watchdog, and configures the LPC interface so the host can talk to the EC.
An EC-RAM polling loop. Each byte of the host-visible EC-RAM has a known semantic meaning (defined by the host DSDT's OperationRegion(ECRA, EmbeddedControl, 0, 0xFF) and the fields beneath it). The EC keeps these in sync with the underlying hardware: thermal sensor readings, fan tachometers, battery state-of-charge, lid switch state.
A command dispatcher. The host writes a byte to the EC command register; the EC interprets it and may take some action (set fan PWM, trigger a charge cycle, suspend the system). The command codes are mostly the standard ones from the ACPI EC spec, with a Lenovo-specific extension set on top.
_Qxx query handlers, one per "EC event" reported back to the host (lid open/close, AC plug, hotkey press). When the EC sets the SCI line, the host runs the matching _Qxx method from the DSDT, and that method usually reads back an event code byte from EC-RAM.
Strap reads. Specific SFR reads from the GPIO ports of the EC chip pull the board-variant pins discussed earlier. These are the bytes that decide which variant of the board this is, sampled once at first power and stored in EC RAM for the rest of the life of the system.

SFR-aware disassembly

The 8051 exposes its Special Function Registers in a separate address space: direct addresses 0x80–0xFF, distinct from any indirectly addressed upper internal RAM at the same addresses. The base SFR map is standardised; ITE adds its own custom SFRs on top, in roughly the same range, for the host-interface registers (EC-RAM, command/status, SMI control) and the GPIO ports. The toolkit's MCS-51 disassembler annotates both: "mov A, P3" is emitted as "mov A, P3 ; GPIO port 3", and "mov A, 0x9E" as "mov A, EC_CMD ; ITE host command reg". With those annotations, finding every host-interface site in the firmware is a grep instead of a multi-day reverse-engineering effort. Opcode coverage of the MCS-51 instruction set is 100%, with named annotations on most Lenovo/ITE SFR ranges.
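To make the annotation idea concrete, here is a deliberately tiny sketch that handles a single opcode, MOV A, direct (0xE5), and emits everything else as a data byte. The standard SFR addresses are the classic 8051 ones; the EC_CMD entry at 0x9E is taken from the example in the text. The table and function names are illustrative and cover only a handful of registers.

```python
# address -> (name, note). P0..P3 are standard 8051 SFRs; EC_CMD is the
# ITE-specific register named in the text.
SFR_NAMES = {
    0x80: ("P0", "GPIO port 0"),
    0x90: ("P1", "GPIO port 1"),
    0xA0: ("P2", "GPIO port 2"),
    0xB0: ("P3", "GPIO port 3"),
    0x9E: ("EC_CMD", "ITE host command reg"),
}


def operand(addr):
    """Render a direct address, using an SFR name when one is known."""
    if addr in SFR_NAMES:
        return SFR_NAMES[addr]
    return f"0x{addr:02X}", None


def decode(mem, pc):
    """Decode one instruction; returns (length, text)."""
    opcode = mem[pc]
    if opcode == 0xE5 and pc + 1 < len(mem):  # MOV A, direct
        name, note = operand(mem[pc + 1])
        text = f"mov A, {name}"
        if note:
            text += f" ; {note}"
        return 2, text
    return 1, f"DB 0x{opcode:02X}"  # anything not handled by this sketch
```

With output in this shape, `grep EC_CMD` over the disassembly lists every site that reads the host command register.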

Why the EC matters for the security report: the EC drives several of the "privacy" indicators (camera kill, mic mute) and is involved in firmware update flows (SMM → EC → SPI). A correct attack-surface report needs to account for the EC's role in mediating those signals; a sufficiently capable attacker that can address the EC has paths into the host that are not always closed by SMM hardening alone.

Renesas RL78 — the other EC family

A non-trivial subset of AMD ThinkPads ships a Renesas EC rather than an ITE one. Renesas RL78 is a 16-bit family with a variable-length instruction encoding (the successor to 78K0R), and its disassembly looks quite different from MCS-51 even though the role inside the board is similar. The toolkit carries a pure-Python RL78 decoder (renesas_rl78_disasm.py) aligned with the same decode(mem, pc) → (length, text) interface as the 8051 module, so downstream consumers can dispatch on EC family without changing call sites. Decoding follows the Renesas RL78 Family User's Manual: Software (R01US0015EJ). SFR annotation focuses on the bytes an EC analyst cares about most (port input / output (Pn), port mode (PMn), the ADC, the serial registers, and IRQ control), so strap reads and host-interface sites are findable by name. Coverage is 100% across the 256-byte primary opcode space plus the 0x31 / 0x61 / 0x71 prefix families: every byte resolves to a real instruction or an explicit RESERVED marker. Encodings the decoder cannot place are emitted as DB <hex> rather than guessed, which is the right tradeoff for analysis work.
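A sketch of what the shared decode(mem, pc) → (length, text) contract buys a consumer: one linear sweep that works for either EC family. The toy decoder below is a stand-in, not either real module; it knows only that byte 0x00 is NOP (true on both MCS-51 and RL78) and emits DB for everything else. The registry keys and function names are hypothetical.

```python
from typing import Callable, Dict, Iterator, Tuple

Decoder = Callable[[bytes, int], Tuple[int, str]]


def toy_decode(mem: bytes, pc: int) -> Tuple[int, str]:
    """Stand-in decoder: recognises NOP (0x00), emits DB for the rest."""
    if mem[pc] == 0x00:
        return 1, "nop"
    return 1, f"DB 0x{mem[pc]:02X}"


# In the real toolkit each family would map to its own decoder module.
DECODERS: Dict[str, Decoder] = {
    "ite8051": toy_decode,
    "rl78": toy_decode,
}


def sweep(family: str, mem: bytes, start: int = 0) -> Iterator[Tuple[int, str]]:
    """Linear sweep: the call site is identical for every EC family."""
    decode = DECODERS[family]
    pc = start
    while pc < len(mem):
        length, text = decode(mem, pc)
        yield pc, text
        pc += max(1, length)  # never stall on a zero-length decode
```

Because unknown bytes come back as one-byte DB lines, the sweep always makes progress and the listing stays byte-accurate.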

11Security: GPIO attack surface and spoofable privacy pins

Two distinct vulnerability classes, each derivable from the same per-pad PADCFG data.

SMI / NMI routable inputs

A pin configured as input with GPIROUTSMI=1 will trigger an SMI when its configured edge condition (rising, falling, level high, level low) is met. SMIs vector into SMM, which runs at the highest CPU privilege level and is therefore an interesting target: a primitive that lets an attacker influence the flow into SMM is the first step in many BIOS-level privilege escalations.

The first-order filter is "input + SMI routed". Not every such pin is reachable by a software attacker — some live on internal traces only, some are physically inaccessible to a non-root user, some are configured as level-high with an external pull that prevents firing in practice. The report flags candidates and leaves the physical-access part for a human pass; that is the right scope for a static pipeline.

The much more interesting signal is the lock state. The Intel PCH has a per-group LOCK register that, once set, prevents further writes to PADCFG until reset. Firmware sets the LOCK on critical pads after configuring them, so that later code (including a malicious DXE driver) cannot rewrite the routing. An unlocked SMI-routable input is qualitatively different from a locked one: the unlocked case lets a future code path re-aim the SMI to a chosen edge or invert the polarity, both of which can convert a benign signal into an attacker primitive. The toolkit cross-references PADCFG against the group LOCK register state captured in a collect_gpio.sh run from a live ThinkPad and reports unlocked + SMI-routable as the genuine risk class.
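The two-stage filter (input + SMI/NMI routed, then lock state) can be sketched as follows. The DW0 bit positions are the ones used in coreboot's Intel GPIO definitions and should be checked against the datasheet for the PCH in question; the Pad record and the bucket strings are hypothetical, not gpio_security.py's actual output.

```python
from dataclasses import dataclass
from typing import Optional

# PADCFG DW0 fields (assumed positions; verify per PCH datasheet).
GPIOTXDIS = 1 << 8        # TX buffer disabled
GPIORXDIS = 1 << 9        # RX buffer disabled
PMODE_MASK = 0x7 << 10    # 0 = GPIO mode, non-zero = native function
GPIROUTNMI = 1 << 17
GPIROUTSMI = 1 << 18


@dataclass
class Pad:
    name: str
    dw0: int
    locked: Optional[bool]  # None when no live capture is available


def classify(pad: Pad) -> str:
    """Bucket a pad by SMI/NMI routing and, if known, its lock state."""
    gpio_mode = (pad.dw0 & PMODE_MASK) == 0
    rx_enabled = not (pad.dw0 & GPIORXDIS)
    routed = bool(pad.dw0 & (GPIROUTSMI | GPIROUTNMI))
    if not (gpio_mode and rx_enabled and routed):
        return "not routable"
    if pad.locked is None:
        return "routable, lock unknown"
    return "locked + routable" if pad.locked else "unlocked + routable"
```

Only the "unlocked + routable" bucket corresponds to the genuine risk class described above; the "lock unknown" bucket is what a firmware-only run can say without a capture.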

Software-controlled privacy GPIOs

A second class of report findings is the privacy / security device GPIO: a pin that controls something a user trusts visually (a camera kill switch, a microphone mute state, a TPM "physical presence" line, a fingerprint reader power line) but that can be written to from software. The trust assumption behind a privacy LED is that a lit LED means the camera is on and a dark LED means the camera is off — that the indicator and the underlying device share fate. If the indicator is driven by a software-controlled GPIO, that fate-sharing is software-mediated, and software can lie about it.

Hard-wired privacy mechanisms (a physical slide that breaks the camera's power line, or a mechanical cover over the lens) do not have this problem and are the right answer to it. Most ThinkPads with a "ThinkShutter" sliding cover are in this category. ThinkPads that use software mute or software camera-disable, with no hardware interlock, are in the soft-privacy category and the report flags them as such.

Privacy GPIO posture is a per-model property derivable statically

gpio_security.py identifies which devices in the DSDT carry a software-controlled GPIO matching the privacy / kill-switch heuristic (camera, mic, fingerprint, TPM provisioning). With a live capture, the report adds the lock-state field. The result is a per-model posture summary: which privacy indicators on this ThinkPad model are spoofable from privileged software.

tool: gpio_security.py · live-capture input: collect_gpio.sh

12Security: CVE intelligence and module-level diff

From "the BIOS readme says CVE-2022-xxxxx is fixed" to "here is the UEFI module that grew".

Lenovo's BIOS readme files include a section that, on most models, lists the security updates included in the new revision: CVE IDs, Lenovo advisory IDs (LEN-12345), the subsystems touched (SMM, BootGuard, TXT, microcode, TPM), and sometimes a short text description. The format drifts gently over the years and across product lines, so the readme miner is intentionally schema-light: it looks for anything that matches a CVE pattern (CVE-\d{4}-\d+), a Lenovo advisory pattern (LEN-\d+), and the standard subsystem keywords (SMM, SMI, TPM, microcode, BootGuard, TXT, flash, secure boot), and then groups them per BIOS revision.
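A minimal sketch of such a schema-light miner, using the two patterns and the keyword list quoted above. The function names and the shape of the returned dictionaries are assumptions for illustration, not the toolkit's actual interface, and the identifiers in the usage note are placeholders rather than real advisories.

```python
import re
from collections import defaultdict

CVE_RE = re.compile(r"CVE-\d{4}-\d+")
LEN_RE = re.compile(r"LEN-\d+")
SUBSYSTEMS = ("SMM", "SMI", "TPM", "microcode", "BootGuard",
              "TXT", "flash", "secure boot")
SUBSYS_RE = re.compile(
    "|".join(rf"\b{re.escape(k)}\b" for k in SUBSYSTEMS), re.IGNORECASE
)


def mine_readme(text):
    """Pull CVE IDs, advisory IDs and subsystem keywords out of free text."""
    return {
        "cves": sorted(set(CVE_RE.findall(text))),
        "advisories": sorted(set(LEN_RE.findall(text))),
        "subsystems": sorted({m.group(0).lower()
                              for m in SUBSYS_RE.finditer(text)}),
    }


def cve_index(per_revision):
    """Invert {revision: mined} into {cve: [revisions that mention it]}."""
    index = defaultdict(list)
    for rev, mined in sorted(per_revision.items()):
        for cve in mined["cves"]:
            index[cve].append(rev)
    return dict(index)
```

For example, `mine_readme("Fixes CVE-2099-0001 (LEN-00000): SMM hardening; TPM self-test.")` yields one CVE, one advisory, and the subsystems `smm` and `tpm`. No field positions are assumed, which is what lets the same code survive format drift across product lines.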

Across the archive sample so far, the miner surfaces 15 unique CVEs and 20 advisories across 18 models. Spectre, MDS and Foreshadow fixes appear repeatedly because they were mitigated incrementally across many microcode and SMM revisions, and the dataset gives a clean fleet-view of who was patched when. The output is a per-model inventory and a fleet CVE→model index that lets you ask "which models, on which BIOS versions, fixed CVE-2022-xxxxx?".

Module-level diff: where did the fix land?

The natural next question is where in the binary the fix landed. The module-diff tool answers it. Two BIOS images are run through uefiextract, every File in the resulting tree is hashed by its body contents, and the trees are diffed: which modules changed, which were added or removed, and how much each one grew or shrank. A "module" here is a UEFI File keyed by its UI Section name when available (e.g. FlashUtilitySmm, TcgPei) and by its GUID otherwise.
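The diff itself is simple once each tree has been flattened to a mapping from module key to body bytes. The sketch below assumes that flattening has already happened (walking the uefiextract output is out of scope here); the function names are hypothetical.

```python
import hashlib


def body_digest(body):
    """Hash a module by its body contents only."""
    return hashlib.sha256(body).hexdigest()


def diff_modules(old, new):
    """Compare two {module_key: body_bytes} maps.

    module_key is the UI Section name when available, else the File GUID.
    Returns added/removed keys and, for changed modules, the size delta.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for key in sorted(old.keys() & new.keys()):
        if body_digest(old[key]) != body_digest(new[key]):
            changed[key] = len(new[key]) - len(old[key])
    return {"added": added, "removed": removed, "changed": changed}
```

A module whose body changes without changing size still shows up, with a delta of 0, because the comparison is on the hash rather than the length.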

On the ThinkPad 13 transition r0buj24ww→r0buj26ww, the module diff reports 66 of 546 modules changed, with a small number of SMM-class modules growing noticeably: FlashUtilitySmm (+1.1 KB), SystemSecureFlashSleepTrapSmm (+0.4 KB), TcgPei (+0.7 KB). Each of those changes correlates well with a security claim in the new BIOS readme: an SMI handler fix, a TPM self-test fix.

The secdiff capstone

fw_secdiff.py is the auto-correlator. It takes the per-revision readme miner output and the module diff, and for each readme security claim it proposes the changed module(s) most likely to implement the fix — based on name keywords (an SMI fix tends to land in a module with Smm or SMI in the name), size delta (a fix usually grows the module), and co-occurrence across multiple revisions (a module that consistently grows alongside a particular subsystem's fixes is a strong candidate). The output is, per readme claim, a ranked list of likely-fix modules with the per-module size delta and the keyword match.
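The three signals can be combined into a simple additive score. This is a sketch of the ranking idea only: the weights, the keyword table, and the function name are assumptions, and the real correlator may weigh things differently.

```python
# Hypothetical subsystem -> name-fragment table (lowercase).
KEYWORDS = {
    "SMM": ("smm", "smi"),
    "TPM": ("tcg", "tpm"),
}


def rank_candidates(subsystem, changed, history=None):
    """Rank changed modules as likely homes for a readme security claim.

    changed: {module_name: size_delta_bytes} from the module diff.
    history: {module_name: times it grew alongside this subsystem's fixes}.
    Returns [(module_name, size_delta, score)], best candidate first.
    """
    history = history or {}
    needles = KEYWORDS.get(subsystem, (subsystem.lower(),))
    scored = []
    for name, delta in changed.items():
        score = 0.0
        if any(n in name.lower() for n in needles):
            score += 2.0                    # name keyword match
        if delta > 0:
            score += 1.0                    # a fix usually grows the module
        score += 0.5 * history.get(name, 0)  # co-occurrence across revisions
        if score > 0:
            scored.append((score, name, delta))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(name, delta, score) for score, name, delta in scored]
```

Sorting ties by name keeps the output stable between runs, which matters when the ranked lists are themselves diffed across revisions.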

r0buj24ww → r0buj26ww: TPM and SMM fixes correlate to `TcgPei` and the SMI modules

Readme claims a TPM self-test improvement → only TcgPei contains Tcg in its name and grew across the diff. Readme claims an SMM hardening → FlashUtilitySmm and SystemSecureFlashSleepTrapSmm grew and match the keyword Smm. Both attributions confirmed by hand-decompilation of the changed PE32+ images.

tools: fw_moddiff + fw_secdiff · samples: r0buj24ww, r0buj26ww

Cross-checking deployment with `cve_xcheck`

A readme that names a CVE is one model's statement about one BIOS revision. The interesting questions sit one level up: did the same patch land on every model whose firmware shares the affected module? Did any model silently ship the fix without listing the CVE? Did any model claim the CVE without the matching binary change? cve_xcheck.py joins fw_moddiff and bios_secupdate output to answer all three at once. It takes a reference rev pair as the ground-truth fix, builds a fix signature (the set of modules that changed plus their body hashes), and sweeps a corpus of other BIOSes. Each sequential rev pair in the corpus is classified into one of four buckets:

confirmed — the candidate's new revision carries the fix signature and its readme claims the CVE. Expected case across the fleet.
silent patch — the candidate carries the fix signature but the readme is quiet. Often the most useful output: a model that shipped the fix without notifying anyone outside the BIOS pipeline.
claim without fix — the readme claims the CVE but the binary doesn't show the matching change. The claim may apply to a different code path that the static signature missed, or it may be genuinely misleading; either way it is worth a closer look.
unaffected / unfixed — pre-fix module body still present. Residual exposure if the model is supposed to be patched, or genuinely unaffected if the module isn't relevant to the SKU.

The classifier uses a 50%-threshold majority vote across the signature's modules so that a single mismatched module doesn't flip a candidate's bucket on its own; the threshold is a starting point and will need tuning per-CVE once the full archive sweep produces ground-truth labels.
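The four buckets and the threshold vote fit in a short function. In this sketch the fix signature is a map from module key to the post-fix body hash, and whether exactly 50% counts as a match is an assumption (treated as inclusive here); names are hypothetical.

```python
def carries_fix(signature, candidate, threshold=0.5):
    """True when enough signature modules match the post-fix body hash.

    signature: {module_key: fixed_body_hash} from the reference rev pair.
    candidate: {module_key: body_hash} for the candidate's new revision.
    """
    if not signature:
        return False
    hits = sum(1 for mod, h in signature.items() if candidate.get(mod) == h)
    return hits / len(signature) >= threshold


def classify_pair(signature, candidate, readme_claims_cve, threshold=0.5):
    """Place one candidate revision into one of the four buckets."""
    fixed = carries_fix(signature, candidate, threshold)
    if fixed and readme_claims_cve:
        return "confirmed"
    if fixed:
        return "silent patch"
    if readme_claims_cve:
        return "claim without fix"
    return "unaffected / unfixed"
```

With a three-module signature, one mismatched module leaves the candidate at 2/3 and still in the "fixed" half, which is the robustness the vote is meant to provide.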

13Coreboot porting from the archived BIOS

How much of a coreboot board port can be generated from the firmware image alone.

A new coreboot board port traditionally starts with a working board, an inteltool capture, and a few days of careful hand-translation: walk the PCI tree, transcribe each device into devicetree.cb, decode the live GPIO PADCFG and translate to PAD_CFG_* macros, write a flash descriptor map, transcribe board straps. The toolkit's claim is that the archived BIOS, on its own, contains enough of that information to skip the inteltool step entirely and produce most of the boilerplate.

What can be auto-generated

coreboot artifact	Source in the OEM BIOS	Toolkit step	Confidence
`devicetree.cb`	DSDT PCI tree + ACPI device list	`gen_devicetree.py`	high
`gpio.h` (PAD_CFG_*)	OEM PADCFG carrier (structurally discovered)	`coreboot_gpio.py`	high
`board.fmd`	Intel flash descriptor (signature at offset 0x10)	`flash_fmd.py`	high
microcode files (.PAT)	BIOS package + carved 0x800-aligned blobs	`blob_extract.py`	high
VBT (iGPU)	FV section, known GUID	`blob_extract.py`	high
FSP binary	Pre-CNL: integrated into PEI (0 carves). CNL+ family: discrete FSP-T/M/S blobs carried inside the BIOS image	`fsp_upd.py` · `fsp_carve.py`	N/A on Skylake / Kaby Lake; high on Whiskey / Coffee / Comet / Tiger Lake
Memory-down SPD	Verified absent in OEM BIOSes	`mrc_spd.py`	N/A — read live
Board strap mapping	EC firmware reads it at boot, not in image	—	requires hardware

The FSP reality

The Intel Firmware Support Package (FSP) is the binary blob that initializes memory, the CPU complex, and the PCH on Intel platforms. coreboot consumes it as a binary input; without it, a port cannot boot. The naive assumption is that the OEM BIOS embeds the FSP at a known offset, ready to be carved out.

The reality is generation-specific, and that nuance only surfaced once the archive sweep ran across multiple SoC families:

Pre-CNL (Skylake, Kaby Lake): OEM Lenovo BIOSes compile FSP into the PEI flow rather than carrying it as a standalone binary, and the standalone FspUpdRegion is frequently compressed inside another FFS. fsp_upd.py honestly reports "FSP UPD region not present" on these and the practical answer is Intel's reference FSP for the SoC.
Cannon Point-LP family and later (Whiskey / Coffee / Comet Lake) and Tiger Lake: the OEM BIOS carries discrete FSP-T / FSP-M / FSP-S blobs inside the image — what looked like a generation-spanning limitation in the Skylake-only sample was actually a Skylake-era packaging choice that did not carry forward. fsp_carve.py extracts these as native-format .fd files; see "Aggressive carving" below for the sweep numbers.

Aggressive carving: `fsp_carve.py`

Honest "not present" is the right default, but it leaves a real question open: could there be FSP fragments the standard path is missing? fsp_carve.py is the more aggressive companion that takes six passes over the same image and lets the operator decide. The byte-level scan over the raw .fl1 catches FSP fragments outside the FFS tree, where uefiextract's section parser doesn't surface them. A per-SoC ImageId scan ( $SKLFSP$ , $TGLFSP$ , $ADLFSP$ , and the other 14 known tags) is a secondary anchor that finds FSPs even when the FSPH magic is misaligned by a wrapper. Every FFS file's GUID is checked against the known Intel-FSP-package GUIDs published in the per-SoC .fdf files. Compressed sections are opportunistically LZMA-decompressed and re-scanned; Phoenix LATC wrappers are flagged for manual unwrap. With --exe, the InnoSetup payload is enumerated for any FSP-shaped files sitting alongside WinFlash.exe.

The header decoder is permissive across FSP spec revisions: FSP v1.x and v2.x disagree on where ImageId and ImageSize live within the header, and fsp_carve.py finds ImageId by ASCII pattern rather than by fixed offset, then back-figures the layout. This matters because the original strict v1-only decoder rejects otherwise-valid v2 FSPs as malformed — a likely contributor to the "0 FSPH headers on TP13" outcome. Every candidate is dumped with its SHA-256 and exact byte range in the manifest, so false positives are verifiable by hand.
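The pattern-first approach can be sketched as a scan for the ASCII ImageId followed by a backward search for the FSPH signature. The regular expression covers only the `$xxxFSP$` shape of the tags named in this writeup, and the 0x40-byte look-back window is an assumption; neither is taken from fsp_carve.py itself.

```python
import re

# 8-byte tags such as $SKLFSP$ or $TGLFSP$ (assumed shape, see lead-in).
IMAGE_ID_RE = re.compile(rb"\$[A-Z0-9]{3}FSP\$")
FSPH = b"FSPH"


def find_fsp_candidates(image, window=0x40):
    """Return [(image_id, image_id_offset, fsph_offset_or_None)].

    The ImageId is located by pattern, not by a fixed header offset; the
    nearest preceding FSPH signature within `window` bytes, if any, is
    reported so the header layout can be worked out from the distance.
    """
    results = []
    for m in IMAGE_ID_RE.finditer(image):
        lo = max(0, m.start() - window)
        hdr = image.rfind(FSPH, lo, m.start())
        results.append((m.group(0).decode("ascii"), m.start(),
                        hdr if hdr >= 0 else None))
    return results
```

A hit with no FSPH in range is still reported, which matches the "dump every candidate and let the operator decide" stance: a wrapper that misaligns the magic should not hide the blob.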

The memory-down SPD finding

A coreboot port for a memory-down board (one with soldered DRAM, no DIMM slot) needs a JEDEC SPD byte stream describing the DRAM's geometry, timing, and refresh parameters. The natural place to look is the OEM BIOS, which already has the same information — it must, in order to bring up memory. mrc_spd.py is a structural probe that scans every flash payload for a candidate SPD byte sequence and validates each candidate using JEDEC's CRC-16, with type and revision filtering to reject coincidental hits. The CRC implementation was end-to-end verified by reproducing the stored CRC of known-good coreboot DDR4 SPDs exactly.
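The validation step can be sketched for the DDR4 base block. The CRC is the JEDEC CRC-16 (polynomial 0x1021, initial value 0); for DDR4 it covers bytes 0 to 125 and is stored low byte first at bytes 126 and 127, with byte 2 carrying the device type (0x0C for DDR4). The function names are hypothetical and the sketch ignores the second DDR4 block and other DRAM generations.

```python
def crc16_jedec(data):
    """CRC-16 with polynomial 0x1021 and initial value 0 (MSB first)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = (crc << 1) ^ 0x1021
            else:
                crc <<= 1
            crc &= 0xFFFF
    return crc


def ddr4_base_block_ok(spd):
    """Check the DDR4 base block: type byte first, then the stored CRC."""
    if len(spd) < 128:
        return False
    if spd[2] != 0x0C:          # type filter rejects coincidental hits
        return False
    stored = spd[126] | (spd[127] << 8)
    return crc16_jedec(spd[:126]) == stored
```

The type filter is not optional: a run of 128 zero bytes has a CRC of 0 and would otherwise validate against its own zeroed CRC field.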

Across 8 diverse OEM ThinkPad BIOSes, the probe found zero embedded SPDs. Lenovo's MRC keeps memory-down configuration in proprietary PEI policy structures rather than as a flash SPD image, so a memory-down coreboot port must read the SPD from the live board with decode-dimms or use Intel reference values for the matching DRAM part. The detector still produces a useful geometry decode when a real SPD is present (early test images, third-party BIOSes, coreboot snapshots in the same archive).

SPD is the firmware-only hard limit; FSP turned out to be generation-specific

A memory-down SPD is kept in proprietary PEI policy, not embedded as a flash image; none of the 8 sampled OEM BIOSes carries one (0 / 8). That is a real hard limit: a memory-down coreboot port has to read SPD from the live board with decode-dimms or use Intel reference values for the matching DRAM part.

FSP looked the same way on the Skylake-only sample the original writeup was based on, but the 11-BIOS sweep changed the picture. Pre-CNL (Skylake, Kaby Lake) really do integrate the FSP into PEI and yield 0 carves. From the Cannon Point-LP family forward, however, Lenovo ships discrete FSP-T / FSP-M / FSP-S blobs inside the BIOS image, and fsp_carve.py extracts them: 14 candidates on a Whiskey Lake L590 and a Comet Lake T14 G1, 10 on a Tiger Lake T14 G2 — each set including a full FSP-S (200–370 KB), a full FSP-M (425–650 KB), and a full FSP-T (16–28 KB), all with valid FSP_INFO_HEADERs and per-SoC ImageId tags ( $CFLFSP$ , $CMLFSP$ , $TGLFSP$ ). For any modern ThinkPad the OEM BIOS therefore carries the exact-match FSP, which is a better input to a coreboot port than the generic Intel reference package.

tools: fsp_upd.py + fsp_carve.py + mrc_spd.py · sample: SPD finding from 8 diverse OEM BIOSes; FSP carving validated across the 11-BIOS sweep (Skylake…Tiger Lake plus Bristol Ridge)

14Hackintosh porting: an OpenCore skeleton from the BIOS

macOS does not run on a ThinkPad out of the box. The boilerplate it needs to get close is, however, derivable.

The Hackintosh community has converged on OpenCore as the bootloader and on a standard set of patches to make a generic Intel laptop boot a recent macOS: ACPI SSDTs for EC, USB, RTC, brightness, and sleep / wake; kernel extensions (kexts) for audio, ethernet, wifi, trackpad, and iGPU; an SMBIOS impersonation of a similar real Mac so the kernel takes the right code paths. Every Hackintosh port starts from the same boilerplate, and that boilerplate is what the toolkit auto-generates.

The SSDTs

gen_ssdt.py emits the standard set of patch SSDTs from facts in the decompiled DSDT:

SSDT-EC: an "Embedded Controller" device that macOS expects to be named EC and accept the standard EC ops — sometimes the OEM DSDT uses a different name (H_EC, ECDV), so the SSDT renames.
SSDT-PLUG: enables Apple's X86PlatformPlugin by setting plugin-type on CPU0.
SSDT-AWAC: forces use of the legacy RTC clock instead of the AWAC (ACPI Wake Alarm Clock) on platforms where macOS doesn't understand AWAC.
SSDT-PNLF: adds the PNLF backlight device so macOS can control internal-panel brightness on the iGPU.
SSDT-XOSI: redirects _OSI("Windows") queries so that Windows-specific paths in the DSDT (typically the ones that touch features macOS doesn't support) get taken.
SSDT-USBX: declares the USB power properties macOS expects.
SSDT-GPRW: corrects the wake mask for sleep / wake bring-up.

The output is iasl-clean ASL: the SSDTs compile without warnings under stock ACPICA, which matters because Hackintosh ports historically ship hand-edited SSDTs that accumulate iasl warnings over years, and untangling those is a real time sink.

The kext map and the iGPU framebuffer

kext_map.py walks the device inventory (PCI vendor/device IDs + ACPI HIDs) and emits the kext set: Lilu and WhateverGreen as base, IntelMausi for e1000-class NICs, AppleALC with a codec layout id derived from the audio device's subsystem ID, VoodooPS2Controller for the trackpad, and so on. igpu_fb.py picks a WhateverGreen ig-platform-id based on the iGPU's PCI device id and CPU generation, and emits a connector layout (an internal eDP panel plus two external DP / HDMI connectors, matching the typical ThinkPad chassis).

SMBIOS impersonation

The kernel's behavior changes based on the SMBIOS product name (which Mac it thinks it is running on). For a ThinkPad of a given CPU generation and form factor, the right impersonation is the matching mobile Mac of the same era — a quad-core Coffee Lake ThinkPad maps to MacBookPro15,2, a dual-core Whiskey Lake to MacBookPro15,4, etc. smbios_pick.py encodes the CPU family + segment matrix and picks a sensible default; gen_opencore.py folds the choice into the PlatformInfo section of the generated config.plist.

Hackintosh support is CPU-gen-gated, and the skeleton is just a skeleton

The generator handles Skylake through Comet Lake well; Ice Lake and Tiger Lake partially (the iGPU layouts shift); Alder Lake and AMD are not Hackintosh candidates and the toolkit declines them. Beyond the skeleton, real Hackintosh bring-up still needs human work for audio layout, trackpad calibration, sleep stability, and the persistent NVRAM. The skeleton's value is that it skips the mechanical day-one work; it does not skip the bring-up week.

tool: gen_opencore.py · cpu coverage: SKL..CML good, ICL/TGL partial

15Coverage harness: making the "handles all firmwares" claim measurable

A claim scales to a fleet only if it is continuously measured.

The temptation when building an extraction pipeline is to test it on a few representative images, declare it good, and move on. That temptation gets punished the first time someone runs the pipeline on a new generation, a new vendor, or a legacy non-UEFI BIOS, and discovers that it silently produces nothing useful. batch_extract.py exists to make that failure mode loud and structured: every BIOS in a sweep gets classified by outcome, every failure carries a reason code, and new failure classes accumulate at the top of the report until they get handled.

The classifications, in order of frequency:

Extracted (Intel): full ACPI carve, Intel PADCFG present, AcpiPlatform located — the happy path on Skylake-and-later Intel ThinkPads.
Extracted (AMD): nested-FFSv2 layout decoded, DSDT carries AGPIO literals, FCH PADCFG decoded.
Extracted (Qualcomm): DSDT decoded, GPIO controller HID matched to Qualcomm; stubbed pending an X13s sample.
Skipped — non-BIOS payload: <id>w.exe is a driver bundle, not a BIOS update; classified by the payload-listing pre-filter and not counted as a failure.
Failed — legacy non-UEFI BIOS: the payload is a flat BIOS image with no firmware volumes, typical of pre-2010 ThinkPads. Outside the current pipeline's scope.
Failed — capsule format unhandled: the payload is a signed UEFI capsule whose inner image cannot be opened with the current tools.
Failed — extraction tool error: uefiextract aborts on a malformed file; usually a parser bug worth fixing upstream.

The current sweep produces 12 of 12 extractions on the diverse sample (multi-vendor, multi-generation) and 9 of 9 on the BIOS-payload-classification check. As the archive grows the sweep grows with it; new failure classes surface naturally, and the toolkit hardens against them one at a time.
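The bookkeeping behind an "N of N" figure can be sketched as an outcome enum plus a tally in which skipped non-BIOS payloads do not count against the extraction rate. The enum members mirror the classifications listed above; the record shape and function name are hypothetical.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    EXTRACTED_INTEL = "Extracted (Intel)"
    EXTRACTED_AMD = "Extracted (AMD)"
    EXTRACTED_QUALCOMM = "Extracted (Qualcomm)"
    SKIPPED_NON_BIOS = "Skipped: non-BIOS payload"
    FAILED_LEGACY = "Failed: legacy non-UEFI BIOS"
    FAILED_CAPSULE = "Failed: capsule format unhandled"
    FAILED_TOOL = "Failed: extraction tool error"


def summarize(results):
    """Tally a sweep given (bios_id, Outcome, reason) records."""
    results = list(results)
    counts = Counter(outcome for _, outcome, _ in results)
    extracted = sum(n for o, n in counts.items()
                    if o.name.startswith("EXTRACTED"))
    skipped = counts[Outcome.SKIPPED_NON_BIOS]
    failures = [(bios, o.value, reason) for bios, o, reason in results
                if o.name.startswith("FAILED")]
    return {
        "extracted": extracted,
        "attempted": len(results) - skipped,  # skips are not failures
        "skipped": skipped,
        "failures": failures,                 # each carries its reason code
    }
```

Keeping the reason string on every failure record is what lets a new failure class show up as a distinct line in the report rather than as a lower percentage.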

16Live ground truth: closing the loop with `collect_gpio.sh`

The last mile, on hardware, with a POSIX-sh script and no special tools.

Some pad data is not in the firmware image. The resolved gpio → consumer map only exists on a running machine, because board-variant selection runs at boot from hardware straps; the runtime PADCFG with the actual LOCK bits set is also only available live. collect_gpio.sh is a short POSIX-sh script that runs as root on any live ThinkPad and produces a self-describing tarball, with no dependencies beyond coreutils, dmidecode and an optional python3 for the NVS region dump.

File in the tarball	Source on the live system	What it gives the resolver
gpio.txt	/sys/kernel/debug/gpio	resolved gpio → consumer map — the primary ground truth
pinctrl/<ctrl>/*	/sys/kernel/debug/pinctrl	per-pad config + gpio-ranges (gpio# → pad mapping)
acpi/DSDT.aml, SSDT*	/sys/firmware/acpi/tables	cross-check against the archived BIOS tables
acpi_nvs_*.bin	/dev/mem (per /proc/iomem)	GNVS region with runtime values — the resolver's missing half
acpi_devices.txt	/sys/bus/acpi/devices	full ACPI device list (HID → path)
dmi_id.txt	/sys/class/dmi/id	model, MTM, BIOS version (identity only; redact serial before sharing)

On a Whiskey Lake ThinkPad with the CNL-LP PCH (GPIO controller HID INT34BB), a capture exercises every step of the Intel resolver: the chipset id CC at the top byte of every GPIO_PAD matches the runtime; the GNVS values written by AcpiPlatform match the values visible in acpi_nvs_*.bin; the PADCFG mode/direction in the runtime pinctrl matches the static decode. Comet Lake, Tiger Lake, Alder Lake and AMD platforms work identically; only the controller HID changes. The script degrades gracefully on older kernels and on machines without a pinctrl driver, recording what it could not capture and continuing.

The shape of the capture matters more than any single machine's bytes: the same script run on a hundred ThinkPads would produce a hundred per-board posture reports, indexable by model and BIOS version, that together describe the security posture of an entire fleet. The toolkit is structured to consume the captures in aggregate, not just one at a time.

17Honest boundaries

Where the firmware-only path stops and the physical board starts.

Board-variant pins need the hardware board ID. Native-mode elimination narrows the candidate set without the board, sometimes to a single pad, but cannot guarantee resolution in all cases. The toolkit emits the per-variant table so that a live capture or hand-annotation can finish the job.
Group → GPP-bank naming is discovered empirically. The chipset id (the top byte of the GPIO_PAD immediate) is an FSP-package identifier — the same PCH can carry different chipset ids depending on whether the OEM integrated the Whiskey, Coffee, or Comet Lake FSP, for example. Rather than rely on a hardcoded mapping that gets things wrong whenever a new FSP build ships, the toolkit walks the PADCFG carrier for its longest pure stride-12 run and uses the dominant top byte from that run. Verified across the archive: 0x02 = SPT-LP, 0x03 = CML-LP, 0x04 = WHL/CFL-LP, 0x09 = TGL-LP. Unknown chipset ids surface as UNK<cc>_G<gg>_<p> instead of being dropped, so a new SoC produces structured output from day one.
The PADCFG carrier and AcpiPlatform have OEM-specific names. Lenovo uses different module names across the archive (PlatformInit, PeimBoardInit, PeiBoardConfigInit, BoardConfigInitPreMem for PADCFG; AcpiPlatform, AcpiPlatformDxe, LenovoAcpiPlatform for the GNVS writer) and the names shift across SoC generations. gpio_padmap is fully structural — longest pure-CC run wins regardless of name. gpio_resolve tries the known names first and falls back to a structural counter (distinct known-good GPIO_PAD encodings per PE body) when none match. The structural AcpiPlatform fallback is non-crashing on Tiger Lake but does not yet correctly identify the renamed module; resolution returns 0 devices on TGL until the actual name is added to the fast path.
FSP is integrated, not standalone — with a footnote. Pre-CNL Lenovo BIOSes (Skylake, Kaby Lake) compile FSP into PEI and yield 0 carves; from the Cannon Point-LP family forward the OEM image carries the discrete FSP-T / FSP-M / FSP-S blobs, and fsp_carve.py extracts them with valid FSP_INFO_HEADERs and per-SoC ImageIds ( $CFLFSP$ , $CMLFSP$ , $TGLFSP$ ). Where the carve does come up empty, Intel's reference FSP for the SoC remains the alternate source.
macOS support is CPU-generation-gated. Skylake through Comet Lake is mature; Ice Lake and Tiger Lake partial; Alder Lake and AMD not candidates. The Hackintosh generator declines unsupported generations explicitly.
linux-hardware.org log bodies are JS-rendered. lhw_fetch.py recovers machine identity from the static parts of the page but cannot extract the dmidecode or acpidump bodies, which the site renders client-side. For remote ground-truth GPIO data use coreboot board-status inteltool dumps, or run collect_gpio.sh on the live hardware.
Memory-down boards need live SPD. No embedded SPD was found in OEM ThinkPad BIOSes (0 of 8 sampled). Pull it with decode-dimms on the live board or use Intel reference values.
EC instruction set coverage now ITE + Renesas. renesas_rl78_disasm.py covers RL78 in v1.1. ITE 8051 was already there. Renesas RX (used by some workstation EC parts) is the remaining gap.
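The structural chipset-id discovery described in the list above (longest pure stride-12 run, dominant top byte) can be sketched as follows. The record layout is an assumption: a little-endian 32-bit GPIO_PAD at the start of each 12-byte record, so the chipset id is the byte at index 3. The 0x00 / 0xFF exclusion, the minimum run length, and the exact UNK formatting are also assumptions; a real scan would additionally validate the group and pad fields.

```python
# Chipset ids reported as verified in this writeup.
KNOWN_CHIPSET_IDS = {0x02: "SPT-LP", 0x03: "CML-LP",
                     0x04: "WHL/CFL-LP", 0x09: "TGL-LP"}


def longest_pure_run(blob, stride=12, cc_index=3, min_run=4):
    """Find the longest run of stride-spaced records sharing one top byte.

    Returns (offset, record_count, chipset_id), or None if no run reaches
    min_run. Padding values 0x00 and 0xFF never start or extend a run.
    """
    best_len, best_off, best_cc = 0, None, None
    for phase in range(stride):             # try every record alignment
        run_len, run_off, run_cc = 0, None, None
        for pos in range(phase, len(blob) - stride + 1, stride):
            cc = blob[pos + cc_index]
            if cc in (0x00, 0xFF):
                run_len, run_off, run_cc = 0, None, None
            elif cc == run_cc:
                run_len += 1
            else:
                run_len, run_off, run_cc = 1, pos, cc
            if run_len > best_len:
                best_len, best_off, best_cc = run_len, run_off, run_cc
    if best_len < min_run:
        return None
    return best_off, best_len, best_cc


def unknown_pad_name(gpio_pad):
    """Structured fallback name for a pad whose chipset id is not known."""
    cc = gpio_pad >> 24
    gg = (gpio_pad >> 16) & 0xFF
    p = gpio_pad & 0xFFFF
    return f"UNK{cc:02X}_G{gg:02X}_{p}"
```

Because the id is read from the run rather than looked up by module name, a renamed carrier or a new FSP build changes nothing in the scan; an id missing from `KNOWN_CHIPSET_IDS` simply falls through to the UNK naming.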

18What's next

The interesting unfinished work, in rough order of leverage.

Run the full-archive sweep. The pipeline is validated across an 11-BIOS spread spanning Skylake through Tiger Lake on Intel plus Bristol Ridge on AMD; the bugs that surfaced during that sweep are fixed. The next step is to point the toolchain at every ThinkPad model and BIOS revision in the archive and publish the per-model coverage matrix as a companion page — the canonical answer to "does it work on my ThinkPad?".
Identify the Tiger Lake AcpiPlatform rename. Lenovo renamed AcpiPlatform on Tiger Lake (and likely Alder / Raptor / Meteor Lake). gpio_resolve's structural fallback runs without crashing on those samples but doesn't yet pin down the actual module — the GNVS resolution returns 0 devices on TGL. One pass of hand-analysis on a single TGL BIOS gives the renamed module's UI name, which then plugs into the fast-path list and unblocks TGL+ samples pipeline-wide.
Verify the provisional GROUP_NAMES entries. ICL / ADL / RPL / MTL entries are transcribed from Intel FSP GpioLib headers. They have not yet been confirmed against in-archive samples. When the first ADL or RPL ThinkPad lands in the sweep, the empirical pure-CC scan will either confirm the entries or expose them as wrong (just as it did for the original TGL / CML guesses).
Fleet-wide CVE evolution. The CVE intelligence becomes far more interesting at scale: "all SMM fixes Lenovo shipped between 2020 and 2025, by CVE, by module, by model" is a single report once the archive is fully indexed. The reporting code is in place; what's left is to actually run it across the whole archive and publish the resulting index.
Live-board feedback loop. Shipping collect_gpio.sh as a first-class artifact, with a small landing page that explains what it captures and what to do with the resulting tarball, would let third parties contribute ground truth back into the model coverage matrix. Aggregated correctly, that becomes a per-model security posture report that updates as Lenovo ships new BIOSes.
Coreboot board-status integration. Many of the same questions the toolkit answers are already partially answered for ported boards in coreboot's board-status repository (inteltool dumps, lspci, dmidecode). Joining the two datasets would give a richer cross-check on the resolver's accuracy on already-ported models and a head start on board-status for new ports.

The project is, in the end, a small bet that the value in the Lenovo firmware archive is mostly latent — that the bytes are out there, the tools to look at them are out there, but the glue that holds the pipeline together and turns one model's BIOS into useful answers about that model has not been written. Writing that glue, carefully and reproducibly, is what the codebase is for. The findings above are what came out of doing it.

19Contact & contributing

Where the code lives, where to file bugs, and how to reach the maintainer.

Repository

Source	codeberg.org/tetdrad0n/thinkpad-fw-analysis
Issues	codeberg.org/tetdrad0n/thinkpad-fw-analysis/issues
Pull requests	codeberg.org/tetdrad0n/thinkpad-fw-analysis/pulls
Maintainer profile	codeberg.org/tetdrad0n

Direct contact

Email	tetdrad0n@proton.me
Telegram	`@tetdrad0n`
Tox (uTox)	2032774D78DD625E94814247FB454846B41F320A98A24125D84107D88A6A5C19E3565D6AC07D

Contributing

Contributions are welcome. The highest-leverage open areas are listed in "What's next": identifying Lenovo's renamed AcpiPlatform module on Tiger Lake (to unblock TGL+ GNVS resolution), verifying the provisional GROUP_NAMES entries for ICL / ADL / RPL / MTL against in-archive samples, a Renesas RX disassembler for workstation EC parts, a Qualcomm validation against a real X13s capture, and ground-truth tarballs from collect_gpio.sh runs on hardware not yet in the coverage matrix.

Bug reports are useful at any level of detail; if you have a failing batch_extract.py run on a particular BIOS, attach the package URL or the model + MT + BIOS revision, and the classifier's output. New failure classes are how the pipeline hardens.

Pull requests should target main. Keep the no-folklore rule: PADCFG and register decodes from the official Intel datasheet (332691 and the per-SoC successors), UEFI / ACPI references from the published specifications, AMD work from the PPR rather than community write-ups.

20Revisions

Released versions and what's planned next. Tracks the public repository's tags; per-commit history lives in git log.

Version	Date	Status	Highlights
v1.2	TBD	planned	Full-archive sweep with per-model coverage matrix as a companion page; identification of Lenovo's TGL/ADL/RPL/MTL AcpiPlatform rename to upgrade `gpio_resolve` from "non-crashing" to "resolving" on newer SoCs; verified `GROUP_NAMES` bank orders for ICL/ADL/RPL/MTL once real samples land (current entries for those are FSP-header transcriptions, marked provisional); structural `find_padcfg_table` sped up via candidate-byte sweep so it scales to hundreds of models without becoming the bottleneck.
v1.1	2026-06-01	released	Validated end-to-end across an 11-BIOS sweep (Skylake / Kaby Lake-R / Coffee / Whiskey / Comet / Tiger Lake on Intel plus Bristol Ridge on AMD). Real bugs surfaced and fixed: (a) `soc_buses.chipset()` picked the wrong PCH from multi-return `_HID` methods (Sunrise Point-H instead of Cannon Point-LP on every Whiskey/Coffee/Comet ThinkPad); (b) `gpio_padmap`'s hardcoded module-name list and chipset-id table were wrong for most Lenovo BIOSes — replaced with structural discovery (longest pure-CC stride-12 run wins, chipset id derived from the run, unknown SoCs surface as `UNK<cc>_G<gg>_<p>` instead of being dropped); (c) `gpio_resolve`'s exact-name `AcpiPlatform` lookup hard-errored on Tiger Lake — added a list of known Lenovo names plus a structural fallback that's non-crashing on unfamiliar BIOSes. New tools landed in this release: Renesas RL78 disassembler (`renesas_rl78_disasm.py`) for non-ITE EC firmware; `cve_xcheck.py` four-bucket classifier (confirmed / silent patch / claim without fix / unaffected); `fsp_carve.py` aggressive FSP extraction with per-SoC ImageId scan and permissive multi-revision header decode. Major finding: 38 discrete FSP-T/M/S blobs carved from Whiskey / Coffee / Comet / Tiger Lake OEM BIOSes — the original Skylake-era "FSP is the hard pass" caveat was a generational accident, not a property of Lenovo BIOSes generally.
v1.0	2026-05-31	released	First public release. End-to-end pipeline (`pipeline.py`) on Intel + AMD with a Qualcomm stub; ACPI extraction via `uefiextract` + multi-fallback (12 / 12 on the diverse sample); Intel GpioLib resolver with native-mode elimination; PADCFG decode per Intel 332691; ITE 8051 EC disassembler with SFR annotation; GPIO security report (SMI / NMI surface + privacy pins, with live lock posture from `collect_gpio.sh`); BIOS-readme CVE miner; UEFI module diff (`fw_moddiff`) and CVE→module auto-correlator (`fw_secdiff`); coreboot board-port generator and OpenCore Hackintosh skeleton; coverage harness with BIOS-payload pre-filter. Verified findings: TP13 Skylake fingerprint reader resolves to GPP_A22 / GPP_A23; 0 / 8 OEM BIOSes embed a JEDEC SPD; r0buj24ww→r0buj26ww = 66 of 546 modules changed.
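
The structural discovery described under v1.1 (b) can be illustrated with a minimal sketch. It assumes each table record is 12 bytes (a little-endian 32-bit pad id followed by 8 bytes of pad configuration) and that the pad id packs chipset id, group and pad number as 0xCCGGPPPP. The plausibility bounds, the function names, the example group table and the hex formatting of the `UNK` label are illustrative assumptions, not the project's actual `gpio_padmap` code.

```python
import struct

STRIDE = 12  # assumed record size: 4-byte pad id + 8 bytes of pad config


def pad_fields(word):
    """Split an assumed 0xCCGGPPPP pad id into (chipset, group, pad)."""
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, word & 0xFFFF


def plausible(word):
    """Heuristic bounds (assumed): real chipset byte, small group and pad index."""
    cc, gg, pad = pad_fields(word)
    return cc not in (0x00, 0xFF) and gg < 0x20 and pad < 0x40


def run_length(blob, start, cc):
    """Count consecutive stride-12 records from `start` sharing chipset byte `cc`."""
    count, off = 0, start
    while off + STRIDE <= len(blob):
        word = struct.unpack_from("<I", blob, off)[0]
        if not plausible(word) or word >> 24 != cc:
            break
        count += 1
        off += STRIDE
    return count


def longest_run(blob):
    """Return (offset, record_count, chipset_id) of the longest pure-CC run."""
    best = (-1, 0, None)
    for start in range(len(blob) - STRIDE + 1):
        cc = blob[start + 3]  # top byte of the little-endian dword at `start`
        count = run_length(blob, start, cc)
        if count > best[1]:
            best = (start, count, cc)
    return best


def pad_name(word, group_names=None):
    """Name a pad from a {chipset_id: [group prefixes]} table (hypothetical shape),
    falling back to an UNK<cc>_G<gg>_<p> label instead of dropping the pad."""
    cc, gg, pad = pad_fields(word)
    names = (group_names or {}).get(cc)
    if names and gg < len(names):
        return f"{names[gg]}{pad}"
    return f"UNK{cc:02x}_G{gg:02x}_{pad}"
```

The brute-force scan re-measures sub-runs and is quadratic in the worst case, which is the kind of cost the candidate-byte sweep planned for v1.2 would avoid. Deriving the chipset id from the winning run, rather than from a module name, is what lets an unfamiliar SoC still produce labelled pads.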

Pre-1.0 capability milestones

Each entry below was the load-bearing change that unlocked the next pipeline stage. They are not separate releases — they are the milestones a contributor would want to know about when reading the codebase.

Milestone	What it added
Porting toolkit	Hackintosh OpenCore + coreboot board generators (`gen_opencore`, `gen_coreboot_port`, the shared RE passes).
Named-module diff + secdiff capstone	UEFI module-level binary diff + auto-correlation of readme security claims to the module that grew.
Security-update intelligence	BIOS readme mining for CVE / advisory / subsystem keywords, with a fleet CVE→model index.
Live lock / ownership posture	Real PADCFG `[LOCKED]` + ACPI / HOST ownership from a `collect_gpio.sh` capture, plus fleet GPIO diff.
GPIO applications	coreboot `gpio.h` generator (PAD_CFG_* macros sourced from the archived BIOS, not `inteltool`) + GPIO security report.
`uefiextract` as primary FV extractor	Promoted UEFITool's CLI to primary — turned the diverse-sample extraction rate from 7 / 12 to 12 / 12.
Robust multi-firmware extraction + coverage harness	`batch_extract.py` with classification of outcomes; new failure classes surface and get handled.
Multi-vendor GPIO support	AMD added; vendor dispatch via the ACPI GPIO-controller `_HID`.
8051 disassembler for ITE EC firmware	Pure-Python MCS-51 decode with SFR + ITE-custom-register annotation.
Native-mode elimination + board-id back-solve	`gpio_resolve.py` narrows board-variant candidate sets without the live board, by ruling out pads reserved for native functions.
Live-machine GPIO telemetry	`collect_gpio.sh` + `lhw_fetch.py` — the dynamic ground truth the firmware-only path needs.
Initial ACPI / GPIO analysis toolchain	The first end-to-end path: BIOS `.exe` → ACPI tables → per-device GPIO topology.
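
The "named-module diff + secdiff" milestone pairs two ideas: find the UEFI modules whose bytes changed between two BIOS builds, then match readme security claims to the changed modules that grew. A minimal sketch follows, assuming each build has already been reduced to a mapping of module name to (size, digest). The function names, the keyword hint table and the `SmiFlash` module name are hypothetical and do not reflect the actual `fw_moddiff` / `fw_secdiff` interfaces.

```python
def changed_modules(old, new):
    """Map module name -> size delta for modules present in both builds
    whose digest differs. Inputs map name -> (size, digest)."""
    return {
        name: new[name][0] - old[name][0]
        for name in old.keys() & new.keys()
        if old[name][1] != new[name][1]
    }


def correlate(claims, deltas, keyword_hints):
    """For each readme claim, list the changed modules that grew and whose
    name matches a hint for a keyword found in the claim, largest growth first."""
    result = {}
    for claim in claims:
        text = claim.lower()
        hints = [
            hint.lower()
            for keyword, names in keyword_hints.items()
            if keyword.lower() in text
            for hint in names
        ]
        hits = [
            name
            for name, delta in deltas.items()
            if delta > 0 and any(hint in name.lower() for hint in hints)
        ]
        result[claim] = sorted(hits, key=lambda n: (-deltas[n], n))
    return result


# Illustrative inputs only; sizes and digests are made up.
old_build = {"TcgPei": (4096, "aa"), "SmiFlash": (8192, "bb"), "Logo": (1024, "cc")}
new_build = {"TcgPei": (4352, "a1"), "SmiFlash": (8192, "b1"), "Logo": (1024, "cc")}
hints = {"TPM": ["Tcg"], "SMM": ["Smi", "Smm"]}  # assumed keyword -> name fragments

deltas = changed_modules(old_build, new_build)
report = correlate(["Fixed TPM vulnerability", "Updated logo"], deltas, hints)
```

Restricting matches to modules that grew is a deliberately narrow reading of "the module that grew"; a fix that shrinks or only rewrites a module would need a looser rule.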

Source: codeberg.org/tetdrad0n/thinkpad-fw-analysis · Issues: /issues · Contact: tetdrad0n@proton.me · Telegram

References: Intel 332691 (100-Series PCH Datasheet, Vol 2), Intel 332687 (Cannon Lake PCH), Intel 633331 (Tiger Lake PCH), Intel 645549 (Alder Lake PCH); UEFI Specification (2.10), UEFI Platform Init Specification; ACPI Specification (6.5); AMD Platform Programming Reference for the relevant FCH generations. Tooling: UEFITool, ACPICA iasl, innoextract, binwalk, 7z, Unicorn, Capstone, uefi-firmware-parser.
