A reverse-engineering writeup on turning Lenovo's published BIOS archive into something structured enough to drive coreboot ports, Hackintosh skeletons, GPIO security audits, and CVE-level firmware diffs — without owning the hardware, and without folklore.
Where the project stands before you read the long version.
The peculiar fact that a manufacturer hands you, for free, the exact bits that drive every machine it has ever sold.
The ThinkPad is one of the few mass-market PC product lines that has accumulated a parallel open-source culture around it: long-running coreboot ports for a handful of Sandy Bridge and Ivy Bridge classics, Linux quirks tables for every model since the T20, Hackintosh ports for the GoBook generation, a quietly enormous body of forum knowledge about exactly which kernel option calms which Lenovo BIOS bug. The reasons are not mysterious — ThinkPads ship with field-replaceable parts, decent keyboards, and BIOSes that the firmware team actually maintains for years — but the consequence is that a lot of independent engineering value lives downstream of Lenovo's "Drivers & Software" pages.
That archive is the part most people forget about. Lenovo Support publishes every BIOS
update for every supported model, every driver bundle, every flash utility, organized
by machine type and release date. For a model still under support you get the
complete historical chain — every microcode bump, every Intel ME revision, every
SMM hardening, every CVE fix — downloadable as a string of .exe
packages. For models that have rolled out of support the binaries remain mirrored and
discoverable. The same site indexes tens of thousands of binaries that, taken
together, are a near-comprehensive record of how Lenovo built each board.
Inside each BIOS package is the part that matters: the device tree the OS sees
(ACPI), the firmware volumes (PEI, DXE, SMM), the embedded controller firmware, the
Intel FSP remnants, the microcode files, the VBT, the flash descriptor on some
models. Everything a coreboot porter needs to bring a new board up; everything a
security auditor needs to map a CVE to a binary fix; everything a Hackintosh ports
project needs to drive a sensible config.plist. The bits are public.
They have been public the whole time.
What is not public is a reproducible pipeline that says: "give me a model number and a BIOS version, and you get back structured JSON describing every device, every GPIO it touches by name, every firmware-internal change since the previous revision, every CVE the changelog mentions, every privacy-sensitive GPIO and its lock posture, every SMI-routable input, every UEFI module that grew or shrank between the last two updates, every signal of where a fix landed in the binary." That is the gap this project sets out to close.
The output of the pipeline is not a beauty-contest reverse engineering of one board. It is a coverage statement: across the archive, the toolchain produces structured data on every model whose firmware can be carved — and as the archive grows, the model count grows with it. Bug-for-bug, the per-model output may still need a human pass before it can drive a real port; but the cost of preparing that human pass drops by an order of magnitude when you start from a clean auto-generated baseline.
ACPI / DSDT, Intel's GPIO architecture, and the GpioLib indirection — just enough to read the rest.
The Differentiated System Description Table (DSDT) is the central ACPI table. Firmware compiles it from ASL (ACPI Source Language) into AML (ACPI Machine Language) at build time, and the OS interprets the AML at runtime to discover devices, power states, GPIO connections, EC commands, and thermal zones. Every ACPI-capable OS (Linux, Windows, macOS, the BSDs) reads the same DSDT. The DSDT is therefore the most reliable single description of what hardware is present on the board: more reliable than PCI enumeration (because it includes non-PCI devices like the EC, GPIO-attached buttons, and sensors), and more reliable than the SMBIOS/DMI tables (which mostly carry identification strings).
Secondary System Description Tables (SSDTs) compile separately and load at runtime to extend or override pieces of the DSDT. Boards with multiple variants typically ship a generic DSDT plus per-variant SSDTs that patch in the right chunks; AcpiPlatform decides which SSDTs to load based on hardware identifiers it reads at boot. A modern ThinkPad commonly ships ten SSDTs alongside the DSDT.
Every device with a hardware GPIO connection has a _CRS (Current
Resource Settings) method that returns a serialized list of resource descriptors. For
GPIO that descriptor is either a GpioIo (the device can drive or read
the pin) or a GpioInt (the pin is wired to the device as an interrupt
source). Each carries a controller path (e.g. \_SB.PCI0.GPI0), a pin
number, an edge/level/polarity selector, and a pull configuration.
On AMD and Qualcomm ThinkPads, the pin number in GpioIo/GpioInt
is a direct integer that maps straight onto an AGPIO or GPIO pad. On Intel it is
often a method call — and that is where the trouble starts.
Each Intel PCH (from Skylake / Sunrise Point onwards) carries one or more
GPIO controllers (community controllers), each owning a number of
groups (banks of about 24 pins each). Pins inside a group are referred to as
pads: GPP_A22 means "General Purpose Programmable, group A, pad 22".
Each pad has two 32-bit configuration registers stored in fixed PCH-memory-mapped
space: PADCFG_DW0 and PADCFG_DW1.
PADCFG_DW0 (per pad, 32 bits)
31 PADRSTCFG[1] (pad reset config, hi bit)
30 RXEVCFG[1] (edge / level select, hi bit)
29 RXRAW1 (force raw 1 on RX)
28 RXEVCFG[0]
27 PREGFRXSEL (route raw into glitch filter)
26 RXINV (invert RX)
25 GPIROUTIOXAPIC (route to IOxAPIC)
24 GPIROUTSCI (route to SCI)
23 GPIROUTSMI (route to SMI)
22 GPIROUTNMI (route to NMI)
21..20 PMODE (pad mode: 0 = GPIO, 1..3 = native funcs)
19..18 RXTXENCFG
17 RXDIS (RX disable)
16 TXDIS (TX disable)
15..8 reserved
7..1 reserved
0 GPIORXSTATE / GPIOTXSTATE
PADCFG_DW1
31..16 TERM (termination / pull strength)
...
PADCFG[lock] is held in a separate per-group LOCK register, NOT in DW0/DW1.
The decode of every field above comes from Intel 100-Series PCH Datasheet
Volume 2 (document 332691), and from the per-generation datasheets that follow
it. Pinning the decode to the official document matters: intelp2m in
coreboot is downstream and occasionally lags the register layout, and the security
report needs an authoritative source for what "locked" means before it can call a
pad genuinely-safe-versus-attacker-controllable.
Inside the DSDT, an Intel ThinkPad does not bake the pad name into the
_CRS. It writes something like:
// DSDT excerpt (lightly redacted for readability)
Device (FPNT) {
Name (_HID, "VFS5011")
Method (_CRS, 0, NotSerialized) {
Return (ResourceTemplate () {
SpiSerialBusV2 (..., 0x00000000, ...) { ... }
GpioInt (Level, ActiveLow, Shared, PullDefault, 0,
"\\_SB.PCI0.GPI0", 0, ResourceConsumer)
{ GNUM(GFPI) }
})
}
}
GNUM is a method elsewhere in the DSDT that takes a GNVS field name
(here GFPI, the fingerprint-interrupt field) and returns an integer.
GNVS is the Global NVS region — an ACPI-NVS memory range
whose layout is declared by the firmware and whose values are populated at boot. It
is RAM, not flash; nothing in the firmware image carries the runtime
contents of GNVS directly.
The values written into GNVS come from a UEFI driver called AcpiPlatform
that runs in DXE. AcpiPlatform reads the board identity from hardware
(typically EC straps or an EEPROM), picks the right pad set for the current board
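variant, and stores each pad as an Intel GpioLib GPIO_PAD immediate at
the GNVS offset corresponding to its field name. The encoded form of that
immediate is 0xCCGGNNNN: a chipset identifier in the top byte (CC), a group
index in the next byte (GG), and the pad index within the group in the low
16 bits (NNNN).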
So the static information needed to resolve a single device to a named pad is
distributed across three pieces of the firmware: the DSDT (which GNVS field
feeds which device), AcpiPlatform (which GPIO_PAD immediate gets
written into that field), and PlatformInit's PADCFG table (which mode,
direction and lock state apply to that pad). Recovering the answer means joining all
three.
What is actually on those Lenovo download pages, organized for the resolver.
Every supported ThinkPad model on Lenovo's support site has a "Drivers &
Software" page with a downloads tree organized by category (BIOS, Audio, Chipset,
LAN, etc.). A BIOS update is distributed as an .exe file
— an InnoSetup-packaged WinFlash installer of roughly 6–12 MB,
occasionally larger when carrying microcode or ME firmware updates. The same site
indexes the installer under a stable URL of the form:
by_mt/<MachineType>/drivers/<DocId>/files/<id>w.exe
The MachineType is the 4-character Lenovo MT (e.g. 20Q5);
the DocId is the Lenovo support article id; the id
encodes the BIOS revision (e.g. r0buj26ww = T-series, build 26ww). The
naming convention is stable enough that an aggressive scraper can mirror every BIOS
package for every published model with a few thousand HTTP requests, and a
surprising amount of metadata (release date, supported OS, change summary) sits on
the article pages adjacent to each download.
Inside a BIOS .exe the InnoSetup payload contains:
- The flash utility (WinFlash.exe or a vendor variant).
- The firmware image: .fl1, .fl2, .cap (signed capsule), or rarely .rom. The
  .fl1 is the raw UEFI flash image, which is what the SPI chip would hold
  (sans descriptor on some models).
- A readme (readme.txt or similar) with the change-log: BIOS, EC, ME version
  bumps, plus a Security Updates block listing the CVEs and Lenovo advisories
  that the new version addresses.
- .PAT files containing microcode updates, named by CPUID (e.g. BDFA8000.PAT).
Lenovo ships driver bundles — audio, GPU, wifi, fingerprint —
using the same <id>w.exe naming convention as BIOS updates. They
are not BIOS updates; they don't carry an .fl1. A naive coverage
harness that tries to acpi_extract every <id>w.exe
in a model's download tree will fail noisily on every driver bundle, which is most
of them.
The cheap fix is to look at the payload first, without extracting it. Each
.exe can be probed with innoextract --list (or with 7-zip
in list mode for non-InnoSetup wrappers), and the listing alone tells you whether a
.FL1/.FL2/.CAP file is present. Packages that pass the check are
extracted normally; packages that fail are classified as a non-BIOS payload and
reported separately, so that coverage numbers reflect what was actually attempted.
The cases a size-only test gets wrong are packages whose .fl1 is much smaller
than a full BIOS image, GPU VBIOS hotfixes (occasionally large enough
to trip a size-only test), and combo packages that wrap a BIOS + driver bundle in
one installer. The listing-based check handles all three correctly because it
looks at the actual payload structure.
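The listing-based check can be sketched in a few lines. This is a minimal sketch, assuming the listing has already been captured as text (for example from `innoextract --list`); the function name `classify_listing` and the exact extension set are illustrative and are not taken from the toolkit.

```python
import re

# Extensions that mark a BIOS-class payload, per the description above.
BIOS_EXTENSIONS = (".fl1", ".fl2", ".cap")


def classify_listing(listing: str) -> str:
    """Return 'bios' if any listed file carries a firmware-image extension."""
    for line in listing.splitlines():
        # Listing lines may carry quotes and sizes; pull out path-like tokens.
        for token in re.findall(r'[^\s"]+', line):
            if token.lower().endswith(BIOS_EXTENSIONS):
                return "bios"
    return "non-bios"
```

Because the decision depends only on file names in the listing, a small firmware image and a large driver bundle are both classified by what they contain rather than by how big they are.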
From .exe to ACPI tables
Each step is reversible and well-understood; the trouble is in handling all of them at once.
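Once a BIOS-class package has been identified, the work is to walk it down to ACPI tables. The path looks like this: unpack the installer, take the firmware image (.fl1 or capsule) out of the payload, extract its firmware volumes, locate the ACPI table storage file, write its sections out as .aml tables, and decompile those to ASL.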
No part of that path is novel in isolation. The interesting work is in the cases where the path breaks — and in the diversity of those cases across the archive.
uefiextract is the primary FV extractor
The Lenovo archive spans Phoenix, AMI and Insyde flavored BIOSes, sometimes with
FFSv1, FFSv2 and FFSv3 files within the same image; with LZMA, Tiano-EFI compression,
LZMA-F86 variants; with Phoenix's wrapper-FV layout where the inner FVs are
themselves compressed inside an outer FV. UEFITool's uefiextract is the
union of all of those handlers in one tool. Promoting it from "fallback when
uefi-firmware-parser fails" to primary turned a 7-of-12 extraction rate on the
diverse sample into 12 of 12, with uefi-firmware-parser kept as
a fallback (it sometimes produces a cleaner ACPI carve on known-good Intel images),
and a nested-FFSv2 carve handling the AMD / Phoenix wrapper-FV layout as a last
resort.
UEFI Platform Init defines a specific FFS file type and GUID for the firmware's
ACPI table store: EFI_ACPI_TABLE_STORAGE_FILE_GUID =
7E374E25-8E01-4FEE-87F2-390C23C606CD. Every ACPI-capable UEFI BIOS
carries the compiled DSDT, SSDTs and supporting tables inside a single FFS file of
that GUID, with each table as a raw section. Dedup-by-hash is necessary because some
BIOSes carry a duplicate copy of the AcpiTableStorage FFS as a fallback (for example,
when a recovery FV is allowed to override the main one).
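Locating the table store and deduplicating its contents can be illustrated as follows. This is a sketch under stated assumptions: it searches an already-decompressed image for the raw GUID bytes instead of walking the FFS structure, and the helper names `find_guid_offsets` and `dedup_tables` are hypothetical. The one detail worth showing is that UEFI stores GUIDs in mixed-endian form, which Python exposes as `uuid.UUID.bytes_le`.

```python
import hashlib
import uuid

# GUID quoted in the text; on disk the first three fields are little-endian.
ACPI_TABLE_STORAGE = uuid.UUID("7E374E25-8E01-4FEE-87F2-390C23C606CD")


def find_guid_offsets(image: bytes, guid: uuid.UUID = ACPI_TABLE_STORAGE) -> list[int]:
    """Return every byte offset at which the on-disk GUID encoding occurs."""
    needle = guid.bytes_le
    offsets, start = [], 0
    while (pos := image.find(needle, start)) != -1:
        offsets.append(pos)
        start = pos + 1
    return offsets


def dedup_tables(tables: list[bytes]) -> list[bytes]:
    """Drop byte-identical tables, keeping first-seen order."""
    seen, unique = set(), []
    for table in tables:
        digest = hashlib.sha256(table).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(table)
    return unique
```

Two hits from `find_guid_offsets` on one image are the duplicate-store case described above; `dedup_tables` then collapses the identical copies.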
The output of this step is a directory of .aml files plus the ASL
decompilation; nothing else in the pipeline needs to touch the raw firmware image,
and downstream tools work entirely off the ASL plus the per-FV file tree
that uefiextract left behind.
Why the DSDT, on its own, is necessary but not sufficient on Intel.
On AMD ThinkPads (and on the small number of Qualcomm-based ones) the DSDT is the
whole answer. A GpioInt resource for the fingerprint reader carries the
AGPIO pin number directly:
// AMD-style: pin number is a literal
GpioInt (Edge, ActiveLow, Exclusive, PullUp, ...,
"\\_SB.GPIO", 0, ResourceConsumer) { 0x002B } // AGPIO 43
On Intel ThinkPads, the same descriptor passes through GNUM(FIELDNAME):
// Intel-style: pin number is a method call
GpioInt (Level, ActiveLow, Shared, PullDefault, ...,
"\\_SB.PCI0.GPI0", 0, ResourceConsumer) { GNUM(GFPI) }
GNUM reads a field out of GNVS (the ACPI-NVS region) and decodes it
into a pin number relative to the named controller. The decoded value is an Intel
GpioLib GPIO_PAD immediate that AcpiPlatform wrote at
boot. Until you find that write, the DSDT alone tells you only that the fingerprint
interrupt is "the pin GNVS field GFPI says it is" — which is not
a pin number.
The static information required to recover the answer lives in three places:
- The DSDT: which device's _CRS references which GNVS
  field. From this alone you get the (field, device) join, e.g.
  field GFPI → device FPNT.
- AcpiPlatform: which value gets written into each GNVS field
  (a GPIO_PAD immediate). The same field is sometimes written
  multiple times under different conditional branches (board-variant switches),
  which is the hard half of the problem.
- PlatformInit's PADCFG table: which mode, direction and lock state apply
  to each pad.
Joining the three sources is mechanically straightforward but practically fiddly.
AcpiPlatform is compiled x86 (or x86-64) code; the writes to GNVS are
typically mov [GnvsBase + offset], imm32 sequences, sometimes
preceded by a conditional that selects between multiple imm32
candidates. gpio_resolve.py walks the disassembly, collects every
candidate immediate per field, and emits them with their guard conditions. The
PADCFG cross-check then resolves the candidate set to a single pad per device.
For board-invariant pins — a fingerprint sensor wired the same way across every
variant of a model — the candidate set is a single immediate and the
resolution is exact. For board-variant pins — a GNSS or BT antenna routed
differently depending on factory-installed radio — the candidate set carries
several, and only the live board can tell you which one is yours. The toolkit emits
the per-variant table directly so that follow-up work (or a collect_gpio.sh
capture on hardware) can resolve it.
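The candidate-collection step can be approximated with a byte-pattern scan. This is a deliberately simplified sketch and not how gpio_resolve.py is described above (that tool walks a real disassembly and records guard conditions). It assumes the GNVS base is held in rdi and that each store uses a 32-bit displacement, in which case `mov dword [rdi + disp32], imm32` encodes as `C7 87 <disp32> <imm32>`. The function name and the `field_offsets` argument are hypothetical.

```python
import struct
from collections import defaultdict

# x86-64 encoding of `mov dword [rdi + disp32], imm32`: opcode C7 /0, ModRM 0x87.
MOV_RDI_DISP32_IMM32 = b"\xC7\x87"


def collect_candidates(code: bytes, field_offsets: dict[int, str]) -> dict[str, list[int]]:
    """Map each known GNVS field to the distinct immediates stored at its offset."""
    candidates = defaultdict(list)
    pos = code.find(MOV_RDI_DISP32_IMM32)
    while pos != -1 and pos + 10 <= len(code):
        disp, imm = struct.unpack_from("<II", code, pos + 2)
        name = field_offsets.get(disp)
        # Only stores that land on a declared GNVS field offset are kept.
        if name is not None and imm not in candidates[name]:
            candidates[name].append(imm)
        pos = code.find(MOV_RDI_DISP32_IMM32, pos + 1)
    return dict(candidates)
```

A field with one collected immediate is the board-invariant case; a field with several is the board-variant case and needs the elimination pass or a live capture.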
End-to-end resolution on a real BIOS, from the carved DSDT to the named GPP pad.
The ThinkPad 13 (1st gen, Skylake-U) is a useful walkthrough target: it is single-SoC
(Sunrise Point-LP, INT344B), its DSDT is small enough to read end-to-end, and its
fingerprint reader is board-invariant. The BIOS used here is
r0buj26ww, the November 2022 update; the input was an
8.1 MB .exe downloaded from Lenovo Support.
Carve ACPI tables, decompile, grep for the fingerprint device. The relevant fragment looks like:
Device (FPNT) {
Name (_HID, "VFS5011")
Method (_STA, 0, NotSerialized) { Return (0x0F) }
Method (_CRS, 0, NotSerialized) {
Return (ResourceTemplate () {
SpiSerialBusV2 (0x0001, PolarityLow, FourWireMode, 8,
ControllerInitiated, 0x007A1200, ClockPolarityLow,
ClockPhaseFirst, "\\_SB.PCI0.SPI1",
0, ResourceConsumer, , Exclusive, )
GpioInt (Level, ActiveLow, Shared, PullDefault, 0,
"\\_SB.PCI0.GPI0", 0, ResourceConsumer, , )
{ GNUM(GFPI) }
})
}
}
Two GPIOs touch this device: an SPI chip-select that is implicit in the
SpiSerialBusV2 resource (pin owned by the SPI controller, not declared
here), and the interrupt line that arrives as GpioInt with pin number
GNUM(GFPI). The chip-select pin is declared elsewhere; the interrupt
pin's identity depends on the GNVS field GFPI.
AcpiPlatform writes to GNVS
AcpiPlatform.efi sits in the firmware volume tree under its own GUID.
The disassembly contains, near the end of InstallAcpiPlatform, a
sequence of stores to the GNVS region:
; ... earlier code computes GnvsBase into rdi
mov dword [rdi + 0x10A], 0x01000016 ; GFPI = SPT-LP, group A, pad 0x16 (22)
mov dword [rdi + 0x10E], 0x01000017 ; GFPS = SPT-LP, group A, pad 0x17 (23, same group)
mov dword [rdi + 0x112], 0x01000040 ; GPLI = SPT-LP, group A, pad 0x40 (64)
...
The offsets 0x10A, 0x10E, 0x112 correspond to
the GNVS field declarations the DSDT emits in its OperationRegion (GNVS, ...)
Field blocks. Joining offsets to field names yields:
GFPI -> 0x01000016
GFPS -> 0x01000017
GPLI -> 0x01000040
Decoding 0x01000016 as
CC=0x01, GG=0x00, NNNN=0x0016 gives chipset SPT-LP, group A, pad index
22 — or GPP_A22 in coreboot nomenclature. The interrupt sibling,
GFPS = 0x01000017, is GPP_A23.
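The CC/GG/NNNN split used in this decode is plain bit slicing. The sketch below is illustrative: the only chipset identifier it knows is the 0x01 → SPT-LP mapping taken from this walkthrough, and it assumes that group index 0 is group A, 1 is group B, and so on. The names `GpioPad` and `decode_gpio_pad` are hypothetical.

```python
from typing import NamedTuple

# Only 0x01 -> SPT-LP comes from the walkthrough; other IDs are left unnamed.
CHIPSETS = {0x01: "SPT-LP"}


class GpioPad(NamedTuple):
    chipset: str
    group: str
    pad: int

    @property
    def name(self) -> str:
        return f"GPP_{self.group}{self.pad}"


def decode_gpio_pad(value: int) -> GpioPad:
    """Split a 0xCCGGNNNN immediate into chipset, group letter and pad index."""
    chipset_id = (value >> 24) & 0xFF
    group_index = (value >> 16) & 0xFF
    pad = value & 0xFFFF
    # Assumption: group indices map to letters in order (0 = A, 1 = B, ...).
    group = chr(ord("A") + group_index)
    return GpioPad(CHIPSETS.get(chipset_id, f"unknown-0x{chipset_id:02X}"), group, pad)
```

With this, `decode_gpio_pad(0x01000016).name` gives `GPP_A22`, matching the hand decode above.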
PlatformInit's PADCFG table is a sequence of 12-byte triples,
{ GpioPad, PADCFG_DW0, PADCFG_DW1 }. Pulling out the entry for
GPP_A22:
GPP_A22:
PADCFG_DW0 = 0x44000300
PADCFG_DW1 = 0x00003000
Decoded:
PMODE = 1 (native function 1: SPI1_CS#)
RXDIS = 0 TXDIS = 0
GPIROUT* = none
TERM = NoPullPad
LOCKED = no (held in group LOCK register; checked separately)
The native function is SPI1_CS, which matches the device's role as the chip-select of an SPI fingerprint reader; the role check passes. The interrupt sibling GPP_A23 is configured as a GPIO input with IOxAPIC routing, which matches its role as the fingerprint interrupt line.
Resolution exact (board-invariant, no candidate ambiguity); PADCFG mode and direction
consistent with role; cross-verified against the DSDT
SpiSerialBusV2+GpioInt pair.
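The PADCFG lookup used in this walkthrough relies on the 12-byte { GpioPad, PADCFG_DW0, PADCFG_DW1 } layout. A minimal parser is sketched below, assuming the table has already been located in PlatformInit, that all three fields are little-endian 32-bit words, and that the GpioPad key uses the same immediate encoding as the GNVS values. The function name is hypothetical.

```python
import struct


def parse_padcfg_table(blob: bytes) -> dict[int, tuple[int, int]]:
    """Parse consecutive little-endian { GpioPad, DW0, DW1 } 12-byte entries."""
    usable = len(blob) - len(blob) % 12  # ignore a trailing partial entry
    table = {}
    for pad, dw0, dw1 in struct.iter_unpack("<III", blob[:usable]):
        table[pad] = (dw0, dw1)
    return table
```

Keying the result by the raw GpioPad value means the immediates recovered from AcpiPlatform can be looked up directly, with no name translation in between.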
Why the toolkit reads Intel 332691 rather than chasing community pad-config tables.
intelp2m (in coreboot) is the most popular community tool for decoding
PADCFG into PAD_CFG_* macros. It is good, well-maintained, and
sufficient for most coreboot work. It is also downstream — lagging the
Intel datasheets by some number of months whenever a new PCH generation ships, and
occasionally simplifying the decode (collapsing rare flag combinations to the
closest-fit macro). For a coreboot port that is fine. For a security report that
needs to be precise about what "locked" means and what the interrupt routing
actually targets, it is not.
The toolkit decodes PADCFG directly from Intel 100-Series PCH Datasheet Volume 2 (332691, the Volume 2 that covers the GPIO controller architecture), with per-SoC supplements for Cannon Lake (332687), Tiger Lake (633331), Alder Lake (645549), and the Wildcat/Sunrise Point/Kaby Lake siblings where they diverge. The decode is structured per-field rather than pattern-matched-to-macros, so when the security check needs to ask "is bit 23 set on this PADCFG, and is the group LOCK register for group A also clear?" it can do that directly.
| Field | Bits | What it tells you |
|---|---|---|
| GPIROUTSMI | DW0[23] | If set, this pin can trigger an SMI when its edge condition fires. Combined with RXEVCFG, this is the SMI attack surface for the pad. |
| GPIROUTNMI | DW0[22] | NMI routing. Less common; when present, often used for chassis-intrusion or watchdog inputs. |
| GPIROUTSCI | DW0[24] | SCI (System Control Interrupt) routing. Used for wake events, lid switches, hot-plug detection. |
| GPIROUTIOXAPIC | DW0[25] | IOxAPIC routing: this pin shows up as a normal device interrupt to the OS. |
| PMODE | DW0[21:20] | 0 = GPIO, 1..3 = native functions. SPI, I2C, UART pins live here at native modes. |
| RXDIS / TXDIS | DW0[17:16] | Input/output disable. An RXDIS=1 pad cannot be sampled by software no matter the mode. |
| Group LOCK / LOCKTX | per-group register | Once set, prevents further writes to PADCFG (or its TX state) until reset. |
The combination of GPIROUTSMI=1 with the group LOCK clear is the canonical "real attack surface" signal: the pin will fire an SMI on its configured edge, and the firmware did not lock the configuration, so privileged software (or a malicious DXE driver in a future boot) could rewrite the edge condition and re-route it. The GPIO security report sorts pads by exactly this combination, with privacy device GPIOs (camera kill, mic mute, fingerprint power, TPM provisioning) called out separately because they have their own spoofability story regardless of SMI routing.
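The sorting rule described here can be written down compactly once the fields are decoded. The sketch below works on already-decoded values, so it does not depend on any bit positions; the `PadState` record, the class labels and the ordering of the two middle classes are assumptions made for illustration, not the report's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PadState:
    name: str
    smi_routed: bool              # decoded GPIROUTSMI
    rx_disabled: bool             # decoded RXDIS
    group_locked: Optional[bool]  # None when no live LOCK capture is available


# Most concerning class first.
RISK_ORDER = ["smi-input-unlocked", "smi-input-lock-unknown",
              "smi-input-locked", "not-smi-input"]


def risk_class(pad: PadState) -> str:
    """Classify one pad by SMI routing, input enable and group lock state."""
    if not pad.smi_routed or pad.rx_disabled:
        return "not-smi-input"
    if pad.group_locked is None:
        return "smi-input-lock-unknown"
    return "smi-input-locked" if pad.group_locked else "smi-input-unlocked"


def sort_by_risk(pads: list[PadState]) -> list[PadState]:
    """Order pads so unlocked SMI-routed inputs come first."""
    return sorted(pads, key=lambda p: RISK_ORDER.index(risk_class(p)))
```

Keeping "lock unknown" as its own class separates what the firmware image alone can establish from what needs a live capture.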
Solving for board-variant pins without the live board, by ruling out everything else.
When AcpiPlatform writes multiple candidate immediates into the same
GNVS field, the writes are guarded by a switch on a board identifier read at boot.
The disassembly looks like:
cmp al, 0x01 ; board_info->variant == 1?
jne .v2
mov dword [rdi + 0x118], 0x01000045 ; GBTI = GPP_A69 on variant 1
jmp .end
.v2:
cmp al, 0x02
jne .v3
mov dword [rdi + 0x118], 0x01000047 ; GBTI = GPP_A71 on variant 2
jmp .end
.v3:
mov dword [rdi + 0x118], 0x0100008C ; GBTI = GPP_C12 on variant 3
.end:
Without the live board's variant byte, all three candidates remain. The static
elimination trick is to look at the PADCFG table for each candidate and check which
ones could possibly serve the device's role. If GPP_A71 is configured
as native function 2 in the PADCFG table (let's say I2C2_SDA), it cannot also be
the Bluetooth host-wake input that GBTI is feeding — it is
reserved for I2C. Variant 2 is eliminated. Whenever a candidate fails the role
check the set shrinks, sometimes to one and sometimes only to two; it is never
larger than the union of all variants.
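The elimination step reduces to a filter over the per-variant candidates. The sketch below assumes the device's role requires the pad to be in GPIO mode (PMODE 0), as with the host-wake input in the example, and it treats a pad missing from the PADCFG table as "cannot be ruled out". The function and argument names are hypothetical.

```python
def eliminate(candidates: dict[str, str], pmode: dict[str, int]) -> dict[str, str]:
    """Keep variants whose candidate pad is in GPIO mode or has unknown mode."""
    return {
        variant: pad
        for variant, pad in candidates.items()
        # A pad in a native function (PMODE 1..3) cannot serve a GPIO role.
        if pmode.get(pad) in (None, 0)
    }
```

For the three-variant example above, a native-mode GPP_A71 removes variant 2 and leaves variants 1 and 3 as the surviving candidates.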
The same trick narrows board ID itself when the per-variant tables are
disjoint: if exactly one variant's pad set is consistent with the live PADCFG,
that's the variant. gpio_resolve emits per-variant candidate sets and
a confidence per variant, and the report at the top-level says "this device is
GPP_A69 on board variant 1, GPP_C12 on variant 3, and undefined on variant 2".
That is the firmware's contribution; the rest needs the live board.
The same pipeline, three quite different GPIO models.
The vendor of the GPIO controller in the DSDT is the dispatch key. The toolkit's
vendor.py looks at the _HID of the GPIO controller device
and picks the resolution path; the rest of the pipeline runs identically regardless
of vendor, except that the Intel-only resolver is skipped on AMD and Qualcomm
images (their DSDTs already contain literal pad numbers).
| Vendor | Controller _HID | GPIO model | Resolution path | Pad naming | Status |
|---|---|---|---|---|---|
| Intel | INT34xx / INT33FF | GNVS-indirected, board-variant gated | full pipeline: AcpiPlatform immediates + PADCFG native-mode elimination | GPP_<bank><n> | 21/21 on TP13 Skylake |
| AMD | AMD0030 / AMDI0030 | AGPIO index direct in DSDT (no GNVS, no switch) | gpio_report.py alone resolves it | AGPIO<n> | 18/18 on A275 |
| Qualcomm | QCOM… | direct in DSDT (like AMD) | gpio_report.py | GPIO<n> | stub; validate on X13s sample |
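The dispatch on the controller's _HID can be sketched as a small pattern table. This is an illustration and not vendor.py itself: the patterns are transcribed from the _HID column above, Qualcomm is matched by prefix only because the table elides the rest of the identifier, and the function name is hypothetical.

```python
import re
from typing import Optional

# Patterns follow the _HID column of the table; QCOM is a prefix match because
# the full identifier is not given there.
VENDOR_PATTERNS = [
    ("intel", re.compile(r"^(INT34[0-9A-F]{2}|INT33FF)$")),
    ("amd", re.compile(r"^(AMD0030|AMDI0030)$")),
    ("qualcomm", re.compile(r"^QCOM")),
]


def vendor_for_hid(hid: str) -> Optional[str]:
    """Return the vendor key for a GPIO controller _HID, or None if unknown."""
    for vendor, pattern in VENDOR_PATTERNS:
        if pattern.search(hid):
            return vendor
    return None


def needs_gnvs_resolver(hid: str) -> bool:
    """Only Intel images need the GNVS indirection resolved."""
    return vendor_for_hid(hid) == "intel"
```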
acpi_extract.py handles both the Intel layout (a single FV with a
standard AcpiTableStorage FFS at the top level) and the AMD / Phoenix
nested-FFSv2 layout (a wrapper FV holding compressed inner FVs, with the
real AcpiTableStorage one level deeper). The nested case took several iterations to
get right because Phoenix's compression markers vary across versions; the toolkit
handles the three variants observed so far in the archive.
An optional AMD extension decodes the FCH (Fusion Controller Hub) GPIO control register per pin from the AMD Platform Programming Reference, which gives full pad-config documentation comparable to the Intel PADCFG decode. It is not needed for device→pad resolution (because the DSDT already carries the pin number on AMD), but it is needed for the security report's pad-lock analysis on AMD boards.
A second CPU, hidden in plain sight, with its own firmware and its own attack surface.
Almost every ThinkPad ships with an embedded controller (EC), a small 8-bit microcontroller on the LPC (or eSPI) bus that handles power button press, battery gas-gauging, fan PWM, charge control, lid switch, keyboard hotkeys, thermal sensor readout, and the keyboard backlight. The EC's firmware is independent of the BIOS proper and runs continuously from S5 onwards; the host CPU talks to it through a small region of LPC-mapped RAM (the EC-RAM) and a command/status register pair.
On ThinkPads, the EC is overwhelmingly an ITE part — an 8051-family MCU
(MCS-51 instruction set), executing a vendor firmware image of roughly 64–128 KB.
On the ThinkPad 13 Skylake the EC is an ITE 8051 v14.4, ~111 KB,
reset vector at 0x0070, carved cleanly from a known region of the
.fl1. AMD ThinkPads sometimes ship a Renesas part instead, with a
different instruction set; the toolkit's MCS-51 disassembler does not handle those
yet.
Working from the disassembly, the EC firmware breaks down into:
- The EC-RAM fields exposed to ACPI
  (OperationRegion(ECRA, EmbeddedControl, 0, 0xFF) and the fields
  beneath it). The EC keeps these in sync with the underlying hardware: thermal
  sensor readings, fan tachometers, battery state-of-charge, lid switch state.
- _Qxx query handlers, one per "EC event" reported back to the host
  (lid open/close, AC plug, hotkey press). When the EC sets the SCI line, the
  host runs the matching _Qxx method from the DSDT, and that method
  usually reads back an event code byte from EC-RAM.
The 8051 uses a separate address space for its Special Function Registers
(direct addresses 0x80–0xFF). The base SFR map is standardised; ITE adds its own
custom SFRs on top, in roughly the same range, for the host-interface registers
(EC-RAM, command/status, SMI control) and the GPIO ports. The toolkit's MCS-51
disassembler annotates both: a mov A, P3 becomes
mov A, P3 ; GPIO port 3; a mov A, 0x9E becomes
mov A, EC_CMD ; ITE host command reg. With those annotations, finding
every host-interface site in the firmware is a grep instead of a multi-day reverse.
Coverage on the MCS-51 instruction set is ~99% with named annotations on most
Lenovo/ITE SFR ranges.
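The annotation idea can be shown on a single instruction form. The sketch below handles only `MOV A, direct` (opcode 0xE5 followed by one direct-address byte); the standard SFR names are the usual MCS-51 ones, while the vendor table holds just the 0x9E → EC_CMD example quoted above. The function name is hypothetical and the real disassembler covers far more of the instruction set.

```python
# Standard MCS-51 SFRs (a subset), each with a short note.
STANDARD_SFRS = {
    0x80: ("P0", "GPIO port 0"), 0x90: ("P1", "GPIO port 1"),
    0xA0: ("P2", "GPIO port 2"), 0xB0: ("P3", "GPIO port 3"),
    0x81: ("SP", "stack pointer"), 0x82: ("DPL", "data pointer low"),
    0x83: ("DPH", "data pointer high"), 0xD0: ("PSW", "program status word"),
    0xE0: ("ACC", "accumulator"), 0xF0: ("B", "B register"),
}
# Vendor SFRs: only this one entry is taken from the text.
VENDOR_SFRS = {0x9E: ("EC_CMD", "ITE host command reg")}


def annotate_mov_a_direct(code: bytes, offset: int = 0) -> str:
    """Render `MOV A, direct` at `offset`, naming the SFR when it is known."""
    if code[offset] != 0xE5:
        raise ValueError("not a MOV A, direct instruction")
    addr = code[offset + 1]
    # Vendor names take precedence over the standard map.
    entry = VENDOR_SFRS.get(addr) or STANDARD_SFRS.get(addr)
    if entry is None:
        return f"mov A, 0x{addr:02X}"
    name, note = entry
    return f"mov A, {name} ; {note}"
```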
Two distinct vulnerability classes, each derivable from the same per-pad PADCFG data.
A pin configured as input with GPIROUTSMI=1 will trigger an SMI
when its configured edge condition (rising, falling, level high, level low) is
met. SMIs vector into SMM, which runs at the highest CPU privilege level and is
therefore an interesting target: a primitive that lets an attacker influence the
flow into SMM is the first step in many BIOS-level privilege escalations.
The first-order filter is "input + SMI routed". Not every such pin is reachable by a software attacker — some live on internal traces only, some are physically inaccessible to a non-root user, some are configured as level-high with an external pull that prevents firing in practice. The report flags candidates and leaves the physical-access part for a human pass; that is the right scope for a static pipeline.
The much more interesting signal is the lock state. The Intel PCH has a per-group
LOCK register that, once set, prevents further writes to PADCFG until reset.
Firmware sets the LOCK on critical pads after configuring them, so that later
code (including a malicious DXE driver) cannot rewrite the routing. An
unlocked SMI-routable input is qualitatively different from a locked one: the
unlocked case lets a future code path re-aim the SMI to a chosen edge or invert
the polarity, both of which can convert a benign signal into an attacker
primitive. The toolkit cross-references PADCFG against the group LOCK register
state captured in a collect_gpio.sh run from a live ThinkPad and
reports unlocked + SMI-routable as the genuine risk class.
A second class of report findings is the privacy / security device GPIO: a pin that controls something a user trusts visually (a camera kill switch, a microphone mute state, a TPM "physical presence" line, a fingerprint reader power line) but that can be written to from software. The trust assumption behind a privacy LED is that a lit LED means the camera is on and a dark LED means the camera is off — that the indicator and the underlying device share fate. If the indicator is driven by a software-controlled GPIO, that fate-sharing is software-mediated, and software can lie about it.
Hard-wired privacy switches (a physical slide that opens/closes a circuit before the camera's power line) do not have this problem and are the right answer to it. Most ThinkPads with a "ThinkShutter" sliding cover are in this category. ThinkPads that use software mute or software camera-disable, with no hardware interlock, are in the soft-privacy category and the report flags them as such.
gpio_security.py identifies which devices in the DSDT carry a
software-controlled GPIO matching the privacy / kill-switch heuristic (camera,
mic, fingerprint, TPM provisioning). With a live capture, the report adds the
lock-state field. The result is a per-model posture summary: which privacy
indicators on this ThinkPad model are spoofable from privileged software.
From "the BIOS readme says CVE-2022-xxxxx is fixed" to "here is the UEFI module that grew".
Lenovo's BIOS readme files include a section that, on most models, lists the
security updates included in the new revision: CVE IDs, Lenovo advisory IDs
(LEN-12345), the subsystems touched (SMM, BootGuard, TXT, microcode,
TPM), and sometimes a short text description. The format drifts gently over the
years and across product lines, so the readme miner is intentionally schema-light:
it looks for anything that matches a CVE pattern (CVE-\d{4}-\d+), a
Lenovo advisory pattern (LEN-\d+), and the standard subsystem
keywords (SMM, SMI, TPM, microcode,
BootGuard, TXT, flash, secure boot),
and then groups them per BIOS revision.
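A schema-light miner of this kind is mostly three regular expressions. The sketch below uses the CVE and advisory patterns exactly as quoted above. The case handling is an assumption of this example: acronyms are matched case-sensitively so that a file name such as readme.txt does not count as a TXT mention, and the remaining keywords are matched case-insensitively. Grouping per BIOS revision is left to the caller.

```python
import re

CVE_RE = re.compile(r"CVE-\d{4}-\d+")
ADVISORY_RE = re.compile(r"LEN-\d+")
# Acronyms are case-sensitive; ordinary words are matched case-insensitively.
SUBSYSTEM_RE = re.compile(
    r"\b(?:SMM|SMI|TPM|TXT|BootGuard)\b|(?i:\b(?:microcode|flash|secure boot)\b)"
)


def mine_readme(text: str) -> dict[str, list[str]]:
    """Extract CVE IDs, Lenovo advisory IDs and subsystem keywords from one readme."""
    return {
        "cves": sorted(set(CVE_RE.findall(text))),
        "advisories": sorted(set(ADVISORY_RE.findall(text))),
        "subsystems": sorted({m.lower() for m in SUBSYSTEM_RE.findall(text)}),
    }
```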
Across the archive sample so far, the miner surfaces 15 unique CVEs, 20 advisories, across 18 models. Spectre, MDS and Foreshadow fixes appear repeatedly because they were mitigated incrementally across many microcode and SMM revisions, and the dataset gives a clean fleet-view of who was patched when. The output is a per-model inventory and a fleet CVE→model index that lets you ask "which models, on which BIOS versions, fixed CVE-2022-xxxxx?".
The natural next question is where in the binary the fix landed. The
module-diff tool answers it. Two BIOS images are run through
uefiextract, every File in the resulting tree is hashed by its body
contents, and the trees are diffed: which modules changed, which were added or
removed, and how much each one grew or shrank. A "module" here is a UEFI File
keyed by its UI Section name when available (e.g. FlashUtilitySmm,
TcgPei) and by its GUID otherwise.
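The diff itself is a set comparison over two name-to-body maps. The sketch below assumes each extracted tree has already been flattened into a dictionary keyed by UI name or GUID; the function name and the shape of the result are illustrative.

```python
import hashlib


def module_diff(old: dict[str, bytes], new: dict[str, bytes]) -> dict[str, object]:
    """Compare two {module name or GUID: file body} maps by content hash."""
    def digest(body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    # For modules present in both images, record the size delta when the body differs.
    changed = {
        name: len(new[name]) - len(old[name])
        for name in old.keys() & new.keys()
        if digest(old[name]) != digest(new[name])
    }
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": changed,  # name -> size delta in bytes
    }
```

A module whose body changed without changing size shows up in `changed` with a delta of 0, which is why the comparison is by hash and not by length.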
On the ThinkPad 13 transition r0buj24ww→r0buj26ww,
the module diff reports 66 of 546 modules changed, with a small number of
SMM-class modules growing noticeably:
FlashUtilitySmm (+1.1 KB), SystemSecureFlashSleepTrapSmm
(+0.4 KB), TcgPei (+0.7 KB). Each of those changes correlates well with
a security claim in the new BIOS readme: an SMI handler fix, a TPM self-test fix.
fw_secdiff.py is the auto-correlator. It takes the per-revision
readme miner output and the module diff, and for each readme security claim it
proposes the changed module(s) most likely to implement the fix — based on
name keywords (an SMI fix tends to land in a module with Smm or
SMI in the name), size delta (a fix usually grows the module), and
co-occurrence across multiple revisions (a module that consistently grows alongside
a particular subsystem's fixes is a strong candidate). The output is, per readme
claim, a ranked list of likely-fix modules with the per-module size delta and the
keyword match.
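A stripped-down version of the ranking is sketched below. It uses only two of the three signals described above (name keywords and size delta) and leaves out co-occurrence across revisions. The keyword table, including the mapping of TPM claims to names containing "tcg", and the rule that a keyword hit outranks any size delta are assumptions of this example, not fw_secdiff.py's actual scoring.

```python
# Hypothetical subsystem -> name-fragment table used for keyword matching.
KEYWORDS = {"smm": ("smm", "smi"), "tpm": ("tcg", "tpm")}


def rank_modules(subsystem: str, size_deltas: dict[str, int]) -> list[tuple[str, int]]:
    """Rank changed modules for one readme claim: keyword hits first, then growth."""
    words = KEYWORDS.get(subsystem.lower(), (subsystem.lower(),))

    def score(item: tuple[str, int]) -> tuple[bool, int]:
        name, delta = item
        hit = any(word in name.lower() for word in words)
        return (hit, delta)

    return sorted(size_deltas.items(), key=score, reverse=True)
```

The input is the `name -> size delta` map produced by a module diff, so the two steps chain without any intermediate format.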
TcgPei and the SMI modules
Readme claims a TPM self-test improvement → only TcgPei contains
Tcg in its name and grew across the diff. Readme claims an SMM
hardening → FlashUtilitySmm and
SystemSecureFlashSleepTrapSmm grew and match the keyword
Smm. Both attributions confirmed by hand-decompilation of the
changed PE32+ images.
How much of a coreboot board port can be generated from the firmware image alone.
A new coreboot board port traditionally starts with a working board, an inteltool
capture, and a few days of careful hand-translation: walk the PCI tree, transcribe
each device into devicetree.cb, decode the live GPIO PADCFG and
translate to PAD_CFG_* macros, write a flash descriptor map,
transcribe board straps. The toolkit's claim is that the archived BIOS, on its own,
contains enough of that information to skip the inteltool step entirely and produce
most of the boilerplate.
| coreboot artifact | Source in the OEM BIOS | Toolkit step | Confidence |
|---|---|---|---|
| devicetree.cb | DSDT PCI tree + ACPI device list | gen_devicetree.py | high |
| gpio.h (PAD_CFG_*) | PlatformInit PADCFG table | coreboot_gpio.py | high |
| board.fmd | Intel flash descriptor at offset 0x10 | flash_fmd.py | high |
| microcode files (.PAT) | BIOS package + carved 0x800-aligned blobs | blob_extract.py | high |
| VBT (iGPU) | FV section, known GUID | blob_extract.py | high |
| FSP binary | Integrated, often compressed, often partial | fsp_upd.py | low — use Intel reference FSP |
| Memory-down SPD | Verified absent in OEM BIOSes | mrc_spd.py | N/A — read live |
| Board strap mapping | EC firmware reads it at boot, not in image | — | requires hardware |
The Intel Firmware Support Package (FSP) is the binary blob that initializes memory, the CPU complex, and the PCH on Intel platforms. coreboot consumes it as a binary input; without it, a port cannot boot. The naive assumption is that the OEM BIOS embeds the FSP at a known offset, ready to be carved out.
The reality, validated across the sample, is more nuanced. OEM Lenovo BIOSes
integrate the FSP into their PEI flow rather than carrying it as a standalone
binary, and the standalone FspUpdRegion is frequently compressed
inside another FFS file. A clean carve is possible on some images and not others;
fsp_upd.py was tightened after confirming that the false-positive rate
on naive carving was high, and it now honestly reports "FSP UPD region not present"
when the integration is too aggressive to recover. For a real coreboot port, the
practical answer is to use Intel's reference FSP for the matching SoC.
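To illustrate why naive carving over-reports, here is a sketch of a signature scan with a few sanity checks on each candidate. It is not how fsp_upd.py works internally. The field offsets follow my reading of the FSP_INFO_HEADER layout in Intel's FSP External Architecture Specification and should be verified against the spec revision for the target SoC; the plausibility thresholds and the function name are assumptions.

```python
import struct

FSP_SIGNATURE = b"FSPH"  # FSP_INFO_HEADER signature


def find_fsp_headers(image: bytes):
    """Return plausible FSP_INFO_HEADER candidates found in an uncompressed image."""
    hits = []
    pos = image.find(FSP_SIGNATURE)
    while pos != -1:
        if pos + 32 <= len(image):
            # Assumed layout: HeaderLength at +4, ImageId at +16, ImageSize/ImageBase at +24.
            (header_len,) = struct.unpack_from("<I", image, pos + 4)
            image_id = image[pos + 16:pos + 24]
            image_size, image_base = struct.unpack_from("<II", image, pos + 24)
            plausible = (
                0x20 <= header_len <= 0x100
                and 0 < image_size <= len(image)
                and all(b == 0 or 0x20 <= b < 0x7F for b in image_id)
            )
            if plausible:
                hits.append({
                    "offset": pos,
                    "image_id": image_id.decode("ascii"),
                    "image_size": image_size,
                    "image_base": image_base,
                })
        pos = image.find(FSP_SIGNATURE, pos + 1)
    return hits


# Synthetic buffer: one well-formed header and one stray "FSPH" string.
header = (FSP_SIGNATURE + struct.pack("<I", 72) + b"\x00\x00" + bytes([0x22, 3])
          + struct.pack("<I", 1) + b"$TESTFS$" + struct.pack("<II", 0x1000, 0xFFF00000))
blob = bytearray(b"\xff" * 0x2000)
blob[0x100:0x100 + len(header)] = header
blob[0x800:0x804] = FSP_SIGNATURE
fsp_hits = find_fsp_headers(bytes(blob))
```

A scan like this sees nothing when the header sits inside a compressed FFS section, which matches the "not present" outcome described above.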
A coreboot port for a memory-down board (one with soldered DRAM, no DIMM slot) needs
a JEDEC SPD byte stream describing the DRAM's geometry, timing, and refresh
parameters. The natural place to look is the OEM BIOS, which already has the same
information — it must, in order to bring up memory. mrc_spd.py
is a structural probe that scans every flash payload for a candidate SPD byte
sequence and validates each candidate using JEDEC's CRC-16, with type and revision
filtering to reject coincidental hits. The CRC implementation was end-to-end
verified by reproducing the stored CRC of known-good coreboot DDR4 SPDs exactly.
Across 8 diverse OEM ThinkPad BIOSes, the probe found zero embedded
SPDs. Lenovo's MRC keeps memory-down configuration in proprietary PEI policy
structures rather than as a flash SPD image, so a memory-down coreboot port must
read the SPD from the live board with decode-dimms or use Intel
reference values for the matching DRAM part. The detector still produces a useful
geometry decode when a real SPD is present (early test images, third-party
BIOSes, coreboot snapshots in the same archive).
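The CRC check at the core of such a probe is small enough to show. The sketch below uses the JEDEC SPD CRC-16 (polynomial 0x1021, initial value 0, MSB first) and the DDR4 convention that bytes 126 and 127 hold the CRC of bytes 0 to 125, low byte first. The function names and the byte-by-byte scan are illustrative assumptions, not mrc_spd.py's actual interface, and the revision filter is omitted.

```python
DDR4_DEVICE_TYPE = 0x0C  # SPD byte 2 value for DDR4 SDRAM


def jedec_crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0, as used for SPD blocks."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def looks_like_ddr4_spd(block: bytes) -> bool:
    """Type filter plus CRC check on the 128-byte base configuration block."""
    if len(block) < 128 or block[2] != DDR4_DEVICE_TYPE:
        return False
    stored = block[126] | (block[127] << 8)
    return jedec_crc16(block[:126]) == stored


def scan_for_spd(payload: bytes):
    """Return offsets in payload where a CRC-valid DDR4 base block starts."""
    return [off for off in range(len(payload) - 127)
            if looks_like_ddr4_spd(payload[off:off + 128])]


# Synthetic block: only the device type is set, then a matching CRC is appended.
base = bytearray(126)
base[2] = DDR4_DEVICE_TYPE
crc = jedec_crc16(bytes(base))
good_block = bytes(base) + bytes([crc & 0xFF, crc >> 8])
spd_offsets = scan_for_spd(b"\xff" * 64 + good_block + b"\xff" * 64)
```

The device-type filter matters: an all-zero region has a CRC of zero and would otherwise validate against its own zero CRC bytes.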
Two pieces of a coreboot port cannot be reliably extracted from an OEM BIOS image: the FSP binary (integrated and compressed) and a memory-down SPD (kept in proprietary PEI policy, not embedded as a flash image). For both, the toolkit reports honestly and points at the right alternate source. Everything else in the table above is recoverable directly from the archive.
macOS does not run on a ThinkPad out of the box. The boilerplate it needs to get close is, however, derivable.
The Hackintosh community has converged on OpenCore as the bootloader and on a standard set of patches to make a generic Intel laptop boot a recent macOS: ACPI SSDTs for EC, USB, RTC, brightness, sleep wake; kernel extensions (kexts) for audio, ethernet, wifi, trackpad, iGPU; an SMBIOS impersonation of a similar real Mac so the kernel takes the right code paths. Every Hackintosh port starts from the same boilerplate — and that boilerplate is what the toolkit auto-generates.
gen_ssdt.py emits the standard set of patch SSDTs from facts in the
decompiled DSDT:
- An EC SSDT, so that macOS finds a device named EC that accepts the standard EC ops. Sometimes the OEM DSDT uses a different name (H_EC, ECDV), so the SSDT renames.
- A CPU SSDT that enables X86PlatformPlugin by setting plugin-type on CPU0.
- An _OSI SSDT that answers _OSI("Windows") queries so that Windows-specific paths in the DSDT (typically the ones that touch features macOS doesn't support) get taken.

The output is iasl-clean ASL: the SSDTs compile without warnings under stock ACPICA, which matters because Hackintosh ports historically ship hand-edited SSDTs that accumulate iasl warnings over years, and untangling those is a real time sink.
kext_map.py walks the device inventory (PCI vendor/device IDs +
ACPI HIDs) and emits the kext set: Lilu and WhateverGreen as base, IntelMausi for
e1000-class NICs, AppleALC with a codec layout id derived from the audio device's
subsystem ID, VoodooPS2Controller for the trackpad, and so on. igpu_fb.py
picks a WhateverGreen ig-platform-id based on the iGPU's PCI device
id and CPU generation, and emits a connector layout (an internal eDP panel plus
two external DP / HDMI connectors, matching the typical ThinkPad chassis).
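The inventory-to-kext mapping is essentially a rule table. The sketch below is a simplification: it keys on PCI class codes and two generic PS/2 ACPI HIDs rather than the vendor/device id tables the text describes, and the function name and input shapes are assumptions.

```python
BASE_KEXTS = ["Lilu", "WhateverGreen"]

PCI_CLASS_ETHERNET = 0x0200   # network controller, Ethernet
PCI_CLASS_HD_AUDIO = 0x0403   # multimedia, audio device
INTEL_VENDOR_ID = 0x8086
PS2_HIDS = {"PNP0303", "PNP0F13"}  # generic PS/2 keyboard and mouse ids


def pick_kexts(pci_devices, acpi_hids):
    """pci_devices: iterable of (vendor_id, class_code); acpi_hids: iterable of HID strings."""
    kexts = list(BASE_KEXTS)
    for vendor, pci_class in pci_devices:
        if vendor == INTEL_VENDOR_ID and pci_class == PCI_CLASS_ETHERNET:
            kexts.append("IntelMausi")
        elif pci_class == PCI_CLASS_HD_AUDIO:
            kexts.append("AppleALC")
    if PS2_HIDS & set(acpi_hids):
        kexts.append("VoodooPS2Controller")
    # Drop duplicates while keeping load order (Lilu must come first).
    return list(dict.fromkeys(kexts))


kexts = pick_kexts(
    [(INTEL_VENDOR_ID, PCI_CLASS_ETHERNET), (INTEL_VENDOR_ID, PCI_CLASS_HD_AUDIO)],
    ["PNP0303"],
)
```

Ordering is preserved on purpose, since plugins that depend on Lilu have to be listed after it.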
The kernel's behavior changes based on the SMBIOS product name (which Mac it thinks
it is running on). For a ThinkPad of a given CPU generation and form factor, the
right impersonation is the matching mobile Mac of the same era — a quad-core
Coffee Lake ThinkPad maps to MacBookPro15,2, a dual-core Whiskey Lake
to MacBookPro15,4, etc. smbios_pick.py encodes the CPU
family + segment matrix and picks a sensible default; gen_opencore.py
folds the choice into the PlatformInfo section of the generated
config.plist.
The generator handles Skylake through Comet Lake well; Ice Lake and Tiger Lake partially (the iGPU layouts shift); Alder Lake and AMD are not Hackintosh candidates and the toolkit declines them. Beyond the skeleton, real Hackintosh bring-up still needs human work for audio layout, trackpad calibration, sleep stability, and persistent NVRAM. The skeleton's value is that it skips the mechanical day-one work; it does not skip the bring-up week.
A claim that scales to a fleet only if it is continuously measured.
The temptation when building an extraction pipeline is to test it on a few
representative images, declare it good, and move on. That temptation gets punished
the first time someone runs the pipeline on a new generation, a new vendor, or a
legacy non-UEFI BIOS, and it silently produces nothing useful.
batch_extract.py exists to make that failure mode loud and structured:
every BIOS in a sweep gets classified by outcome, every failure carries a reason
code, and new failure classes accumulate at the top of the report until they get
handled.
The classifications, in order of frequency:

- <id>w.exe is a driver bundle, not a BIOS update; classified by the payload-listing pre-filter and not counted as a failure.
- uefiextract aborts on a malformed file; usually a parser bug worth fixing upstream.

The current sweep produces 12 of 12 extractions on the diverse sample (multi-vendor, multi-generation) and 9 of 9 on the BIOS-payload-classification check. As the archive grows the sweep grows with it; new failure classes surface naturally, and the toolkit hardens against them one at a time.
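The classify-then-count shape described above can be sketched as follows. The outcome names, the fields of each run record, and the reason strings are all hypothetical; batch_extract.py's real reason codes are not reproduced here.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    NOT_A_BIOS = "not_a_bios"        # e.g. a driver bundle; not a failure
    EXTRACT_ERROR = "extract_error"  # the extractor aborted
    EMPTY_RESULT = "empty_result"    # the silent failure mode, made loud


def classify(run):
    """Return (Outcome, reason) for one sweep record."""
    if not run["has_bios_payload"]:
        return Outcome.NOT_A_BIOS, "payload listing shows no firmware image"
    if run["extract_rc"] != 0:
        return Outcome.EXTRACT_ERROR, f"extractor exited with {run['extract_rc']}"
    if run["module_count"] == 0:
        return Outcome.EMPTY_RESULT, "extraction succeeded but produced no modules"
    return Outcome.OK, ""


def summarize(runs):
    """Count outcomes and collect real failures with their reasons."""
    counts, failures = Counter(), []
    for run in runs:
        outcome, reason = classify(run)
        counts[outcome] += 1
        if outcome in (Outcome.EXTRACT_ERROR, Outcome.EMPTY_RESULT):
            failures.append((run["package"], outcome.value, reason))
    return counts, failures


# Placeholder sweep records, not real packages.
runs = [
    {"package": "pkg-a", "has_bios_payload": True, "extract_rc": 0, "module_count": 546},
    {"package": "pkg-b", "has_bios_payload": False, "extract_rc": 0, "module_count": 0},
    {"package": "pkg-c", "has_bios_payload": True, "extract_rc": 0, "module_count": 0},
]
counts, failures = summarize(runs)
```

The point of the EMPTY_RESULT class is that a zero exit code with zero modules is reported as a failure with a reason instead of passing quietly.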
collect_gpio.sh
The last mile, on hardware, with a POSIX-sh script and no special tools.
Some pad data is not in the firmware image. The resolved gpio →
consumer map only exists on a running machine, because board-variant
selection runs at boot from hardware straps; the runtime PADCFG with the actual
LOCK bits set is also only available live. collect_gpio.sh is a
short POSIX-sh script that runs as root on any live ThinkPad and produces a
self-describing tarball, with no dependencies beyond coreutils, dmidecode and an
optional python3 for the NVS region dump.
| File in the tarball | Source on the live system | What it gives the resolver |
|---|---|---|
| gpio.txt | /sys/kernel/debug/gpio | resolved gpio → consumer map — the primary ground truth |
| pinctrl/<ctrl>/* | /sys/kernel/debug/pinctrl | per-pad config + gpio-ranges (gpio# → pad mapping) |
| acpi/DSDT.aml, SSDT* | /sys/firmware/acpi/tables | cross-check against the archived BIOS tables |
| acpi_nvs_*.bin | /dev/mem (per /proc/iomem) | GNVS region with runtime values — the resolver's missing half |
| acpi_devices.txt | /sys/bus/acpi/devices | full ACPI device list (HID → path) |
| dmi_id.txt | /sys/class/dmi/id | model, MTM, BIOS version (identity only; redact serial before sharing) |
On a Whiskey Lake ThinkPad with the CNL-LP PCH (GPIO controller HID
INT34BB), a capture exercises every step of the Intel resolver: the
chipset id CC at the top byte of every GPIO_PAD matches
the runtime; the GNVS values written by AcpiPlatform match the values
visible in acpi_nvs_*.bin; the PADCFG mode/direction in the runtime
pinctrl matches the static decode. Comet Lake, Tiger Lake, Alder Lake and AMD
platforms work identically; only the controller HID changes. The script degrades
gracefully on older kernels and on machines without a pinctrl driver, recording
what it could not capture and continuing.
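On the consuming side, gpio.txt has to be parsed back into a gpio → consumer map. A minimal sketch, assuming the common debugfs line shape ` gpio-N (name|consumer) in hi`; the exact columns vary between kernel versions, the consumer strings in the sample are invented, and the function name is hypothetical.

```python
import re

CHIP_RE = re.compile(r"^(gpiochip\d+): GPIOs (\d+)-(\d+)")
LINE_RE = re.compile(r"^\s*gpio-(\d+)\s*\(([^|]*)\|([^)]*)\)\s+(in|out)\s+(hi|lo)")


def parse_debug_gpio(text):
    """Parse a /sys/kernel/debug/gpio dump into one dict per requested line."""
    rows, chip = [], None
    for line in text.splitlines():
        match = CHIP_RE.match(line)
        if match:
            chip = match.group(1)
            continue
        match = LINE_RE.match(line)
        if match:
            rows.append({
                "chip": chip,
                "gpio": int(match.group(1)),
                "name": match.group(2).strip(),
                "consumer": match.group(3).strip(),
                "direction": match.group(4),
                "level": match.group(5),
            })
    return rows


# Illustrative capture text; consumer names are made up.
sample = """\
gpiochip0: GPIOs 512-535, parent: platform/INT34BB:00, INT34BB:00:
 gpio-514 (                    |example-consumer    ) in  hi
 gpio-520 (                    |another-consumer    ) out lo
"""
rows = parse_debug_gpio(sample)
```

Lines that do not match are skipped rather than treated as errors, in the same spirit as the capture script recording what it could not collect and continuing.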
The shape of the capture matters more than any single machine's bytes: the same script run on a hundred ThinkPads would produce a hundred per-board posture reports, indexable by model and BIOS version, that together describe the security posture of an entire fleet. The toolkit is structured to consume the captures in aggregate, not just one at a time.
Where the firmware-only path stops and the physical board starts.
- The mapping from a GPIO_PAD immediate to the "GPP_A" / "GPP_B" / "GPP_C" community label changes per chipset id (Sunrise Point LP vs H, Cannon Lake LP vs H, etc). The toolkit cross-checks against the Intel FSP GpioLib GPIO_PAD definitions for the SoC rather than against community tables.
- OEM images integrate the FSP into PEI and often compress the standalone FspUpdRegion. The reliable source is Intel's reference FSP for the matching SoC; the toolkit reports honestly when a clean carve is not possible.
- lhw_fetch.py recovers machine identity from the static parts of the page but cannot extract the dmidecode or acpidump bodies, which the site renders client-side. For remote ground-truth GPIO data, use coreboot board-status inteltool dumps, or run collect_gpio.sh on the live hardware.
- Memory-down SPD was not found in the OEM images sampled: read it with decode-dimms on the live board or use Intel reference values.
The interesting unfinished work, in rough order of leverage.
- The EC's ACPI-facing interface is visible in the tables (_Qxx handlers, host commands), but the EC firmware itself is unreadable without a Renesas RL78 / RX disassembler, an obvious gap to close.
- Publishing collect_gpio.sh as a first-class artifact, with a small landing page that explains what it captures and what to do with the resulting tarball, would let third parties contribute ground truth back into the model coverage matrix. Aggregated correctly, that becomes a per-model security posture report that updates as Lenovo ships new BIOSes.
- Cross-referencing against the coreboot board-status repository (inteltool dumps, lspci, dmidecode). Joining the two datasets would give a richer cross-check on the resolver's accuracy on already-ported models and a head start on board-status for new ports.
The project is, in the end, a small bet that the value in the Lenovo firmware archive is mostly latent — that the bytes are out there, the tools to look at them are out there, but the glue that holds the pipeline together and turns one model's BIOS into useful answers about that model has not been written. Writing that glue, carefully and reproducibly, is what the codebase is for. The findings above are what came out of doing it.
Where the code lives, where to file bugs, and how to reach the maintainer.
| Source | codeberg.org/tetdrad0n/thinkpad-fw-analysis |
| Issues | codeberg.org/tetdrad0n/thinkpad-fw-analysis/issues |
| Pull requests | codeberg.org/tetdrad0n/thinkpad-fw-analysis/pulls |
| Maintainer profile | codeberg.org/tetdrad0n |
| Email | tetdrad0n@proton.me |
| Telegram | @tetdrad0n |
| Tox (uTox) | 2032774D78DD625E94814247FB454846B41F320A98A24125D84107D88A6A5C19E3565D6AC07D |
Contributions are welcome. The highest-leverage open areas are listed in
"What's next": new chipset-family entries (Alder/Raptor Lake on
Intel; Rembrandt/Phoenix on AMD), Renesas EC disassembly, a Qualcomm validation
against a real X13s capture, and ground-truth tarballs from
collect_gpio.sh runs on hardware not yet in the coverage matrix.
Bug reports are useful at any level of detail; if you have a failing
batch_extract.py run on a particular BIOS, attach the package URL or
the model + MT + BIOS revision, and the classifier's output. New failure classes
are how the pipeline hardens.
Pull requests should target main. Keep the no-folklore rule: PADCFG
and register decodes from the official Intel datasheet (332691 and the per-SoC
successors), UEFI / ACPI references from the published specifications, AMD work
from the PPR rather than community write-ups.