W^X in UEFI firmware and the linux boot chain.

What is W^X?

If this sounds familiar to you, it probably is. It means that memory should be either writable ("W", typically data), or executeable ("X", typically code), but not both. Elsewhere in the software industry this is standard security practice since ages. Now it starts to take off for UEFI firmware too.

This is a deep dive into recent changes, in both code (firmware) and administration (secure boot signing), the consequences this has for the linux, and the current state of affairs.

Changes in the UEFI spec and edk2

All UEFI memory allocations carry a memory type (EFI_MEMORY_TYPE). UEFI tracks since day one whenever a memory allocation is meant for code or data, among a bunch of other properties such as boot service vs. runtime service memory.

For a long time it didn't matter much in practice. The concept of virtual memory does not exist for UEFI. IA32 builds even run with paging disabled (and this is unlikely to change until the architecture disappears into irrelevance). Other architectures use identity mappings.

While UEFI does not use address translation, nowdays it can use page tables to enforce memory attributes, including (but not limited to) write and execute permissions. When configured to do so it will set code pages to R-X and data pages to RW- instead of using RWX everywhere, so code using memory types incorrectly will trigger page faults.

New in the UEFI spec (added in version 2.10) is the EFI_MEMORY_ATTRIBUTE_PROTOCOL. Sometimes properties of memory regions need to change, and this protocol can be used to do so. One example is a self-uncompressing binary, where the memory region the binary gets unpacked to initially must be writable. Later (parts of) the memory region must be flipped from writable to executeable.

As of today (Dec 2023) edk2 has a EFI_MEMORY_ATTRIBUTE_PROTOCOL implementation for the ARM and AARCH64 architectures, so this is present in the ArmVirt firmware builds but not in the OVMF builds.

Changed secure boot signing requirements

In an effort to improve firmware security in general and especially for secure boot Microsoft changed the requirements for binaries they are willing to sign with their UEFI CA key.

One key requirement added is that the binary layout must allow to enforce memory attributes with page tables, i.e. PE binary sections must be aligned to page size (4k). Sections also can't be both writable and executable. And the application must be able to deal with data section being mapped as not executable (NX_COMPAT).

These requirements apply to the binary itself (i.e. shim.efi for linux systems) and everything loaded by the binary (i.e. grub.efi, fwupd.efi and the linux kernel).

Where does linux stand?

We had and party still have a bunch of problems in all components involved in the linux boot process, i.e. shim.efi, grub.efi and the efi stub of the linux kernel.

Some are old bugs such as memory types not being used correctly, which start to cause problems due to the firmware becoming more strict. Some are new problems due to Microsoft raising the bar for PE binaries, typically sections not being page-aligned. The latter are easily fixed in most cases, often it is just a matter of adding alignment to the right places in the linker scripts.

Lets have closer look at the components one by one:

shim.efi

shim added code to use the new EFI_MEMORY_ATTRIBUTE_PROTOCOL before it was actually implemented by any firmware. Then this was released completely untested. Did not work out very well, we got a nice time bomb, and edk2 implementing EFI_MEMORY_ATTRIBUTE_PROTOCOL for arm triggered it ...

Fixed in main branch, no release yet.

Getting new shim.efi binaries signed by Microsoft depends on the complete boot chain being compilant with the new requirements, which prevents shim bugfixes being shipped to users right now.

That should be solved soon though, see the kernel section below.

grub.efi

grub.efi used to use memory types incorrectly.

Fixed upstream years ago, case closed.

Well, in theory. Upstream grub development goes at glacial speeds, so all distros carry a big stack of downstream patches. Not surprisingly that leads to upstream fixes being absorbed slowly and also to bugs getting reintroduced.

So, in practice we still have buggy grub versions in the wild. It is getting better though.

The linux kernel

The linux kernel efi stub had it's fair share of bugs too. On non-x86 architectures (arm, riscv, ...) all issues have been fixed a few releases ago. They all share much of the efi stub code base and also use the same self-decompressing method (CONFIG_EFI_ZBOOT=y).

On x86 this all took a bit longer to sort out. For historical reasons x86 can't use the zboot approach used by the other architectures. At least as long as we need hybrid BIOS/UEFI kernels, which most likely will be a number of years still.

The final x86 patch series has been merged during the 6.7 merge window. So we should have a fixed stable kernel in early January 2024, and distros picking up the new kernel in the following weeks or months. Which in turn should finally unblock shim updates.

There should be enough time to get everything sorted for the spring distro releases (Fedora 40, Ubuntu 24.04).

edk2 config options

edk2 has a bunch of config options to fine tune the firmware behavior, both compile time and runtime. The relevant ones for the problems listed above are:

PcdDxeNxMemoryProtectionPolicy

Compile time option. Use the --pcd switch for the edk2 build script to set these. It's bitmask, with one bit for each memory type, specifying whenever the firmware shoud apply memory protections for that particular memory type, by setting the flags in the page tables accordingly.

Strict configuration is PcdDxeNxMemoryProtectionPolicy = 0xC000000000007FD5. This is also the default for ArmVirt builds.

Bug compatible configuration is PcdDxeNxMemoryProtectionPolicy = 0xC000000000007FD1. This excludes the EfiLoaderData memory type from memory protections, so using EfiLoaderData allocations for code will not trigger page faults. Which is an very common pattern seen in boot loader bugs.

PcdUninstallMemAttrProtocol

Compile time options, for ArmVirt only. Brand new, committed to the edk2 repo this week (Dec 12th 2023). When set to TRUE the EFI_MEMORY_ATTRIBUTE_PROTOCOL will be unistalled. Default is FALSE.

Setting this to TRUE will work around the shim bug.

opt/org.tianocore/UninstallMemAttrProtocol

Runtime option, for ArmVirt only. Also new. Can be set using -fw_cfg on the qemu command line: -fw_cfg name=opt/org.tianocore/UninstallMemAttrProtocol,string=y|n. This is a runtime override for PcdUninstallMemAttrProtocol. Works for both enabling and disabling the shim bug workaround.

In the future PcdDxeNxMemoryProtectionPolicy will probably disappear in favor of memory profiles, which will allow to configure the same settings (plus a few more) at runtime.

Hands on, part #1 — using fedora edk2 builds

The default builds in the edk2-ovmf and edk2-aarch64 packages are configured to be bug compatible, so VMs should boot fine even in case the guests are using a buggy boot chain.

While this is great for end users it doesn't help much for bootloader development and testing, so there are alternatives. The edk2-experimental package comes with a collection of builds better suited for that use case, configured with strict memory protections and (on aarch64) EFI_MEMORY_ATTRIBUTE_PROTOCOL enabled, so you can see buggy builds actually crash and burn. 🔥

AARCH64 architecture

For AARCH64 this is /usr/share/edk2/experimental/QEMU_EFI-strictnx-pflash.raw. The magic words for libvirt are:

<domain type='kvm'>
[ ... ]
  <os>
    <type arch='aarch64' machine='virt'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/edk2/experimental/QEMU_EFI-strictnx-pflash.raw</loader>
    <nvram template='/usr/share/edk2/aarch64/vars-template-pflash.raw'/>
  </os>
[ ... ]

If a page fault happens you will get this line ...

  Synchronous Exception at 0x00000001367E6578

... on the serial console, followed by a stack trace and register dump.

X64 architecture

For X64 this is /usr/share/edk2/experimental/OVMF_CODE_4M.secboot.strictnx.qcow2. Needs edk2-20231122-12.fc39 or newer. The magic words for libvirt are:

<domain type='kvm'>
[ ... ]
  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash' format='qcow2'>/usr/share/edk2/experimental/OVMF_CODE_4M.secboot.strictnx.qcow2</loader>
    <nvram template='/usr/share/edk2/ovmf/OVMF_VARS_4M.secboot.qcow2' format='qcow2'/>
  </os>
[ ... ]

It is also a good idea to add a debug console to capture the firmware log:

    <serial type='null'>
      <log file='/path/to/firmware.log' append='off'/>
      <target type='isa-debug' port='1'>
        <model name='isa-debugcon'/>
      </target>
      <address type='isa' iobase='0x402'/>
    </serial>

If you are lucky the page fault is logged there, also with an register dump. If you are not so lucky the VM will just reset and reboot.

Hands on, part #2 — using virt-firmware

The virt-firmware project is a collection of python modules and scripts for working with efi variables, efi varstores and also pe binaries. In case your distro hasn't packages you can install it using pip like most python packages.

virt-fw-vars

The virt-fw-vars utility can work with efi varstores. For example it is used to create the OVMF_VARS*secboot* files, enrolling the secure boot certificates into the efi security databases.

The simplest operation is to print the variable store:

virt-fw-vars --input /usr/share/edk2/ovmf/OVMF_VARS_4M.secboot.qcow2 \
             --print --verbose | less

When updating edk2 varstores virt-fw-vars always needs both input and output files. If you want change an existing variable store both input and output can point to the same file. For example you can turn on shim logging for an existing libvirt guest this way:

virt-fw-vars --input /var/lib/libvirt/qemu/nvram/${guest}_VARS.qcow2 \
             --output /var/lib/libvirt/qemu/nvram/${guest}_VARS.qcow2 \
             --set-shim-verbose

The next virt-firmware version will get a new --inplace switch to avoid listing the file twice on the command line for this use case.

If you want start from scratch you can use an empty variable store from /usr/share/edk2 as input. For example when creating a new variable store template with the test CA certificate (shipped with pesign.rpm) enrolled additionally:

dnf install -y pesign
certutil -L -d /etc/pki/pesign-rh-test -n "Red Hat Test CA" -a \
             | openssl x509 -text > rh-test-ca.txt
virt-fw-vars --input /usr/share/edk2/ovmf/OVMF_VARS_4M.qcow2 \
             --output OVMF_VARS_4M.secboot.rhtest.qcow2 \
             --enroll-redhat --secure-boot \
             --add-db OvmfEnrollDefaultKeys rh-test-ca.txt

The test CA will be used by all Fedora, CentOS Stream and RHEL build infrastructure to sign unofficial builds, for example when doing scratch builds in koji or when building rpms locally on your developer workstation. If you want test such builds in a VM, with secure boot enabled, this is a convenient way to do it.

pe-inspect

Useful for having a look at EFI binaries is pe-inspect. If this isn't present try pe-listsigs. Initially the utility only listed the signatures, but was extended over time to show more information, so I added the pe-inspect alias later on.

Below is the output for an 6.6 x86 kernel, you can see it does not have the patches to page-align the sections:

# file: /boot/vmlinuz-6.6.4-200.fc39.x86_64
#    section: file 0x00000200 +0x00003dc0  virt 0x00000200 +0x00003dc0  r-x (.setup)
#    section: file 0x00003fc0 +0x00000020  virt 0x00003fc0 +0x00000020  r-- (.reloc)
#    section: file 0x00003fe0 +0x00000020  virt 0x00003fe0 +0x00000020  r-- (.compat)
#    section: file 0x00004000 +0x00df6cc0  virt 0x00004000 +0x05047000  r-x (.text)
#    sigdata: addr 0x00dfacc0 +0x00000d48
#       signature: len 0x5da, type 0x2
#          certificate
#             subject CN: Fedora Secure Boot Signer
#             issuer  CN: Fedora Secure Boot CA
#       signature: len 0x762, type 0x2
#          certificate
#             subject CN: kernel-signer
#             issuer  CN: fedoraca

pe-inspect also knows the names for a number of special sections and supports decoding and pretty-printing them, for example here:

# file: /usr/lib/systemd/boot/efi/systemd-bootx64.efi
#    section: file 0x00000400 +0x00011a00  virt 0x00001000 +0x0001191f  r-x (.text)
#    section: file 0x00011e00 +0x00003a00  virt 0x00013000 +0x00003906  r-- (.rodata)
#    section: file 0x00015800 +0x00000400  virt 0x00017000 +0x00000329  rw- (.data)
#    section: file 0x00015c00 +0x00000200  virt 0x00018000 +0x00000030  r-- (.sdmagic)
#       #### LoaderInfo: systemd-boot 254.7-1.fc39 ####
#    section: file 0x00015e00 +0x00000200  virt 0x00019000 +0x00000049  r-- (.osrel)
#    section: file 0x00016000 +0x00000200  virt 0x0001a000 +0x000000de  r-- (.sbat)
#       sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
#       systemd,1,The systemd Developers,systemd,254,https://systemd.io/
#       systemd.fedora,1,Fedora Linux,systemd,254.7-1.fc39,https://bugzilla.redhat.com/
#    section: file 0x00016200 +0x00000200  virt 0x0001b000 +0x00000084  r-- (.reloc)

virt-fw-sigdb

The last utility I want introduce is virt-fw-sigdb, which can create, parse and modify signature databases. The signature database format is used by the firmware to store certificates and hashes in EFI variables. But sometimes the format used for files too. virt-firmware has the functionality anyway, so I've added a small frontend utility to work with those files.

One file in signature database format is /etc/pki/ca-trust/extracted/edk2/cacerts.bin which contains the list of of trusted CAs in sigature database format. Can be used to pass the CA list to the VM firmware for TLS connections (https network boot).

Shim also uses that format when compiling multiple certificates into the built-in VENDOR_DB or VENDOR_DBX databases.

Final remarks

Thats it for today folks. Hope you find this useful.