Trials and tribulations with EFI
May. 25th, 2011 11:45 am
mjg59
I wrote about some EFI implementation issues I'd seen on Macs a while back. Shortly afterwards we started seeing approximately identical bugs on some Intel reference platforms, and fixing them actually became more of a priority.
The fundamental problem is the same. We take the EFI memory map, identify the virtual addresses of the regions that will be required for runtime (mapping them into virtual address space if needed) and then call the firmware's SetVirtualAddressMap() implementation in order to let the firmware convert all its pointers. Sadly it seems that some firmware implementations call into sections of boot services code to do this, which is unfortunate because we've already taken that back to use as RAM. So, given that this is clearly against the spec, how does it ever work?
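As a rough illustration of the first half of that process, here is a minimal sketch of walking a memory map and recording a virtual address for every region the firmware has flagged for runtime use. The struct is a simplified stand-in for the spec's memory descriptor, `assign_runtime_virt()` is a hypothetical helper, and the fixed-offset mapping is an assumption made for brevity rather than a description of what the kernel actually does.

```c
#include <stddef.h>
#include <stdint.h>

/* Attribute bit the UEFI spec uses to mark regions needed at runtime. */
#define EFI_MEMORY_RUNTIME 0x8000000000000000ULL

/* Simplified stand-in for an EFI memory descriptor. */
struct mem_desc {
    uint32_t type;
    uint64_t phys_start;
    uint64_t virt_start;
    uint64_t num_pages;
    uint64_t attribute;
};

/*
 * Hypothetical helper: give every runtime region a virtual address at
 * phys_start + virt_offset and leave all other descriptors untouched.
 * Returns the number of runtime regions found. The updated map is what
 * would then be passed to the firmware's SetVirtualAddressMap().
 */
static size_t assign_runtime_virt(struct mem_desc *map, size_t n,
                                  uint64_t virt_offset)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (!(map[i].attribute & EFI_MEMORY_RUNTIME))
            continue;   /* not needed once boot services have exited */
        map[i].virt_start = map[i].phys_start + virt_offset;
        count++;
    }
    return count;
}
```

The bug described above sits just past the end of this sketch: the firmware's own pointer conversion may touch boot services regions that this loop deliberately skips.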
The tediously dull version is that Linux typically calls SetVirtualAddressMap() in the kernel, and everyone else does it in their bootloaders. The bootloader hasn't set up NX bits or anything, so it just happens to work there. We could just do it in the bootloader in Linux, but that makes doing things like kernel address space randomisation trickier, so it's not the favoured approach. So, instead, we can probably just reserve those ranges until after we've switched to virtual mode, and make sure the pages are executable. This ought to land in 2.6.40, or whatever it ends up being called.
(The alternative approach, of just never transitioning to virtual mode, turns out to mysteriously fail on various machines. Calls to SetVariable() just give errors. We just don't know.)
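The "reserve those ranges until we've switched" idea boils down to a small policy decision per region. The following is only a sketch of that decision: the memory type values are the ones the UEFI spec defines, but `region_usable_as_ram()` and its `virtual_mode_entered` flag are hypothetical names, and making the pages executable is not modelled at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory type values as defined by the UEFI spec. */
enum {
    EFI_BOOT_SERVICES_CODE  = 3,
    EFI_BOOT_SERVICES_DATA  = 4,
    EFI_CONVENTIONAL_MEMORY = 7,
};

/*
 * Hypothetical policy check: may this region be handed to the page
 * allocator yet? Boot services regions are held back until the
 * firmware has been switched to virtual mode, because some firmware
 * still calls into them while converting its pointers.
 */
static bool region_usable_as_ram(uint32_t type, bool virtual_mode_entered)
{
    switch (type) {
    case EFI_CONVENTIONAL_MEMORY:
        return true;
    case EFI_BOOT_SERVICES_CODE:
    case EFI_BOOT_SERVICES_DATA:
        return virtual_mode_entered;
    default:
        /* Conservative for this sketch: everything else stays reserved. */
        return false;
    }
}
```

The cost is that a chunk of memory stays unavailable slightly longer during boot, which seems a fair trade for not jumping into pages that have already been reused.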
That still leaves the problem of SetVariable() on the test Mac trying to access a random address. That one turned out to be easier. There's 2MB of flash at the top of physical address space, and this was being presented as being broken into four separate EFI regions. While physically contiguous, Linux was mapping these to discontiguous virtual addresses. Apple's firmware appeared to assume that a pointer into one region could just be incremented into another. So because it's still easier to change the kernel than change Apple, 2.6.39 merges these regions to ensure they're contiguous.
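The merging step can be sketched as a single pass over a sorted region list. This is an illustration rather than the kernel's code: `struct efi_region` and `merge_contiguous()` are hypothetical, the list is assumed to be sorted by physical address, and requiring matching type and attributes before merging is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define REGION_PAGE_SHIFT 12   /* EFI pages are 4KiB */

/* Hypothetical, simplified view of one EFI memory region. */
struct efi_region {
    uint32_t type;
    uint64_t phys_start;
    uint64_t num_pages;
    uint64_t attribute;
};

/* First physical address past the end of a region. */
static uint64_t region_end(const struct efi_region *r)
{
    return r->phys_start + (r->num_pages << REGION_PAGE_SHIFT);
}

/*
 * Coalesce physically adjacent regions in place so that each run gets a
 * single mapping, and therefore contiguous virtual addresses. Assumes
 * the array is sorted by phys_start. Returns the new region count.
 */
static size_t merge_contiguous(struct efi_region *r, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        if (r[i].phys_start == region_end(&r[out]) &&
            r[i].type == r[out].type &&
            r[i].attribute == r[out].attribute) {
            r[out].num_pages += r[i].num_pages;   /* extend current run */
        } else {
            r[++out] = r[i];                      /* start a new run */
        }
    }
    return out + 1;
}
```

Once the flash regions have collapsed into one entry, a firmware pointer that walks off the end of what used to be the first region still lands on mapped flash.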
Remaining problems include some machines seemingly not booting if they have 4GB of RAM or more, and this Apple failing to communicate with its panel over the eDP aux channel. Anyone got any idea how to dump the BIOS compatibility module out of a running EFI session?
contiguous non-contiguous regions
Date: 2011-05-26 01:13 am (UTC)
I wonder if the presentation that the flash is 4 separate regions is false & actually the bug - or maybe it's a limitation because it's on 4 separate 512KB chips?
Don't know enough about EFI, just trying to attack the problem another way.
Re: contiguous non-contiguous regions
Date: 2011-05-26 01:21 am (UTC)
ignore BIOS^wEFI
Date: 2011-05-26 03:09 am (UTC)
Re: ignore BIOS^wEFI
Date: 2011-05-26 12:32 pm (UTC)
Re: ignore BIOS^wEFI
Date: 2011-05-26 01:41 pm (UTC)
Are you implying that x86 hardware vendors are heading towards the same segmentation as in the ARM world, using EFI as a "we're still compatible" excuse? Sadly, it wouldn't even surprise me. But it would motivate me to invest heavily in companies supporting coreboot, such as AMD.
Re: ignore BIOS^wEFI
Date: 2011-05-26 01:56 pm (UTC)
The only thing EFI adds is some additional runtime services. The only one of those we actually need is the ability to set nvram variables. If we didn't have EFI we'd need to know the location of the flash and we'd need to know the internal storage format, and that's not something that's standardised.
Advocato, Livejournal, dreamwidth accounts
Date: 2011-05-26 04:25 pm (UTC)
Re: Advocato, Livejournal, dreamwidth accounts
Date: 2011-05-26 04:36 pm (UTC)
The descriptions are quite amusing
Date: 2011-06-02 02:11 am (UTC)