The objective of this project is to fix and complete SMP (symmetric multiprocessing) support in GNU/Hurd. This support must be implemented in GNU/Hurd's microkernel, GNU Mach.
GNU/Hurd already includes minimal SMP support, as this FAQ explains.
The GNU Mach source code includes many special cases for multiprocessor systems, guarded by the #if NCPUS > 1 preprocessor conditional.
But this support is very limited:
- GNU Mach doesn't detect CPUs at runtime: the number of CPUs must be hardcoded at compile time.
  The number of cpus is set in the mach_ncpus configuration variable, which defaults to 1, in the configfrag.ac file. This variable generates the NCPUS macro, which gnumach uses to control the special cases for multiprocessor systems.
  If NCPUS > 1, gnumach enables multiprocessor support, with the number of cpus set by the user in the mach_ncpus variable. Otherwise, this support is disabled.
- The multiprocessor special cases in the gnumach source code have never been tested, so they may contain many errors. Furthermore, these special cases are incomplete: many functions, such as cpu_number() or intel_startCPU(), aren't written.
- GNU Mach doesn't initialize the processor with the proper options for multiprocessing. For this reason, the current support is only multithreaded, not real multiprocessor support.
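As a rough illustration of how this compile-time switch behaves, here is a minimal sketch. Only the NCPUS macro comes from the text above; the helpers smp_cpu_slots() and smp_build_mode() are hypothetical names used for illustration and are not gnumach functions.

```c
#include <assert.h>

/* In gnumach, NCPUS is generated from mach_ncpus by the build system.
 * The fallback below exists only so this sketch compiles on its own. */
#ifndef NCPUS
#define NCPUS 1
#endif

/* Hypothetical helper: number of cpu slots the kernel was built for. */
static int smp_cpu_slots(void)
{
    assert(NCPUS >= 1);   /* a kernel built for zero cpus makes no sense */
    return NCPUS;
}

/* Hypothetical helper: which code path the preprocessor selected. */
static const char *smp_build_mode(void)
{
#if NCPUS > 1
    return "multiprocessor";   /* the special cases are compiled in */
#else
    return "uniprocessor";     /* the special cases are compiled out */
#endif
}
```

Because the choice is made by the preprocessor, the multiprocessor branches do not exist at all in a default build, which is one reason they can go untested for a long time.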
In addition, there is another problem: many drivers included in Hurd aren't thread-safe, and they could crash in an SMP environment. So it's necessary to isolate these drivers to avoid concurrency problems.
To solve this, we need to implement some routines to detect the number of processors, assign an identifier to each processor, and configure the lapic and IPI support. These routines must be executed during Mach boot.
"Really, all the support you want to get from the hardware is just getting the number of processors, initializing them, and support for interprocessor interrupts (IPI) for signaling." - Samuel Thibault link
"The process scheduler probably already has the support. What is missing is the hardware driver for SMP: enumeration and initialization." - Samuel Thibault link
The functions currently required are cpu_number() (in kern/cpu_number.h) and intel_startCPU().
Another function that is not implemented yet, but is not critical, is cpu_control()
Reference
Other interesting files are pmap.c and sched_prim.c
In addition, we have to build an isolated environment to run the non-thread-safe drivers.
"Yes, this is a real concern. For the Linux drivers, the long-term goal is to move them to userland anyway. For Mach drivers, quite often they are not performance-sensitive, so big locks would be enough." - Samuel Thibault link
You can read the full project draft in Hurd SMP Project draft
To test the software you will need:
- A Debian GNU/Hurd installation: the Debian GNU/Hurd installer is very similar to the standard Debian installer.
  - You can follow this guide to learn how to install Debian GNU/Hurd: Install Debian GNU/Hurd in real hardware
  - If you prefer to use a virtual machine such as Qemu, you can use this script: qemu_hurd script.
  - You can also install it in VirtualBox!! ;)
- Compile the sources: from Debian GNU/Hurd, follow these steps:
  - Clone the repository:
        git clone https://github.com/AlmuHS/GNUMach_SMP
  - Install the dependencies:
        apt-get install build-essential fakeroot
        apt-get build-dep gnumach
        apt-get install mig
  - Run the preliminary configuration steps:
        cd GNUMach_SMP
        autoreconf --install
        #create build directory
        mkdir build
        cd build
        ../configure --prefix=
  - Compile!!
        make gnumach.gz
  - Copy the new image to the /boot directory (as root):
        cp gnumach.gz /boot/gnumach-smp.gz
  - Update grub (as root):
        update-grub
  - Reboot:
        reboot
    After rebooting, you must select gnumach-smp.gz in the GRUB menu.
- More info in: https://www.gnu.org/software/hurd/microkernel/mach/gnumach/building.html
- Recovered and updated old APIC headers from Mach 4
- Modified configfrag.ac
  - Now, if mach_ncpus > 1, NCPUS will be set to 255
- Integrated cpu detection and enumeration from acpi tables
- Solved memory mapping for *lapic. Now it's possible to read the Local APIC of the current processor.
- Implemented cpu_number() function
- Solved ioapic enumeration: changed linked list to array
- Initialized master_cpu variable to 0
- Initialized ktss for master_cpu
- Enabled cpus using StartUp IPI, and switched them to protected mode
- Loaded temporary GDT and IDT
- Implemented assembly CPU_NUMBER()
- Refactored cpu_number() with a more efficient implementation
- Added interrupt stack to cpus
- Improved the memory reservation for the cpu stacks, using Mach style (similar to the interrupt stack)
- Enabled paging in AP processors
- Loaded final GDT and IDT
- Added cpus to scheduler
- In the Min_SMP test environment, the cpus are detected and started correctly
- I need to implement APIC configuration
- In gnumach, the number of cpus and their lapic structures are detected and enumerated correctly
- ioapic enumeration seems to work correctly
- Mach uses the 8259 PIC controller, so the ioapic is not necessary. Migrating Mach to the ioapic is a future TODO
- gnumach enables all cpus during boot successfully
- The cpus are added successfully to the kernel
- gnumach boots with 2 cpus
- It fails with more than 2 cpus, and with only one cpu. TODO: fix it
- Some Hurd servers fail
- The DHCP client crashes during boot
- The login screen doesn't receive keyboard input
- The cpu detection and enumeration are implemented in acpi_rsdp.c and acpi_rsdp.h.
  - The main function, acpi_setup(), is called from model_dep.c
  - This function generates some structures:
    - The apic_id is stored in machine_slot
- The APIC structures, recovered from old gnumach code, are stored in apic.h
- The cpu_number() C implementation was added to kern/cpu_number.c.
- The CPU_NUMBER() assembly implementation was added to i386/i386/cpu_number.h
- The function start_other_cpus() was modified to replace the NCPUS macro with the ncpu variable
- The memory mapping is implemented in vm_map_physical.c and vm_map_physical.h
  - The lapic mapping is in extra_setup()
  - This call requires paging to be configured, so the call is added in kern/startup.c, after paging configuration
- The cpu enabling is implemented in mp_desc.c
  - The routine to switch the cpus to protected mode is cpuboot.S
- cpu_number() has been refactored, replacing the while loop with the array apic2kernel[], indexed by apic_id
- The CPU_NUMBER() assembly function has been implemented using the apic2kernel[] array
- Added a call to interrupt_stack_alloc() before mp_desc_init()
- Added paging configuration in cpuboot.S
- Added calls to gdt_init() and idt_init() before the call to slave_main(), to load the final GDT and IDT.
- Enabled the call to slave_main(), to add the AP processors to the kernel
- Moved paging configuration to the paging_setup() function
- Solved a small problem with the AP stack: now each AP has its own stack
We have recovered the apic.h header, originally from Mach 4, with Local APIC and IOAPIC structs, and an old implementation of cpu_number().
- The cpu_number() C implementation was moved to kern/cpu_number.c, and the assembly CPU_NUMBER() implementation was moved to i386/i386/cpu_number.h
- struct ApicLocalUnit was updated to the latest Local APIC fields, and stored in imps/apic.h
In this step, we find the Local APIC and IOAPIC registers in the ACPI tables, and enumerate them.
The implementation of this step is based on the Min_SMP acpi.c implementation. The main function is acpi_setup(), which calls other functions to walk the ACPI tables.
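Walking the ACPI tables normally starts by locating the Root System Description Pointer (RSDP), which begins with the signature "RSD PTR ", sits on a 16-byte boundary, and whose first 20 bytes sum to zero modulo 256. The sketch below shows that search over a plain byte buffer. The names acpi_checksum_ok() and rsdp_find() are hypothetical, and whether acpi_setup() performs the search exactly this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_SIGNATURE   "RSD PTR "  /* 8 bytes, note the trailing space */
#define RSDP_V1_LENGTH   20          /* bytes covered by the ACPI 1.0 checksum */
#define RSDP_ALIGNMENT   16          /* the RSDP lies on a 16-byte boundary */

/* Returns 1 when all bytes of the region sum to 0 modulo 256. */
static int acpi_checksum_ok(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum == 0;
}

/* Hypothetical helper: scans [start, start + len) for a valid RSDP and
 * returns its offset from start, or -1 if none is found. */
static long rsdp_find(const uint8_t *start, size_t len)
{
    assert(start != NULL);
    for (size_t off = 0; off + RSDP_V1_LENGTH <= len; off += RSDP_ALIGNMENT) {
        if (memcmp(start + off, RSDP_SIGNATURE, 8) != 0)
            continue;                       /* signature mismatch */
        if (acpi_checksum_ok(start + off, RSDP_V1_LENGTH))
            return (long)off;               /* signature and checksum match */
    }
    return -1;
}
```

In a kernel, the buffer would be the BIOS memory area translated to a logical address; here it is an ordinary array so the logic can be exercised in user space.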
To adapt the code to gnumach, some changes were necessary:
- Copy and rename files
  The acpi.c and acpi.h files were renamed to acpi_rsdp.c and acpi_rsdp.h
  These files were copied into the i386/i386at/.. directory
- Change headers and move variables
  The #include headers must be changed to the gnumach equivalents. Some variables declared in cpu.c were moved to acpi_rsdp.c or other files:
  - The number of cpus, ncpu, was moved to acpi_rsdp.c
  - The lapic ID, stored in the cpus[] array, was added to the machine_slot[NCPUS] array, and the cpus[] array was removed.
  - The lapic pointer extern declaration was added to kern/machine.h
  - The struct list ioapics was changed to the ioapics[16] array, in acpi_rsdp.c
  - struct ioapic was moved to imps/apic.h
- Replace physical address with logical address
  The most important modification is to replace the physical addresses with the equivalent logical addresses. To ease this task, this function is called before paging is configured.
  Memory addresses below 0xc0000000 are mapped directly by the kernel, and their logical addresses can be obtained using the macro phystokv(address). This is how we get the logical addresses of the ACPI table pointers.
  But the lapic pointer is located at a high memory position, above 0xf0000000, so it must be mapped manually. To map this address, we need to use paging, which is not configured yet. To solve this, we split the process into two steps:
  - In the APIC enumeration step, we store the lapic address in a temporary variable: lapic_addr
  - After paging is configured, we call the function extra_setup(), which reserves the memory address for the lapic pointer and initializes the real pointer, *lapic.
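The two-step handling of the lapic pointer can be sketched as follows. The variables lapic_addr and lapic come from the description above; save_lapic_address(), map_lapic(), the map_phys_fn callback type and the fake_map() stand-in are hypothetical names introduced only for this sketch, and the real extra_setup() may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping callback: maps a physical range and returns a
 * usable pointer, or NULL on failure. */
typedef volatile uint32_t *(*map_phys_fn)(uint32_t phys, size_t size);

static uint32_t lapic_addr;        /* physical address saved in step 1 */
static volatile uint32_t *lapic;   /* usable pointer, set in step 2 */

/* Step 1: runs during APIC enumeration, before paging is configured. */
static void save_lapic_address(uint32_t phys)
{
    lapic_addr = phys;
}

/* Step 2: runs once paging works. Returns 0 on success, -1 otherwise. */
static int map_lapic(map_phys_fn map_phys)
{
    if (lapic_addr == 0 || map_phys == NULL)
        return -1;                       /* step 1 has not run yet */
    lapic = map_phys(lapic_addr, 4096);  /* the Local APIC registers fit in one page */
    return lapic != NULL ? 0 : -1;
}

/* Stand-in mapper, used only to exercise the sketch outside a kernel. */
static uint32_t fake_lapic_page[1024];
static volatile uint32_t *fake_map(uint32_t phys, size_t size)
{
    (void)phys;
    assert(size <= sizeof fake_lapic_page);
    return fake_lapic_page;
}
```

Keeping the physical address in a plain integer until the mapping exists avoids dereferencing a pointer that is not valid yet.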
Once we have the lapic pointer, we can use it to access the Local APIC of the current processor. Using this, we have implemented the cpu_number() function, which searches the machine_slot[] array for the apic_id of the current processor, and returns the index as the kernel ID.
A newer implementation gets the kernel ID from the apic2kernel[] array, using the apic_id as index.
This function will be used later to get the cpu that is currently running.
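The difference between the two cpu_number() variants can be sketched as below. The arrays machine_slot[] and apic2kernel[] are named in the text; the struct layout, register_cpu(), read_current_apic_id() and the two function names are assumptions made for this sketch, and the APIC ID is read from a plain variable instead of the *lapic registers.

```c
#include <assert.h>

#define NCPUS        8     /* illustrative value for this sketch */
#define MAX_APIC_ID  256   /* xAPIC IDs fit in 8 bits */

/* Simplified stand-in for the kernel's machine_slot entries. */
struct machine_slot_sketch { int is_cpu; unsigned apic_id; };

static struct machine_slot_sketch machine_slot[NCPUS];
static int apic2kernel[MAX_APIC_ID];   /* apic_id -> kernel ID */
static int ncpu;                       /* number of cpus registered so far */

/* Stand-in for reading the APIC ID register through *lapic. */
static unsigned current_apic_id;
static unsigned read_current_apic_id(void) { return current_apic_id; }

/* Registers a detected cpu in both tables; its kernel ID is its index. */
static void register_cpu(unsigned apic_id)
{
    assert(ncpu < NCPUS && apic_id < MAX_APIC_ID);
    machine_slot[ncpu].is_cpu = 1;
    machine_slot[ncpu].apic_id = apic_id;
    apic2kernel[apic_id] = ncpu;
    ncpu++;
}

/* First variant: linear search over machine_slot[], O(ncpu) per call. */
static int cpu_number_search(void)
{
    unsigned id = read_current_apic_id();
    int i = 0;
    while (i < ncpu && machine_slot[i].apic_id != id)
        i++;
    return i < ncpu ? i : -1;
}

/* Newer variant: direct table lookup, O(1) per call.
 * Unregistered APIC IDs read back as 0 in this sketch. */
static int cpu_number_lookup(void)
{
    return apic2kernel[read_current_apic_id()];
}
```

Since cpu_number() is called very often, replacing the loop with a single indexed load is a reasonable trade of a small table for speed.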
In this step, we enable the cpus using the StartUp IPI. To do this, we need to write the ICR register in the Local APIC of the processor that raises the IPI (in this case, the BSP raises the IPI to each processor).
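To make the ICR write more concrete, the sketch below builds the values for an INIT IPI and a StartUp IPI and stores them in an array standing in for the mapped Local APIC. The register offsets and bit positions follow the Intel manual; the function names (including send_ipi_sketch()) and the plain array are assumptions for illustration and do not reproduce the project's send_ipi().

```c
#include <assert.h>
#include <stdint.h>

#define ICR_LOW_OFFSET        0x300u      /* ICR bits 0-31 */
#define ICR_HIGH_OFFSET       0x310u      /* ICR bits 32-63 */

#define ICR_DELIVERY_INIT     (5u << 8)   /* delivery mode 101: INIT */
#define ICR_DELIVERY_STARTUP  (6u << 8)   /* delivery mode 110: Start-Up */
#define ICR_LEVEL_ASSERT      (1u << 14)  /* level: assert */

/* The destination APIC ID occupies bits 24-31 of the high dword. */
static uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

static uint32_t icr_low_init(void)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT;
}

/* The StartUp IPI vector is the 4 KiB page number of the real-mode
 * entry point, so the entry must be page aligned and below 1 MiB. */
static uint32_t icr_low_startup(uint32_t entry_phys)
{
    assert((entry_phys & 0xFFFu) == 0 && entry_phys < 0x100000u);
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (entry_phys >> 12);
}

/* Writes the destination first; storing the low dword sends the IPI. */
static void send_ipi_sketch(volatile uint32_t *lapic_regs,
                            uint8_t apic_id, uint32_t icr_low)
{
    lapic_regs[ICR_HIGH_OFFSET / 4] = icr_high_for(apic_id);
    lapic_regs[ICR_LOW_OFFSET / 4] = icr_low;
}
```

On real hardware the BSP usually sends an INIT IPI, waits, and then sends the StartUp IPI (often twice), checking the delivery status bit between writes; those delays are omitted here.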
To implement this step, we took inspiration from the Min_SMP mp.c and cpu.c files, and built on the existing work in i386/i386/mp_desc.c
We have split this task into several steps:
- Modify start_other_cpus()
  The start_other_cpus() function calls cpu_start(cpu) for each cpu, to enable it. We have modified this function to replace the NCPUS macro with the ncpu variable, reserve memory for the cpu stacks, and initialize machine_slot[] to indicate that the cpu is disabled.
  Furthermore, we have added some printf calls to show the number of cpus and the kernel ID of the current cpu.
- Reserve memory for cpu stack
  To implement this step, we took the interrupt stack code as a base, using the function interrupt_stack_alloc().
  We have added two new arrays to store the pointers to the stack of each cpu:
  - cpu_stack[] stores the pointer to the stack
  - _cpu_stack_top[] stores the address of the stack top
  All stacks share a single memory reservation: we reserve a single memory block, which is split among the cpu stacks. To reserve the memory, we call init_alloc_aligned(), which reserves memory from the BIOS area. This function returns the initial address of the memory block, which is stored in stack_start.
  All stacks have the same size, which is stored in the STACK_SIZE macro.
  Once the memory is reserved, we assign a slice to each cpu using stack_start as the base address. In each step, we assign stack_start to cpu_stack[cpu] and stack_start+STACK_SIZE to _cpu_stack_top[cpu], and increase stack_start by STACK_SIZE.
  To ease loading the stack on each cpu, we have added a single stack pointer, called stack_ptr. Before enabling each cpu, this pointer is updated to the cpu_stack of the current cpu. This pointer is used in the cpuboot.S assembly routine to load the stack on the current cpu.
- Complete intel_startCPU()
  The purpose of the intel_startCPU() function is to enable the cpu given as a parameter, calling startup_cpu() to raise the Startup IPI, and to check whether the cpu has been enabled correctly.
  To write this function, we based it on XNU's intel_startCPU() function, replacing its calls with the gnumach equivalents, and removing unneeded code blocks.
- Raise Startup IPI and initialize cpu
  gnumach doesn't include any function to raise the Startup IPI, so we have implemented these functions based on the Min_SMP cpu.c and mp.c functions:
  - startup_cpu(): This function is called by intel_startCPU() to start the Startup IPI sequence on the cpu.
  - send_ipi(): Function to write the IPI fields in the ICR register of the current cpu
  - cpu_ap_main(): The first function executed by the new cpu after startup. Calls cpu_setup() and checks for errors.
  - cpu_setup(): Initializes the machine_slot fields of the cpu
  These functions have been added to i386/i386/mp_desc.c
- Implement assembly routine to switch the cpu to protected mode
  After the Startup IPI is raised, the cpu starts in real mode, so we need to add a routine to switch the cpu to protected mode. Because real mode is 16-bit, we can't use C code (32-bit), so this routine must be written in assembly.
  This routine loads the GDT and IDT registers in the cpu, and calls cpu_ap_main() to initialize the machine_slot of the cpu.
  To write the routine, we took the Min_SMP boot.S as a base, with a few modifications:
  - The GDT descriptors are replaced with the gnumach GDT descriptors (boot_gdt: and boot_gdt_descr:), taken from boothdr.S. We also copied the register initialization after GDT loading
  - The _start routine is unnecessary and has been removed
  - The physical addresses have been replaced with their equivalent logical addresses, using the same shift used in boothdr.S
  - We have removed the hlt instruction after call cpu_ap_main
  The final code is stored in i386/i386/cpuboot.S
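The cpu stack reservation described in the steps above can be sketched as follows. The names cpu_stack[], STACK_SIZE and stack_start come from the text; carve_cpu_stacks() is a hypothetical helper, cpu_stack_top[] stands for the _cpu_stack_top[] array, the values of NCPUS and STACK_SIZE are illustrative, and the block is passed in as an argument instead of being obtained from init_alloc_aligned().

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS       4       /* illustrative value for this sketch */
#define STACK_SIZE  4096u   /* illustrative value: one page per cpu stack */

static uintptr_t cpu_stack[NCPUS];      /* lowest address of each stack */
static uintptr_t cpu_stack_top[NCPUS];  /* address just past each stack */

/* Splits one contiguous block into ncpu equally sized stacks. */
static void carve_cpu_stacks(uintptr_t block, int ncpu)
{
    uintptr_t stack_start = block;

    assert(ncpu >= 0 && ncpu <= NCPUS);
    for (int cpu = 0; cpu < ncpu; cpu++) {
        cpu_stack[cpu] = stack_start;
        cpu_stack_top[cpu] = stack_start + STACK_SIZE;
        stack_start += STACK_SIZE;      /* advance to the next slice */
    }
}
```

Because x86 stacks grow downwards, the top address of each slice is the natural initial stack pointer for that cpu, and the top of one slice coincides with the base of the next.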
To allow the cpus to execute interrupt handlers, an interrupt stack is needed. Each cpu has its own interrupt stack.
To get this, we've added a call to interrupt_stack_alloc() to initialize the cpus' interrupt stack array before the call to mp_desc_init().
This step doesn't show any new effect yet.
Before adding the cpus to the kernel, we need to configure paging in them, to allow full access to memory.
To enable paging, we need to initialize the CR0, CR3 and CR4 registers, in a similar way to this.
This code has been copied into the paging_setup() function, in mp_desc.c. The processor, at startup, isn't able to read the content of a pointer, so we copied the memory addresses of kernel_page_dir and pdpbase into two temporary integer variables: kernel_page_dir_addr and pdpbase_addr.
The paging initialization also requires a temporary mapping at some low memory address. We keep the temporary mapping done in the BSP processor until all APs are enabled.
Once paging is enabled, each cpu can read its own Local APIC, using the *lapic pointer. It also allows executing the cpu_number() function, which is necessary to execute the slave_main() function to add the cpu to the kernel.
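The register values involved in enabling paging can be sketched as pure computations. The bit positions are architectural; the struct, the function paging_values() and the rule that pdpbase_addr is used with PAE while kernel_page_dir_addr is used without it are assumptions suggested by the variable names, not a copy of paging_setup().

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE   (1u << 0)    /* protected mode enable */
#define CR0_PG   (1u << 31)   /* paging enable */
#define CR4_PAE  (1u << 5)    /* physical address extension */

struct paging_regs { uint32_t cr0, cr3, cr4; };

/* Computes the values to load, given the current CR0 and CR4 contents.
 * CR3 and CR4 must be loaded before the PG bit is set in CR0. */
static struct paging_regs paging_values(uint32_t cr0, uint32_t cr4,
                                        uint32_t kernel_page_dir_addr,
                                        uint32_t pdpbase_addr, int use_pae)
{
    struct paging_regs r;

    /* Assumption: PAE uses the page directory pointer table as root. */
    r.cr3 = use_pae ? pdpbase_addr : kernel_page_dir_addr;
    /* A PAE root is 32-byte aligned, a plain page directory 4 KiB aligned. */
    assert((r.cr3 & (use_pae ? 0x1Fu : 0xFFFu)) == 0);
    r.cr4 = use_pae ? (cr4 | CR4_PAE) : cr4;
    r.cr0 = cr0 | CR0_PE | CR0_PG;
    return r;
}
```

Passing the table addresses as integers mirrors the use of kernel_page_dir_addr and pdpbase_addr described above; actually loading the registers requires privileged instructions and is left out.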
Before calling slave_main(), we need to load the final GDT and IDT, to get the same values as the BSP processor, and to be able to load the LDT entries correctly.
To do this, we call gdt_init() and idt_init() in cpu_setup(), just before the call to slave_main().
Once the final GDT and IDT are loaded, slave_main() finishes successfully, and the AP processors are added to the kernel.
- Bosco García: Development guidance, original MinSMP developer, explanations and documentation about multiprocessor architecture, help with gnumach development.
- Guillermo Bernaldo de Quiros Maraver: Help with the original development, first full compilation, and finding the original SMP problem (SMP without APIC support)
- Samuel Thibault: Hurd core dev. Clarifications about SMP status in gnumach, help with gnumach questions.
- Rodrigo V. G.: Help with debugging and memory addressing
- Damien Zammit: Help with IOAPIC, I/O management and memory mapping
- Comments about the project in the bug-hurd mailing list
- Initial thread in the bug-hurd mailing list
- SMP in GNU/Hurd FAQ
- GNU Mach git repository
- Old Mach 4 source code
- GNUMach_SMP repository
- GNU Mach reference manual
- FOSDEM speaking: roadmap for the Hurd
- GENIAL ARTICLE about GNU/Hurd architecture
- MultiProcessor Specification
- ACPI Specification
- Mach boot trace
- Book: The Mach System
- X15 operating system
- Symmetric Multiprocessing - OSDev Wiki
- Intel Developer Guide, Volume 3: System Programming Guide