document: update HLD for hypervisor startup

updated this chapter based on latest master

Tracked-On: #3882
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Jason Chen CJ 2019-10-21 22:15:27 +08:00 committed by deb-intel
parent cfcdd8ad09
commit c43e70b544
3 changed files with 107 additions and 58 deletions


@@ -6,7 +6,7 @@ Hypervisor Startup
 This section is an overview of the ACRN hypervisor startup.
 The ACRN hypervisor
 compiles to a 32-bit multiboot-compliant ELF file.
-The bootloader (ABL or SBL) loads the hypervisor according to the
+The bootloader (ABL/SBL or UEFI) loads the hypervisor according to the
 addresses specified in the ELF header. The BSP starts the hypervisor
 with an initial state compliant to multiboot 1 specification, after the
 bootloader prepares full configurations including ACPI, E820, etc.
@@ -14,6 +14,15 @@ bootloader prepares full configurations including ACPI, E820, etc.
 The HV startup has two parts: the native startup followed by
 VM startup.
+Multiboot Header
+****************
+The ACRN hypervisor is built with a multiboot header, which presents
+``MULTIBOOT_HEADER_MAGIC`` and ``MULTIBOOT_HEADER_FLAGS`` at the beginning
+of the image. It sets bit 1 in ``MULTIBOOT_HEADER_FLAGS``, which requests that the
+bootloader pass memory map information (such as e820 entries) through the
+Multiboot Information (MBI) structure; bit 6 of the MBI ``flags`` marks the map as valid.
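As an illustration of the header described above, the following C sketch builds the first three header fields and checks the MBI ``flags`` word. The magic value and bit positions are taken from the Multiboot 1 specification (header flags bit 1 requests memory information; MBI flags bit 6 marks the ``mmap_*`` fields as valid). The struct and function names are hypothetical and are not ACRN identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values defined by the Multiboot 1 specification. */
#define MULTIBOOT_HEADER_MAGIC        0x1BADB002U
#define MULTIBOOT_HEADER_FLAG_MEMINFO (1U << 1) /* header flags bit 1: request memory info */
#define MULTIBOOT_INFO_HAS_MMAP       (1U << 6) /* MBI flags bit 6: mmap_* fields are valid */

/* First three 32-bit fields of a multiboot header (hypothetical struct name). */
struct mb_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
};

/* Build a header whose three fields sum to zero modulo 2^32, as the spec requires. */
static struct mb_header make_header(uint32_t flags)
{
    struct mb_header h = { MULTIBOOT_HEADER_MAGIC, flags, 0U };

    h.checksum = 0U - (h.magic + h.flags);
    return h;
}

/* A bootloader accepts the header only if the magic and checksum match. */
static bool header_is_valid(const struct mb_header *h)
{
    return (h->magic == MULTIBOOT_HEADER_MAGIC) &&
           ((uint32_t)(h->magic + h->flags + h->checksum) == 0U);
}

/* The kernel side: check whether the bootloader supplied a memory map. */
static bool mbi_has_mmap(uint32_t mbi_flags)
{
    return (mbi_flags & MULTIBOOT_INFO_HAS_MMAP) != 0U;
}
```

A loader-side check would call ``header_is_valid()`` on the header found in the image, and the hypervisor side would call ``mbi_has_mmap()`` on the ``flags`` field of the MBI before walking the memory map.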
 Native Startup
 **************
@@ -36,11 +45,13 @@ description for the flow:
 - **UART Init:** Initialize a pre-configured UART device used
 as the base physical console for HV and Service OS.
-- **Shell Init:** Start a command shell for HV accessible via the UART.
 - **Memory Init:** Initialize memory type and cache policy, and creates
 MMU page table mapping for HV.
+- **Scheduler Init:** Initialize the scheduler framework, which provides the
+capability to switch between different threads (such as a vCPU thread and the idle thread) on a
+physical CPU, and to support CPU sharing.
 - **Interrupt Init:** Initialize interrupt and exception for native HV
 including IDT and ``do_IRQ`` infrastructure; a timer interrupt
 framework is then built. The native/physical interrupts will go
@@ -52,6 +63,8 @@ description for the flow:
 own memory and interrupts, notifies the BSP on completion and
 enter the default idle loop.
+- **Shell Init:** Start a command shell for HV accessible via the UART.
 Symbols in the hypervisor are placed with an assumed base address, but
 the bootloader may not place the hypervisor at that specified base. In
 such case the hypervisor will relocate itself to where the bootloader
@@ -98,40 +111,63 @@ Memory
 Refer to :ref:`physical-interrupt-initialization` for a detailed description of interrupt-related
 initial states, including IDT and physical PICs.
-After BSP detects that all APs are up, BSP will start creating the first
-VM, i.e. SOS, as explained in the next section.
+After the BSP detects that all APs are up, it will continue to enter guest mode; similarly, after an AP
+completes its initialization, it will start entering guest mode as well.
+When the BSP and APs enter guest mode, they will try to launch the pre-defined VMs whose vBSP is associated with
+this physical core; these pre-defined VMs are statically configured in ``vm config`` and can be a
+pre-launched Safety VM or the Service VM; the VM startup is explained in the next section.
 .. _vm-startup:
 VM Startup
 **********
-SOS is created and launched on the physical BSP after the hypervisor
-initializes itself. Meanwhile, the APs enter the default idle loop
+The Service VM or a pre-launched VM is created and launched on the physical
+CPU that is configured as its vBSP. Meanwhile, the physical CPUs that are
+configured as vAPs for dedicated VMs enter the default idle loop
 (refer to :ref:`VCPU_lifecycle` for details), waiting for any vCPU to be
 scheduled to them.
 :numref:`hvstart-vmflow` illustrates a high-level execution flow of
-creating and launching a VM, applicable to both SOS and UOS. One major
-difference in the creation of SOS and UOS is that SOS is created by the
-hypervisor, while the creation of UOSes is triggered by the DM in SOS.
+creating and launching a VM, applicable to pre-launched VMs, the Service VM,
+and User VMs. One major difference is that a pre-launched VM or the
+Service VM is created by the hypervisor,
+while the creation of User VMs is triggered by the DM in Service OS.
 The main steps include:
 - **Create VM**: A VM structure is allocated and initialized. A unique
-VM ID is picked, EPT is created, I/O bitmap is set up, I/O
-emulation handlers initialized and registered and virtual CPUID
-entries filled. For SOS an addition e820 table is prepared.
+VM ID is picked, EPT is initialized, e820 table for this VM is prepared,
+I/O bitmap is set up, virtual PIC/IOAPIC/PCI/UART is initialized, EPC for
+virtual SGX is prepared, guest PM IO is set up, IOMMU for PT dev support
+is enabled, virtual CPUID entries are filled, and vCPUs configured in this VM's
+``vm config`` are prepared. For a post-launched User VM, the EPT page table and
+e820 table are actually prepared by the DM instead of the hypervisor.
-- **Create vCPUs:** Create the vCPUs, assign the physical processor it
+- **Prepare vCPUs:** Create the vCPUs, assign the physical processor it
 is pinned to, a unique-per-VM vCPU ID and a globally unique VPID,
-and initializes its virtual lapic and MTRR. For SOS one vCPU is
-created for each physical CPU on the platform. For UOS the DM
-determines the number of vCPUs to be created.
+and initializes its virtual lapic and MTRR, and its vCPU thread object is set up
+for vCPU scheduling. The vCPU number and affinity are defined in the corresponding
+``vm config`` for this VM.
-- **SW Load:** The BSP of a VM also prepares for each VM's SW
-configuration including kernel entry address, ramdisk address,
-bootargs, zero page etc. This is done by the hypervisor for SOS
-while by DM for UOS.
+- **Build vACPI:** For the Service VM, the hypervisor will customize a virtual ACPI
+table based on the native ACPI table (this is in the TODO).
+For a pre-launched VM, the hypervisor will build a simple ACPI table with necessary
+information like MADT.
+For a post-launched User VM, the DM will build its ACPI table dynamically.
+- **SW Load:** Prepares each VM's SW configuration according to the guest OS
+requirements, which may include the kernel entry address, ramdisk address,
+bootargs, or zero page for launching bzImage, etc.
+This is done by the hypervisor for a pre-launched or Service VM, and by the DM
+for post-launched User VMs.
+Meanwhile, there are two kinds of boot mode: de-privilege and direct boot
+mode. The de-privilege boot mode is combined with the ACRN UEFI-stub and only
+applies to the Service VM; it ensures the native UEFI environment can be restored
+and keep running in the Service VM. The direct boot mode applies to both
+pre-launched and Service VMs; in this mode, the VM starts from standard
+real or protected mode, which is not related to the native environment.
+- **Start VM:** The vBSP of the vCPUs in this VM is kicked to be scheduled.
 - **Schedule vCPUs:** The vCPUs are scheduled to the corresponding
 physical processors for execution.
@@ -140,10 +176,10 @@ The main steps include:
 state, execution control, entry control and exit control. It's
 the last configuration before vCPU runs.
-- **vCPU thread:** vCPU kicks out to run. For "Primary CPU" it will
+- **vCPU thread:** vCPU kicks out to run. For vBSP of vCPUs, it will
 start running into kernel image which SW Load is configured; for
-"Non-Primary CPU" it will wait for INIT-SIPI-SIPI IPI sequence
-trigger from its "Primary CPU".
+any vAP of vCPUs, it will wait for INIT-SIPI-SIPI IPI sequence
+trigger from its vBSP.
 .. figure:: images/hld-image104.png
 :align: center
@@ -151,57 +187,70 @@ The main steps include:
 Hypervisor VM Startup Flow
-SW configuration for Service OS (SOS_VM):
+SW configuration for Service VM (bzImage SW load as example):
 - **ACPI**: HV passes the entire ACPI table from bootloader to Service
-OS directly. Legacy mode is currently supported as the ACPI table
+VM directly. Legacy mode is currently supported as the ACPI table
 is loaded at F-Segment.
-- **E820**: HV passes e820 table from bootloader through multi-boot
-information after the HV reserved memory (32M for example) is
-filtered out.
+- **E820**: HV passes e820 table from bootloader through zero-page
+after the HV reserved memory (32M for example) and pre-launched VM owned
+memory are filtered out.
 - **Zero Page**: HV prepares the zero page at the high end of Service
-OS memory which is determined by SOS_VM guest FIT binary build. The
+VM memory which is determined by SOS_VM guest FIT binary build. The
 zero page includes configuration for ramdisk, bootargs and e820
-entries. The zero page address will be set to "Primary CPU" RSI
-register before VCPU gets run.
+entries. The zero page address will be set to vBSP RSI register
+before VCPU gets run.
-- **Entry address**: HV will copy Service OS kernel image to 0x1000000
-as entry address for SOS_VM's "Primary CPU". This entry address will
-be set to "Primary CPU" RIP register before VCPU gets run.
+- **Entry address**: HV will copy Service OS kernel image to
+kernel_load_addr, which can be obtained from the "pref_addr" field in the bzImage
+header; the entry address will be calculated based on kernel_load_addr,
+and will be set to vBSP RIP register before VCPU gets run.
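The E820 filtering mentioned in the Service VM list above can be pictured as carving a reserved range out of a RAM entry. This is a minimal sketch, assuming half-open ``[start, end)`` ranges; ``struct range`` and ``carve_range`` are hypothetical names for illustration only, not ACRN code, and the real hypervisor logic may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end); hypothetical type. */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Remove 'hole' (for example, hypervisor-reserved memory) from 'ram'.
 * Writes the surviving pieces to 'out' and returns how many there are:
 * 0 if the hole covers the whole entry, 1 if it trims one side or does
 * not overlap at all, 2 if it splits the entry in the middle.
 */
static size_t carve_range(struct range ram, struct range hole, struct range out[2])
{
    size_t n = 0U;

    /* No overlap: the RAM entry survives unchanged. */
    if ((hole.end <= ram.start) || (hole.start >= ram.end)) {
        out[n++] = ram;
        return n;
    }

    /* Part of the entry below the hole survives. */
    if (hole.start > ram.start) {
        out[n].start = ram.start;
        out[n].end = hole.start;
        n++;
    }

    /* Part of the entry above the hole survives. */
    if (hole.end < ram.end) {
        out[n].start = hole.end;
        out[n].end = ram.end;
        n++;
    }

    return n;
}
```

For example, carving a 32M range at the top of a RAM entry that ends at 2G leaves a single entry ending at ``0x7E000000``.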
-SW configuration for User OS (VMx):
+SW configuration for post-launched User VMs (OVMF SW load as example):
-- **ACPI**: the virtual ACPI table is built by DM and put at VMx's
+- **ACPI**: the virtual ACPI table is built by DM and put at User VM's
 F-Segment. Refer to :ref:`hld-io-emulation` for details.
 - **E820**: the virtual E820 table is built by the DM then passed to
-the zero page. Refer to :ref:`hld-io-emulation` for details.
+the virtual bootloader. Refer to :ref:`hld-io-emulation` for details.
-- **Zero Page**: the DM prepares the zero page at location of
-"lowmem_top - 4K" in VMx. This location is set into VMx's
-"Primary CPU" RSI register in **SW Load**.
+- **Entry address**: the DM will copy the User OS kernel (OVMF) image to
+OVMF_NVSTORAGE_OFFSET, normally at (4G - 2M), and set the entry
+address to 0xFFFFFFF0. Because the vBSP starts running the virtual bootloader
+(OVMF) from real mode, its CS base will be set to 0xFFFF0000, and
+the RIP register will be set to 0xFFF0.
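The arithmetic behind the OVMF entry address can be checked with a short C sketch. It assumes the address the vBSP fetches from is the CS base plus the instruction pointer, and that "(4G - 2M)" means an address 2M below the 4G boundary. The function and macro names are hypothetical and do not come from the DM or hypervisor sources.

```c
#include <stdint.h>

#define SIZE_4G 0x100000000ULL
#define SIZE_2M 0x200000ULL

/* Guest-physical address that lies 'size' bytes below the 4G boundary. */
static uint64_t addr_below_4g(uint64_t size)
{
    return SIZE_4G - size;
}

/* Address the vBSP fetches its first instruction from: CS base plus IP. */
static uint32_t first_fetch_addr(uint32_t cs_base, uint16_t ip)
{
    return cs_base + (uint32_t)ip;
}
```

With the values quoted above, ``first_fetch_addr(0xFFFF0000, 0xFFF0)`` yields ``0xFFFFFFF0``, the entry address, and ``addr_below_4g(SIZE_2M)`` yields ``0xFFE00000``.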
-- **Entry address**: the DM will copy User OS kernel image to 0x1000000
-as entry address for VMx's "Primary CPU". This entry address will
-be set to "Primary CPU" RIP register before VCPU gets run.
+SW configuration for pre-launched VMs (raw SW load as example):
+- **ACPI**: the virtual ACPI table is built by the hypervisor and put at
+this VM's F-Segment.
+- **E820**: the virtual E820 table is built by the hypervisor then passed to
+the VM according to different SW loaders. For raw SW load here, it's not
+used.
+- **Entry address**: the hypervisor will copy User OS kernel image to
+kernel_load_addr, which is set by ``vm config``, and set the entry
+address to kernel_entry_addr, which is set by ``vm config`` as well.
 Here is initial mode of vCPUs:
-+------------------------------+-------------------------------+
-| VM and Processor Type        | Initial Mode                  |
-+=============+================+===============================+
-| SOS         | BSP            | Same as physical BSP          |
-|             +----------------+-------------------------------+
-|             | AP             | Real Mode                     |
-+-------------+----------------+-------------------------------+
-| UOS         | BSP            | Real Mode                     |
-|             +----------------+-------------------------------+
-|             | AP             | Real Mode                     |
-+-------------+----------------+-------------------------------+
++----------------------------------+----------------------------------------------------------+
+| VM and Processor Type            | Initial Mode                                             |
++=================+================+==========================================================+
+| Service VM      | BSP            | Same as physical BSP, or Real Mode if SOS boots w/ OVMF  |
+|                 +----------------+----------------------------------------------------------+
+|                 | AP             | Real Mode                                                |
++-----------------+----------------+----------------------------------------------------------+
+| User VM         | BSP            | Real Mode                                                |
+|                 +----------------+----------------------------------------------------------+
+|                 | AP             | Real Mode                                                |
++-----------------+----------------+----------------------------------------------------------+
+| Pre-launched VM | BSP            | Real Mode or Protected Mode                              |
+|                 +----------------+----------------------------------------------------------+
+|                 | AP             | Real Mode                                                |
++-----------------+----------------+----------------------------------------------------------+
-Note that SOS is started with the same number of vCPUs as the physical
-CPUs to boost the boot-up. SOS will offline the APs right before it
-starts any UOS.

Binary file not shown.

Binary file not shown.
