diff --git a/doc/developer-guides/hld/hld-hypervisor.rst b/doc/developer-guides/hld/hld-hypervisor.rst index 195a7bbff..39d4682ca 100644 --- a/doc/developer-guides/hld/hld-hypervisor.rst +++ b/doc/developer-guides/hld/hld-hypervisor.rst @@ -11,4 +11,4 @@ Hypervisor high-level design hv-cpu-virt Memory management I/O Emulation - Interrupt management + Physical Interrupt diff --git a/doc/developer-guides/hld/images/interrupt-image39.png b/doc/developer-guides/hld/images/interrupt-image39.png new file mode 100644 index 000000000..513ccf895 Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image39.png differ diff --git a/doc/developer-guides/hld/images/interrupt-image4.png b/doc/developer-guides/hld/images/interrupt-image4.png deleted file mode 100644 index ccd337b6e..000000000 Binary files a/doc/developer-guides/hld/images/interrupt-image4.png and /dev/null differ diff --git a/doc/developer-guides/hld/images/interrupt-image46.png b/doc/developer-guides/hld/images/interrupt-image46.png new file mode 100644 index 000000000..d67193063 Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image46.png differ diff --git a/doc/developer-guides/hld/images/interrupt-image48.png b/doc/developer-guides/hld/images/interrupt-image48.png new file mode 100644 index 000000000..87b1ecab2 Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image48.png differ diff --git a/doc/developer-guides/hld/images/interrupt-image5.png b/doc/developer-guides/hld/images/interrupt-image5.png deleted file mode 100644 index 24338deb8..000000000 Binary files a/doc/developer-guides/hld/images/interrupt-image5.png and /dev/null differ diff --git a/doc/developer-guides/hld/images/interrupt-image6.png b/doc/developer-guides/hld/images/interrupt-image6.png deleted file mode 100644 index edb06fd1d..000000000 Binary files a/doc/developer-guides/hld/images/interrupt-image6.png and /dev/null differ diff --git a/doc/developer-guides/hld/images/interrupt-image66.png b/doc/developer-guides/hld/images/interrupt-image66.png new file mode 100644 index 000000000..5f6e9ef89 Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image66.png differ diff --git a/doc/developer-guides/hld/images/interrupt-image7.png b/doc/developer-guides/hld/images/interrupt-image7.png deleted file mode 100644 index c73da36fb..000000000 Binary files a/doc/developer-guides/hld/images/interrupt-image7.png and /dev/null differ diff --git a/doc/developer-guides/hld/images/interrupt-image76.png b/doc/developer-guides/hld/images/interrupt-image76.png new file mode 100644 index 000000000..3a971245c Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image76.png differ diff --git a/doc/developer-guides/hld/images/interrupt-image89.png b/doc/developer-guides/hld/images/interrupt-image89.png new file mode 100644 index 000000000..f380024a0 Binary files /dev/null and b/doc/developer-guides/hld/images/interrupt-image89.png differ diff --git a/doc/developer-guides/hld/interrupt-hld.rst b/doc/developer-guides/hld/interrupt-hld.rst index c48be271d..995912204 100644 --- a/doc/developer-guides/hld/interrupt-hld.rst +++ b/doc/developer-guides/hld/interrupt-hld.rst @@ -1,15 +1,11 @@ .. _interrupt-hld: -Interrupt Management high-level design -###################################### - +Physical Interrupt high-level design +#################################### Overview ******** -This document describes the interrupt management high-level design for -the ACRN hypervisor. - The ACRN hypervisor implements a simple but fully functional framework to manage interrupts and exceptions, as show in :numref:`interrupt-modules-overview`. In its native layer, it configures @@ -42,445 +38,451 @@ necessary virtual interrupt into the specific VM ACRN Interrupt SW Modules Overview -Hypervisor Physical Interrupt Management -**************************************** -The ACRN hypervisor is responsible for all the physical interrupt -handling. All physical interrupts are first handled in VMX root-mode. -The "external-interrupt exiting" bit in the VM-Execution controls field -is set to support this. The ACRN hypervisor also initializes all the -interrupt related modules such as IDT, PIC, IOAPIC, and LAPIC. +The hypervisor implements the following functionalities for handling +physical interrupts: -Only a few physical interrupts (such as TSC-Deadline timer and IOMMU) -are fully serviced in the hypervisor. Most interrupts come from pass-thru -devices whose interrupt are remapped to a virtual INTx/MSI source and -injected to the SOS or UOS, according to the pass-thru device -configuration. +- Configure interrupt-related hardware including IDT, PIC, LAPIC, and + IOAPIC on startup. -The ACRN hypervisor does handle exceptions and any exception coming from -the VMX root-mode will lead to the CPU halting. For guest exception, the -hypervisor only traps #MC (machine check), prints a warning message, and -injects the exception back into the guest OS. +- Provide APIs to manipulate the registers of LAPIC and IOAPIC. + +- Acknowledge physical interrupts. + +- Set up a callback mechanism for the other components in the + hypervisor to request for an interrupt vector and register a + handler for that interrupt. + +HV owns all native physical interrupts and manages 256 vectors per CPU. +All physical interrupts are first handled in VMX root-mode. The +"external-interrupt exiting" bit in VM-Execution controls field is set +to support this. The ACRN hypervisor also initializes all the interrupt +related modules like IDT, PIC, IOAPIC, and LAPIC. + +HV does not own any host devices (except UART). All devices are by +default assigned to SOS. Any interrupts received by Guest VM (SOS or +UOS) device drivers are virtual interrupts injected by HV (via vLAPIC). +HV manages a Host-to-Guest mapping. When a native IRQ/interrupt occurs, +HV decides whether this IRQ/interrupt should be forwarded to a VM and +which VM to forward to (if any). Refer to section 3.7.6 for virtual +interrupt injection and section 3.9.6 for the management of interrupt +remapping. + +HV does not own any exceptions. Guest VMCS are configured so no VM Exit +happens, with some exceptions such as #INT3 and #MC. This is to +simplify the design as HV does not support any exception handling +itself. HV supports only static memory mapping, so there should be no +#PF or #GP. If HV receives an exception indicating an error, an assert +function is then executed with an error message print out, and the +system then halts. + +Native interrupts could be generated from one of the following +sources: + +- GSI interrupts + + - PIC or Legacy devices IRQ (0~15) + - IOAPIC pin + +- PCI MSI/MSI-X vectors +- Inter CPU IPI +- LAPIC timer Physical Interrupt Initialization -================================= +********************************* -After the ACRN hypervisor get control from the bootloader, it -initializes all physical interrupt-related modules for all the CPUs. The -ACRN hypervisor creates a framework to manage the physical interrupt for -hypervisor-local devices, pass-thru devices, and IPI between CPUs. +After ACRN hypervisor gets control from the bootloader, it +initializes all physical interrupt-related modules for all the CPUs. ACRN +hypervisor creates a framework to manage the physical interrupt for +hypervisor local devices, pass-thru devices, and IPI between CPUs, as +shown in :numref:`hv-interrupt-init`: -IDT ---- - -The ACRN hypervisor builds its native Interrupt Descriptor Table (IDT) during -interrupt initialization. For exceptions, it links to function -``dispatch_exception``, and for external interrupts it links to function -``dispatch_interrupt``. Please refer to ``arch/x86/idt.S`` for more details. - -LAPIC ------ - -The ACRN hypervisor resets LAPIC for each CPU, and provides basic APIs -used, for example, by the local timer (TSC Deadline) -program and IPI notification program. These APIs include -write_laipic_reg32, send_lapic_eoi, send_startup_ipi, and -send_single_ipi. - - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/arch/x86/lapic.h - -PIC/IOAPIC ----------- - -The ACRN hypervisor masks all interrupts from PIC, so all the -legacy interrupts from PIC (<16) are linked to IOAPIC, as shown in -:numref:`interrupt-pic-pin`. - -ACRN will pre-allocate vectors and mask them for these legacy interrupts -in IOAPIC RTE. For others (>= 16) ACRN will mask them with vector 0 in -RTE, and the vector will be dynamically allocated on demand. - -.. figure:: images/interrupt-image5.png +.. figure:: images/interrupt-image66.png :align: center - :width: 600px - :name: interrupt-pic-pin + :name: hv-interrupt-init - PIC & IOAPIC Pin Connection + Physical Interrupt Initialization -Irq Desc --------- +IDT Initialization +================== -The ACRN hypervisor maintains a global ``irq_desc[]`` array shared among the -CPUs and uses a flat mode to manage the interrupts. The same -vector is linked to the same IRQ number for all CPUs. +ACRN hypervisor builds its native IDT (interrupt descriptor table) +during interrupt initialization and set up the following handlers: -.. comment +- On an exception, the hypervisor dumps its context and halts the current + physical processor (because physical exceptions are not expected). - Need reference to API doc generated from doxygen comments - for ``struct irq_desc`` in hypervisor/include/common/irq.h +- For external interrupts, HV may mask the interrupt (depending on the + trigger mode), followed by interrupt acknowledgement and dispatch + to the registered handler, if any. +Most interrupts and exceptions are handled without a stack switch, +except for machine-check, double fault, and stack fault exceptions which +have their own stack set in TSS. -The ``irq_desc[]`` array is indexed by the IRQ number. An -``irq_handler`` field can be set to a common edge, level, or quick -handler called from ``interrupt_dispatch``. The ``irq_desc`` structure -also contains the ``dev_list`` field to maintain this IRQ's action -handler list. - -The global array ``vector_to_irq[]`` is used to manage the vector -resource. This array is initialized with value ``IRQ_INVALID`` for all -vectors, and will be set to a valid IRQ number after the corresponding -vector is registered. - -For example, if the local timer registers interrupt with IRQ number 271 and -vector 0xEF, then the arrays mentioned above will be set to:: - - irq_desc[271].irq = 271; - irq_desc[271].vector = 0xEF; - vector_to_irq[0xEF] = 271; - -Physical Interrupt Flow -======================= - - -When an physical interrupt occurs, and the CPU is running under VMX root -mode, the interrupt is triggered from the standard native irq flow: -interrupt gate to irq handler. However, if the CPU is running under VMX -non-root mode, an external interrupt will trigger a VM exit for reason -"external-interrupt". See :numref:`interrupt-handle-flow`. - -.. figure:: images/interrupt-image4.png - :align: center - :width: 800px - :name: interrupt-handle-flow - - ACRN Hypervisor Interrupt Handle Flow - -After an interrupt happens (in either case noted above), the ACRN -hypervisor jumps to ``dispatch_interrupt``. This function will check -which vector caused this interrupt, and the corresponding ``irq_desc`` -structure's ``irq_handler`` will be called for the service. - -There are several irq_handler's defined in the ACRN hypervisor, as shown -in :numref:`interrupt-handle-flow`, designed for different uses. For -example, ``quick_handler_nolock`` is used when no critical data needs -protection in the action handlers; the VCPU notification IPI and local -timer are good example of this use case. - -The more complicated ``common_dev_handler_level`` handler is intended -for pass-thru devices with level triggered interrupts. To avoid -continuously triggering the interrupt, it initially masks IOAPIC pin and -unmasks it only when the corresponding vIOAPIC pin gets an explicit EOI -ACK from the guest. - -All the irq handler's finally call their own action handler list, as -shown here: - -.. code-block: c - - struct dev_handler_node \*dev = desc->dev_list; - while (dev != NULL) { - if (dev->dev_handler != NULL) - dev->dev_handler(desc->irq, dev->dev_data); - dev = dev->next; - } - -The common APIs for registering, updating, and unregistering -interrupt handlers include irq_to_vector, dev_to_irq, dev_to_vector, -pri_register_handler, normal_register_handler, -unregister_handler_common, and update_irq_handler. - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/irq.h - -.. _physical_interrupt_source: - -Physical Interrupt Source +PIC/IOAPIC Initialization ========================= -The ACRN hypervisor handles interrupts from many different sources, as -shown in :numref:`interrupt-source`: +ACRN hypervisor masks all interrupts from the PIC. All legacy interrupts +from PIC (<16) will be linked to IOAPIC, as shown in the connections in +:numref:`hv-pic-config`. +ACRN will pre-allocate vectors and mask them for these legacy interrupt +in IOAPIC RTE. For others (>= 16), ACRN will mask them with vector 0 in +RTE, and the vector will be dynamically allocate on demand. -.. list-table:: Physical Interrupt Source - :widths: 15 10 60 +All external IOAPIC pins are categorized as GSI interrupt according to +ACPI definition. HV supports multiple IOAPIC components. IRQ PIN to GSI +mappings are maintained internally to determine GSI source IOAPIC. +Native PIC is not used in the system. + +.. figure:: images/interrupt-image46.png + :align: center + :name: hv-pic-config + + HV PIC/IOAPIC/LAPIC configuration + +LAPIC Initialization +==================== + +Physical LAPICs are in xAPIC mode in ACRN hypervisor. The hypervisor +initializes LAPIC for each physical CPU by masking all interrupts in the +local vector table (LVT), clearing all ISRs, and enabling LAPIC. + +APIs are provided to access LAPIC for the other components in the +hypervisor, aiming for further usage of local timer (TSC Deadline) +program, IPI notification program, etc. See :ref:`hv_interrupt-data-api` +for a complete list. + +HV Interrupt Vectors and Delivery Mode +====================================== + +The interrupt vectors are assigned as shown here: + +**Vector 0-0x1F** + are exceptions that are not handled by HV. If + such an exception does occur, the system then halts. + +**Vector: 0x20-0x2F** + are allocated statically for legacy IRQ0-15. + +**Vector: 0x30-0xDF** + are dynamically allocated vectors for PCI devices + INTx or MSI/MIS-X usage. According to different interrupt delivery mode + (FLAT or PER_CPU mode), an interrupt will be assigned to a vector for + all the CPUs or a particular CPU. + +**Vector: 0xE0-0xFE** + are high priority vectors reserved by HV for + dedicated purposes. For example, 0xEF is used for timer, 0xF0 is used + for IPI. + +.. list-table:: + :widths: 30 70 :header-rows: 1 - :name: interrupt-source - * - Interrupt Source - - Vector - - Description - * - TSC Deadline Timer - - 0xEF - - The TSC deadline timer implements the timer framework in - the hypervisor based on the LAPIC TSC deadline. This interrupt's - target is specific to the CPU to which the LAPIC belongs. - * - CPU Startup IPI - - N/A - - The BSP needs to trigger an INIT-SIPI sequence to wake up the - APs. This interrupt's target is specified by the BSP calling - `` start_cpus()``. - * - VCPU Notify IPI - - 0xF0 - - When the hypervisor needs to kick the VCPU out of VMX non-root - mode to do requests such as virtual interrupt injection, EPT - flush, etc. This interrupt's target is specified by function - ``send_single_ipi()``. - * - IOMMU MSI - - dynamic - - IOMMU device supports an MSI interrupt. The vtd device driver in - the hypervisor will register an interrupt to handle dmar fault. - This interrupt's target is specified by vtd device driver. - * - PTdev INTx - - dynamic - - All native devices are owned by the guest (SOS or UOS), taking - advantage of the pass-thru method. Each pass-thru device connected - with IOAPIC/PIC (PTdev INTx) will register an interrupt when - its attached interrupt controller pin first gets unmasked. - This interrupt's target is defined by and RTE entry in the IOAPIC. - * - PTdev MSI - - dynamic - - All native devices are owned by the guest (SOS or UOS), taking - advantage of pass-thru method. Each pass-thru device with - enabled MSI (PTdev MSI) will register an interrupt when the SOS - does an explicit hypercall. This interrupt's target is defined - by an MSI address entry. + * - Vectors + - Usage -Softirq -======= + * - 0x0-0x13 + - Exceptions: NMI, INT3, page dault, GP, debug. -ACRN hypervisor implements a simple bottom-half softirq to execute the -interrupt handler, as showed in :numref:`interrupt-handle-flow`. -The softirq is executed when an interrupt is enabled. Several APIs for softirq -are defined including enable_softirq, disable_softirq, raise_softirq, -and exec_softirq. + * - 0x14-0x1F + - Reserved -.. comment + * - 0x20-0x2F + - Statically allocated for external IRQ (IRQ0-IRQ15) - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/softirq.h + * - 0x30-0xDF + - Dynamically allocated for IOAPIC IRQ from PCI INTx/MSI -Physical Exception Handling -=========================== + * - 0xE0-0xFE + - Static allocated for HV -As mentioned earlier, the ACRN hypervisor does not handle any -physical exceptions. The VMX root mode code path should guarantee no -exceptions are triggered while the hypervisor is running. + * - 0xEF + - Timer -Guest Virtual Interrupt Management -********************************** + * - 0xF0 + - IPI -The previous sections describe physical interrupt management in the ACRN -hypervisor. After a physical interrupt happens, a registered action -handler is executed. Usually, the action handler represents a service -for virtual interrupt injection. For example, if an interrupt is -triggered from a pass-thru device, the appropriate virtual interrupt -should be injected into its guest VM. + * - 0xFF + - SPURIOUS_APIC_VECTOR -The virtual interrupt injection could also come from an emulated device. -The I/O mediator in the Service OS (SOS) could trigger an interrupt -through a hypercall, and then do the virtual interrupt injection in the -hypervisor. +Interrupts from either IOAPIC or MSI can be delivered to a target CPU. +By default they are configured as Lowest Priority (FLAT mode), i.e. they +are delivered to a CPU core that is currently idle or executing lowest +priority ISR. There is no guarantee a device's interrupt will be +delivered to a specific Guest's CPU. Timer interrupts are an exception - +these are always delivered to the CPU which programs the LAPIC timer. -The following sections give an introduction to the ACRN guest virtual -interrupt management, including VCPU request for virtual interrupt kick -off, vPIC/vIOAPIC/vLAPIC for virtual interrupt injection interfaces, -physical-to-virtual interrupt mapping for a pass-thru device, and the -process of VMX interrupt/exception injection. +There are two interrupt delivery modes: FLAT mode and PER_CPU mode. ACRN +uses FLAT MODE where the interrupt/irq to vector mapping is the same on all CPUs. Every +CPU receives same interrupts. IOAPIC and LAPIC MSI delivery mode are +configured to Lowest Priority. -VCPU Request -============ +Vector allocation for CPUs is shown here: -As mentioned in `physical_interrupt_source`_, physical vector 0xF0 is -used to kick the VCPU out of its VMX non-root mode, and make a request -for virtual interrupt injection or other requests such as flush EPT. - -The request-make API (vcpu_make_request) and eventid supports virtual interrupt -injection. - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/irq.h - -There are requests for exception injection (ACRN_REQUEST_EXCP), vLAPIC -event (ACRN_REQUEST_EVENT), external interrupt from vPIC -(ACRN_REQUEST_EXTINT) and non-maskable-interrupt (ACRN_REQUEST_NMI). - -The ``vcpu_make_request`` is necessary for a virtual interrupt -injection. If the target VCPU is running under VMX non-root mode, it -will send an IPI to kick it out and results in an external-interrupt -VM-Exit. The flow of :numref:`interrupt-handle-flow` could be executed -to complete the injection of a virtual interrupt. - -There are some cases that do not need to send an IPI when making a -request because the CPU making the request is the target VCPU. For -example, the #GP exception request always happens on the current CPU -when an invalid emulation happens. An external interrupt for a pass-thru -device always happens on the VCPUs the device belongs to, so after it -triggers an external-interrupt VM-Exit, the current CPU is also the -target VCPU. - -Virtual PIC -=========== - -The ACRN hypervisor emulates a vPIC for each VM based on IO ranges -0x20-0x21, 0xa0-0xa1, or 0x4d0-0x4d1. - -If an interrupt source from vPIC needs to inject an interrupt, -the vpic_assert_irq, vpic_deassert_irq, or vpic_pulse_irq functions can -be called to make a request for ACRN_REQUEST_EXTINT or -ACRN_REQUEST_EVENT: - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/vpic.h - -The vpic_pending_intr and vpic_intr_accepted APIs are used to query the -vector being injected and ACK the service, by moving the interrupt from -request service (IRR) to in service (ISR). - - -Virtual IOAPIC -============== - -ACRN hypervisor emulates a vIOAPIC for each VM based on MMIO -VIOAPIC_BASE. - -If an interrupt source from vIOAPIC needs to inject an interrupt, the -vioapic_assert_irq, vioapic_dessert_irq, and vioapic_pulse_irq APIs are -used to make a request for ACRN_REQUEST_EVENT. - -As the vIOAPIC is always associated with a vLAPIC, the virtual interrupt -injection from vIOAPIC will finally trigger a request for an vLAPIC -event. - -Virtual LAPIC -============= - -The ACRN hypervisor emulates a vLAPIC for each VCPU based on MMIO -DEFAULT_APIC_BASE. - -If an interrupt source from vLAPIC needs to inject an interrupt (e.g., -from LVT such as an LAPIC timer, from vIOAPIC for a pass-thru device -interrupt, or from an emulated device for a MSI), vlapic_intr_level, -vlapic_intr_edge, vlapic_set_local_intr, vlapic_intr_msi, -vlapic_deliver_intr APIs need to be called, resulting in a request for -ACRN_REQUEST_EVENT. - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/vlapic.h - - -The vlapic_pending_intr and vlapic_intr_accepted APIs are used to query -the vector that needs to be injected and ACK -the service that move the interrupt from request service (IRR) to in -service (ISR). - -By default, the ACRN hypervisor enables vAPIC to improve the performance of -a vLAPIC emulation. - -Virtual Exception -================= - -When doing emulation, an exception may be triggered in the hypervisor, -for example, if guest accesses an invalid vMSR register, or the -hypervisor needs to inject a #GP, or during instruction emulation, an -instruction fetch may access a non-exist page from rip_gva, and a #PF -must be injected. - -ACRN hypervisor implements virtual exception injection using the -vcpu_queue_exception, vcpu_inject_gq, and vcpu_inject_pf APIs. - -.. comment - - Need reference to API doc generated from doxygen comments - in hypervisor/include/common/irq.h - -The ACRN hypervisor uses vcpu_inject_gp/vcpu_inject_pf functions to -queue exception requests, and follows `Intel Software -Developer Manual, Vol 3. `_ - 6.15, Table 6-5 -listing conditions for generating a double fault. - -.. _SDM vol3: https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html - -Interrupt Mapping for a Pass-thru Device -======================================== - -A VM can control a PCI device directly through pass-thru device -assignment. The pass-thru entry is the major info object, and it is: - -- A physical interrupt source, and could be a MSI/MSIX entry, PIC pins, or - IOAPIC pins -- Pass-thru remapping information between physical and virtual interrupt - source, for MSI/MSIX it is identified by a PCI device's BDF. For - PIC/IOAPIC it is identified by the pin number. - -.. figure:: images/interrupt-image7.png +.. figure:: images/interrupt-image89.png :align: center - :width: 600px - :name: interrupt-pass-thru - Pass-thru Device Entry Assignment + FLAT mode vector allocation -As shown in :numref:`interrupt-pass-thru` above, a UOS will assign its -pass-thru device entry by the DM, and it will fill its entry info from: +IRQ Descriptor Table +==================== -- vPIC/vIOAPIC interrupt mask/unmask -- MSI IOReq from UOS then MSI hypercall from SOS +ACRN hypervisor maintains a global IRQ Descriptor Table shared among the +physical CPUs. ACRN use FLAT MODE to manage the interrupts so the +same vector will link to same the IRQ number for all CPUs. -The SOS adds its pass-thru device entry at runtime and fills info for: +.. note:: need to reference API doc for irq_desc -- vPIC/vIOAPIC interrupt mask/unmask -- MSI hypercall from SOS -During the pass-thru device entry info filling, the hypervisor builds -native IOAPIC RTE/MSI entry based on vIOAPIC/vPIC/vMSI configuration, -and register the physical interrupt handler for it. Then with the pass-thru -device entry as the handler private data, the physical interrupt can -be linked to a virtual pin of a guest's vPIC/vIOAPIC or virtual vector of -a guest's vMSI. The handler then injects the corresponding virtual -interrupt into the guest, based on vPIC/vIOAPIC/vLAPIC APIs described -earlier. +The *irq_desc[]* array's index represents IRQ number. An *irq_handler* +field could be set to common edge/level/quick handler which will be +called from *interrupt_dispatch*. The *irq_desc* structure also +contains the *dev_list* field to maintain this IRQ's action handler +list. -Interrupt Storm Mitigation -========================== +Another reverse mapping from vector to IRQ is used in addition to the +IRQ descriptor table which maintains the mapping from IRQ to vector. -When the Device Model (DM) launches a User OS (UOS), the ACRN hypervisor -will remap the interrupt for this user OS's pass-through devices. When -an interrupt occurs for a pass-through device, the CPU core is assigned -to that User OS gets trapped into the hypervisor. The benefit of such a -mechanism is that, should an interrupt storm happen in a particular UOS, -it will have only a minimal effect on the performance of the Service OS. +On initialization, the descriptor of the legacy IRQs are initialized with +proper vectors and the corresponding reverse mapping is set up. +The descriptor of other IRQs are filled with an invalid +vector which will be updated on IRQ allocation. -Interrupt/Exception Injection Process -===================================== +For example, if local timer registers an interrupt with IRQ number 271 and +vector 0xEF, then this date will be set up: -As shown in :numref:`interrupt-handle-flow`, the ACRN hypervisor injects -virtual interrupt/exception to the guest before its VM-Entry. +.. code-block:: c -This is done by updating the VMX_ENTRY_INT_INFO_FIELD of the VCPU's -VMCS. As this field is unique, the interrupt/exception injection must -follow a priority rule to handle one-by-one. + irq_desc[271].irq = 271 + irq_desc[271].vector = 0xEF + vector_to_irq[0xEF] = 271 -:numref:`interrupt-injection` below shows the rules about how to inject -virtual interrupt/exception one-by-one. If a high priority -interrupt/exception was already injected, the next pending -interrupt/exception will enable an interrupt window where the next -injection will be done by the following VM-Exit, triggered by the -interrupt window. +External Interrupt Handling +*************************** -.. figure:: images/interrupt-image6.png +CPU runs under VMX non-root mode and inside Guest VMs. +``MSR_IA32_VMX_PINBASED_CTLS.bit[0]`` and +``MSR_IA32_VMX_EXIT_CTLS.bit[15]`` are set to allow vCPU VM Exit to HV +whenever there are interrupts to that physical CPU under +non-root mode. HV ACKs the interrupts in VMX non-root and saves the +interrupt vector to the relevant VM Exit field for HV IRQ processing. + +Note that as discussed above, an external interrupt causing vCPU VM Exit +to HV does not mean that the interrupt belongs to that Guest VM. When +CPU executes VM Exit into root-mode, interrupt handling will be enabled +and the interrupt will be delivered and processed as quickly as possible +inside HV. HV may emulate a virtual interrupt and inject to Guest if +necessary. + +When an physical interrupt happened on a CPU, this CPU could be running +under VMX root mode or non-root mode. If the CPU is running under VMX +root mode, the interrupt is triggered from standard native IRQ flow - +interrupt gate to IRQ handler. If the CPU is running under VMX non-root +mode, an external interrupt will trigger a VM exit for reason +"external-interrupt". + +Interrupt and IRQ processing flow diagrams are shown below: + +.. figure:: images/interrupt-image48.png :align: center - :width: 600px - :name: interrupt-injection + :name: phy-interrupt-processing - ACRN Hypervisor Interrupt/Exception Injection Process + Processing of physical interrupts + +.. figure:: images/interrupt-image39.png + :align: center + + IRQ processing control flow + +When a physical interrupt is raised and delivered to a physical CPU, the +CPU may be running under either VMX root mode or non-root mode. + +- If the CPU is running under VMX root mode, the interrupt is handled + following the standard native IRQ flow: interrupt gate to + dispatch_interrupt(), IRQ handler, and finally the registered callback. +- If the CPU is running under VMX non-root mode, an external interrupt + calls a VM exit for reason "external-interrupt", and then the VM + exit processing flow will call dispatch_interrupt() to dispatch and + handle the interrupt. + +After an interrupt occures from either path shown in +:numref:`phy-interrupt-processing`, ACRN hypervisor will jump to +dispatch_interrupt. This function gets the vector of the generated +interrupt from the context, gets IRQ number from vector_to_irq[], and +then gets the corresponding irq_desc. + +Though there is only one generic IRQ handler for registered interrupt, +there are three different handling flows according to flags: + +- ``!IRQF_LEVEL`` +- ``IRQF_LEVEL && !IRQF_PT`` + + To avoid continuous interrupt triggers, it masks the IOAPIC pin and + unmask it only after IRQ action callback is executed + +- ``IRQF_LEVEL && IRQF_PT`` + + For pass-thru devices, to avoid continuous interrupt triggers, it masks + the IOAPIC pin and leaves it unmasked until corresponding vIOAPIC + pin gets an explicit EOI ACK from guest. + +Since interrupts are not shared for multiple devices, there is only one +IRQ action registered for each interrupt + +The IRQ number inside HV is a software concept to identify GSI and +Vectors. Each GSI will be mapped to one IRQ. The GSI number is usually the same +as the IRQ number. IRQ numbers greater than max GSI (nr_gsi) number are dynamically +assigned. For example, HV allocates an interrupt vector to a PCI device, +an IRQ number is then assigned to that vector. When the vector later +reaches a CPU, the corresponding IRQ routine is located and executed. + +See :numref:`request-irq` for request IRQ control flow for different +conditions: + +.. figure:: images/interrupt-image76.png + :align: center + :name: request-irq + + Request IRQ for different conditions + +IPI Management +************** + +The only purpose of IPI use in HV is to kick a vCPU out of non-root mode +and enter to HV mode. This requires I/O request and virtual interrupt +injection be distributed to different IPI vectors. The I/O request uses +IPI vector 0xF4 upcall (refer to Chapter 5.4). The virtual interrupt +injection uses IPI vector 0xF0. + +0xF4 upcall + A Guest vCPU VM Exit exits due to EPT violation or IO instruction trap. + It requires Device Module to emulate the MMIO/PortIO instruction. + However it could be that the Service OS (SOS) vCPU0 is still in non-root + mode. So an IPI (0xF4 upcall vector) should be sent to the physical CPU0 + (with non-root mode as vCPU0 inside SOS) to force vCPU0 to VM Exit due + to the external interrupt. The virtual upcall vector is then injected to + SOS, and the vCPU0 inside SOS then will pick up the IO request and do + emulation for other Guest. + +0xF0 IPI flow + If Device Module inside SOS needs to inject an interrupt to other Guest + such as vCPU1, it will issue an IPI first to kick CPU1 (assuming CPU1 is + running on vCPU1) to root-hv_interrupt-data-apmode. CPU1 will inject the + interrupt before VM Enter. + +.. _hv_interrupt-data-api: + +Data structures and interfaces +****************************** + +IOAPIC +====== + +The following APIs are external interfaces for IOAPIC related +operations. + +.. code-block:: c + + void ioapic_get_rte(uint32_t irq, union ioapic_rte *rte) + /* get the redirection table entry of an irq. */ + + void ioapic_set_rte(uint32_t irq, union ioapic_rte rte) + /* Set the redirection table entry of an irq. */ + + uint32_t pin_to_irq(uint8_t pin) + /* Get irq num from physical irq pin num */ + + void suspend_ioapic(void) + /* Suspended ioapic, mainly save the RTEs. */ + + void resume_ioapic(void) + /* Resume ioapic, mainly restore the RTEs. */ + + int get_ioapic_info(char *str_arg, int str_max_len) + /* Dump information of ioapic for debug, such as irq num, pin, + * RTE, vector, trigger mode etc. For debugging only. + */ + +LAPIC +===== + +The following APIs are external interfaces for LAPIC related operations. + +.. code-block:: c + + void write_lapic_reg32(uint32_t offset, uint32_t value) + /* Write to lapic register. */ + + void early_init_lapic(void) + /* To get the local apic base addr, map lapic registers and check the + * xAPIC/x2APIC capability. + */ + + void save_lapic(struct lapic_regs *regs) + /* Save context of lapic before entering s3. */ + + void restore_lapic(struct lapic_regs *regs) + /* Restore context of lapic when resume from s3. */ + + void resume_lapic(void) + /* Resume lapic by setting the apic base addr and restore the registers. */ + + uint8_t get_cur_lapic_id(void) + /* Get the lapic id. */ + +IPI +=== + +The following APIs are external interfaces for IPI related operations. + +.. code-block:: c + + void send_startup_ipi(enum intr_cpu_startup_shorthand cpu_startup_shorthand, + uint16_t dest_pcpu_id, uint64_t cpu_startup_start_address) + /* Send an SIPI to a specific cpu, to notify the cpu to start booting. */ + + void send_dest_ipi(uint32_t dest, uint32_t vector, uint32_t dest_mode) + /* Send an IPI to a specific cpu with dest mode specified. */ + + void send_single_ipi(uint16_t pcpu_id, uint32_t vector) + /* Send an IPI to a specific cpu with physical dest mode. */ + + +Physical Interrupt +================== + +The following APIs are external interfaces for physical interrupt +related operations. + +.. code-block:: c + + int32_t request_irq(uint32_t req_irq, irq_action_t action_fn, void *priv_data, + uint32_t flags) + /* Request interrupt num if not specified, and register irq action for the + * specified/allocated irq. + */ + + void free_irq(uint32_t irq) + /* Free irq num and unregister the irq action. */ + + void set_irq_trigger_mode(uint32_t irq, bool is_level_trigger) + /* Set the irq trigger mode: edge-triggered or level-triggered */ + + uint32_t irq_to_vector(uint32_t irq) + /* Convert irq num to vector */ + + void get_cpu_interrupt_info(char *str_arg, int str_max) + /* To dump interrupt statistics info, such as irq num, vector, + * irq count on each physical cpu. + */ + + void dispatch_interrupt(struct intr_excp_ctx *ctx) + /* To dispatch an interrupt, an action callback will be called if registered. */ + + void interrupt_init(uint16_t pcpu_id) + /* To do interrupt initialization for a cpu, will be called for + * each physical cpu. + */