doc: update interrupt hld section

Transcode, edit, and upload HLD 0.7 sections 3.5 (Physical Interrupts)

Tracked-on: #1610

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
This commit is contained in:
David B. Kinder 2018-10-25 16:30:03 -07:00 committed by David Kinder
parent f84547cad2
commit 70e13bf8f4
12 changed files with 402 additions and 400 deletions


@ -11,4 +11,4 @@ Hypervisor high-level design
hv-cpu-virt
Memory management <memmgt-hld>
I/O Emulation <hld-io-emulation>
Interrupt management <interrupt-hld>
Physical Interrupt <interrupt-hld>


@ -1,15 +1,11 @@
.. _interrupt-hld:
Interrupt Management high-level design
######################################
Physical Interrupt high-level design
####################################
Overview
********
This document describes the interrupt management high-level design for
the ACRN hypervisor.
The ACRN hypervisor implements a simple but fully functional framework
to manage interrupts and exceptions, as shown in
:numref:`interrupt-modules-overview`. In its native layer, it configures
@ -42,445 +38,451 @@ necessary virtual interrupt into the specific VM
ACRN Interrupt SW Modules Overview
Hypervisor Physical Interrupt Management
****************************************
The ACRN hypervisor is responsible for all the physical interrupt
handling. All physical interrupts are first handled in VMX root-mode.
The "external-interrupt exiting" bit in the VM-Execution controls field
is set to support this. The ACRN hypervisor also initializes all the
interrupt related modules such as IDT, PIC, IOAPIC, and LAPIC.
The hypervisor implements the following functionalities for handling
physical interrupts:
Only a few physical interrupts (such as TSC-Deadline timer and IOMMU)
are fully serviced in the hypervisor. Most interrupts come from pass-thru
devices whose interrupt are remapped to a virtual INTx/MSI source and
injected to the SOS or UOS, according to the pass-thru device
configuration.
- Configure interrupt-related hardware including IDT, PIC, LAPIC, and
IOAPIC on startup.
The ACRN hypervisor does not handle exceptions, and any exception coming from
the VMX root-mode will lead to the CPU halting. For guest exceptions, the
hypervisor only traps #MC (machine check), prints a warning message, and
injects the exception back into the guest OS.
- Provide APIs to manipulate the registers of LAPIC and IOAPIC.
- Acknowledge physical interrupts.
- Set up a callback mechanism for the other components in the
hypervisor to request for an interrupt vector and register a
handler for that interrupt.
HV owns all native physical interrupts and manages 256 vectors per CPU.
All physical interrupts are first handled in VMX root-mode. The
"external-interrupt exiting" bit in VM-Execution controls field is set
to support this. The ACRN hypervisor also initializes all the interrupt
related modules like IDT, PIC, IOAPIC, and LAPIC.
HV does not own any host devices (except UART). All devices are by
default assigned to SOS. Any interrupts received by Guest VM (SOS or
UOS) device drivers are virtual interrupts injected by HV (via vLAPIC).
HV manages a Host-to-Guest mapping. When a native IRQ/interrupt occurs,
HV decides whether this IRQ/interrupt should be forwarded to a VM and
which VM to forward to (if any). Refer to section 3.7.6 for virtual
interrupt injection and section 3.9.6 for the management of interrupt
remapping.
HV does not own any exceptions. Guest VMCS are configured so no VM Exit
happens, with some exceptions such as #INT3 and #MC. This is to
simplify the design as HV does not support any exception handling
itself. HV supports only static memory mapping, so there should be no
#PF or #GP. If HV receives an exception indicating an error, an assert
function is then executed with an error message printed out, and the
system then halts.
Native interrupts could be generated from one of the following
sources:
- GSI interrupts
- PIC or Legacy devices IRQ (0~15)
- IOAPIC pin
- PCI MSI/MSI-X vectors
- Inter CPU IPI
- LAPIC timer
Physical Interrupt Initialization
=================================
*********************************
After the ACRN hypervisor get control from the bootloader, it
initializes all physical interrupt-related modules for all the CPUs. The
ACRN hypervisor creates a framework to manage the physical interrupt for
hypervisor-local devices, pass-thru devices, and IPI between CPUs.
After ACRN hypervisor gets control from the bootloader, it
initializes all physical interrupt-related modules for all the CPUs. ACRN
hypervisor creates a framework to manage the physical interrupt for
hypervisor local devices, pass-thru devices, and IPI between CPUs, as
shown in :numref:`hv-interrupt-init`:
IDT
---
The ACRN hypervisor builds its native Interrupt Descriptor Table (IDT) during
interrupt initialization. For exceptions, it links to function
``dispatch_exception``, and for external interrupts it links to function
``dispatch_interrupt``. Please refer to ``arch/x86/idt.S`` for more details.
LAPIC
-----
The ACRN hypervisor resets LAPIC for each CPU, and provides basic APIs
used, for example, by the local timer (TSC Deadline)
program and IPI notification program. These APIs include
write_lapic_reg32, send_lapic_eoi, send_startup_ipi, and
send_single_ipi.
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/arch/x86/lapic.h
PIC/IOAPIC
----------
The ACRN hypervisor masks all interrupts from PIC, so all the
legacy interrupts from PIC (<16) are linked to IOAPIC, as shown in
:numref:`interrupt-pic-pin`.
ACRN will pre-allocate vectors and mask them for these legacy interrupts
in IOAPIC RTE. For others (>= 16) ACRN will mask them with vector 0 in
RTE, and the vector will be dynamically allocated on demand.
.. figure:: images/interrupt-image5.png
.. figure:: images/interrupt-image66.png
:align: center
:width: 600px
:name: interrupt-pic-pin
:name: hv-interrupt-init
PIC & IOAPIC Pin Connection
Physical Interrupt Initialization
Irq Desc
--------
IDT Initialization
==================
The ACRN hypervisor maintains a global ``irq_desc[]`` array shared among the
CPUs and uses a flat mode to manage the interrupts. The same
vector is linked to the same IRQ number for all CPUs.
ACRN hypervisor builds its native IDT (interrupt descriptor table)
during interrupt initialization and sets up the following handlers:
.. comment
- On an exception, the hypervisor dumps its context and halts the current
physical processor (because physical exceptions are not expected).
Need reference to API doc generated from doxygen comments
for ``struct irq_desc`` in hypervisor/include/common/irq.h
- For external interrupts, HV may mask the interrupt (depending on the
trigger mode), followed by interrupt acknowledgement and dispatch
to the registered handler, if any.
Most interrupts and exceptions are handled without a stack switch,
except for machine-check, double fault, and stack fault exceptions which
have their own stack set in TSS.
The ``irq_desc[]`` array is indexed by the IRQ number. An
``irq_handler`` field can be set to a common edge, level, or quick
handler called from ``interrupt_dispatch``. The ``irq_desc`` structure
also contains the ``dev_list`` field to maintain this IRQ's action
handler list.
The global array ``vector_to_irq[]`` is used to manage the vector
resource. This array is initialized with value ``IRQ_INVALID`` for all
vectors, and will be set to a valid IRQ number after the corresponding
vector is registered.
For example, if the local timer registers interrupt with IRQ number 271 and
vector 0xEF, then the arrays mentioned above will be set to::
irq_desc[271].irq = 271;
irq_desc[271].vector = 0xEF;
vector_to_irq[0xEF] = 271;
Physical Interrupt Flow
=======================
When an physical interrupt occurs, and the CPU is running under VMX root
mode, the interrupt is triggered from the standard native irq flow:
interrupt gate to irq handler. However, if the CPU is running under VMX
non-root mode, an external interrupt will trigger a VM exit for reason
"external-interrupt". See :numref:`interrupt-handle-flow`.
.. figure:: images/interrupt-image4.png
:align: center
:width: 800px
:name: interrupt-handle-flow
ACRN Hypervisor Interrupt Handle Flow
After an interrupt happens (in either case noted above), the ACRN
hypervisor jumps to ``dispatch_interrupt``. This function will check
which vector caused this interrupt, and the corresponding ``irq_desc``
structure's ``irq_handler`` will be called for the service.
There are several irq_handler's defined in the ACRN hypervisor, as shown
in :numref:`interrupt-handle-flow`, designed for different uses. For
example, ``quick_handler_nolock`` is used when no critical data needs
protection in the action handlers; the VCPU notification IPI and local
timer are good example of this use case.
The more complicated ``common_dev_handler_level`` handler is intended
for pass-thru devices with level triggered interrupts. To avoid
continuously triggering the interrupt, it initially masks IOAPIC pin and
unmasks it only when the corresponding vIOAPIC pin gets an explicit EOI
ACK from the guest.
All the irq handler's finally call their own action handler list, as
shown here:
.. code-block:: c
struct dev_handler_node *dev = desc->dev_list;
while (dev != NULL) {
if (dev->dev_handler != NULL)
dev->dev_handler(desc->irq, dev->dev_data);
dev = dev->next;
}
The common APIs for registering, updating, and unregistering
interrupt handlers include irq_to_vector, dev_to_irq, dev_to_vector,
pri_register_handler, normal_register_handler,
unregister_handler_common, and update_irq_handler.
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/irq.h
.. _physical_interrupt_source:
Physical Interrupt Source
PIC/IOAPIC Initialization
=========================
The ACRN hypervisor handles interrupts from many different sources, as
shown in :numref:`interrupt-source`:
ACRN hypervisor masks all interrupts from the PIC. All legacy interrupts
from PIC (<16) will be linked to IOAPIC, as shown in the connections in
:numref:`hv-pic-config`.
ACRN will pre-allocate vectors and mask them for these legacy interrupts
in IOAPIC RTE. For others (>= 16), ACRN will mask them with vector 0 in
RTE, and the vector will be dynamically allocated on demand.
.. list-table:: Physical Interrupt Source
:widths: 15 10 60
All external IOAPIC pins are categorized as GSI interrupt according to
ACPI definition. HV supports multiple IOAPIC components. IRQ PIN to GSI
mappings are maintained internally to determine GSI source IOAPIC.
Native PIC is not used in the system.
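As a rough illustration of the initial RTE programming described above,
the following sketch computes the starting value of a redirection table
entry for a given GSI. It assumes a single IOAPIC and a linear legacy
mapping (IRQ n to vector 0x20 + n); all ``demo_*`` and ``DEMO_*`` names
are hypothetical and are not taken from the ACRN sources. Only the RTE
bit positions (vector in bits 7:0, mask in bit 16) come from the IOAPIC
register layout.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_LEGACY_IRQ   16U          /* legacy IRQ0-15 */
#define DEMO_LEGACY_VEC_BASE 0x20U        /* assumed: IRQ n -> vector 0x20 + n */
#define DEMO_RTE_VECTOR_MASK 0xFFULL      /* RTE bits 7:0 hold the vector */
#define DEMO_RTE_MASKED      (1ULL << 16) /* RTE bit 16 is the interrupt mask */

/* Return the pre-allocated vector of a legacy IRQ (hypothetical mapping). */
static uint32_t demo_legacy_vector(uint32_t irq)
{
    assert(irq < DEMO_NR_LEGACY_IRQ);
    return DEMO_LEGACY_VEC_BASE + irq;
}

/* Build the initial RTE value for a GSI; every pin starts masked. */
static uint64_t demo_initial_rte(uint32_t gsi)
{
    uint64_t rte = DEMO_RTE_MASKED;

    if (gsi < DEMO_NR_LEGACY_IRQ) {
        /* Legacy pins carry their pre-allocated vector right away. */
        rte |= (uint64_t)demo_legacy_vector(gsi);
    }
    /* Pins >= 16 keep vector 0 until a vector is allocated on demand. */
    return rte;
}

/* Extract the vector field from an RTE value. */
static uint32_t demo_rte_vector(uint64_t rte)
{
    return (uint32_t)(rte & DEMO_RTE_VECTOR_MASK);
}
```

With these assumptions, GSI 3 starts masked with vector 0x23, while GSI
20 starts masked with vector 0.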
.. figure:: images/interrupt-image46.png
:align: center
:name: hv-pic-config
HV PIC/IOAPIC/LAPIC configuration
LAPIC Initialization
====================
Physical LAPICs are in xAPIC mode in ACRN hypervisor. The hypervisor
initializes LAPIC for each physical CPU by masking all interrupts in the
local vector table (LVT), clearing all ISRs, and enabling LAPIC.
APIs are provided for the other components in the hypervisor to access
the LAPIC, for example to program the local timer (TSC Deadline) or to
send IPI notifications. See :ref:`hv_interrupt-data-api`
for a complete list.
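The LVT masking and enable steps can be sketched against a simulated
register file, as below. The register offsets and bit positions follow
the xAPIC layout in the Intel SDM; the ``demo_*`` helpers and the
in-memory array standing in for the memory-mapped LAPIC page are
assumptions made for illustration, and the ISR-clearing step is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* xAPIC register offsets (bytes from the LAPIC base). */
#define DEMO_LAPIC_SVR        0x0F0U
#define DEMO_LAPIC_LVT_CMCI   0x2F0U
#define DEMO_LAPIC_LVT_TIMER  0x320U
#define DEMO_LAPIC_LVT_THERM  0x330U
#define DEMO_LAPIC_LVT_PMC    0x340U
#define DEMO_LAPIC_LVT_LINT0  0x350U
#define DEMO_LAPIC_LVT_LINT1  0x360U
#define DEMO_LAPIC_LVT_ERROR  0x370U

#define DEMO_LVT_MASKED       (1U << 16) /* LVT bit 16: entry masked */
#define DEMO_SVR_ENABLE       (1U << 8)  /* SVR bit 8: APIC software enable */
#define DEMO_SPURIOUS_VECTOR  0xFFU      /* spurious vector used in this doc */
#define DEMO_LAPIC_REG_SPACE  0x400U

/* Simulated register file; a real driver would access MMIO instead. */
static uint32_t demo_lapic[DEMO_LAPIC_REG_SPACE / 4U];

static void demo_lapic_write(uint32_t offset, uint32_t value)
{
    /* xAPIC registers are 32 bits wide and 16-byte aligned. */
    assert((offset < DEMO_LAPIC_REG_SPACE) && ((offset & 0xFU) == 0U));
    demo_lapic[offset / 4U] = value;
}

static uint32_t demo_lapic_read(uint32_t offset)
{
    assert((offset < DEMO_LAPIC_REG_SPACE) && ((offset & 0xFU) == 0U));
    return demo_lapic[offset / 4U];
}

/* Mask every LVT entry, then enable the LAPIC through the SVR. */
static void demo_lapic_init(void)
{
    static const uint32_t lvt[] = {
        DEMO_LAPIC_LVT_CMCI, DEMO_LAPIC_LVT_TIMER, DEMO_LAPIC_LVT_THERM,
        DEMO_LAPIC_LVT_PMC, DEMO_LAPIC_LVT_LINT0, DEMO_LAPIC_LVT_LINT1,
        DEMO_LAPIC_LVT_ERROR
    };
    size_t i;

    for (i = 0U; i < sizeof(lvt) / sizeof(lvt[0]); i++) {
        demo_lapic_write(lvt[i], DEMO_LVT_MASKED);
    }
    demo_lapic_write(DEMO_LAPIC_SVR, DEMO_SVR_ENABLE | DEMO_SPURIOUS_VECTOR);
}
```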
HV Interrupt Vectors and Delivery Mode
======================================
The interrupt vectors are assigned as shown here:
**Vector 0-0x1F**
are exceptions that are not handled by HV. If
such an exception does occur, the system then halts.
**Vector: 0x20-0x2F**
are allocated statically for legacy IRQ0-15.
**Vector: 0x30-0xDF**
are dynamically allocated vectors for PCI devices
INTx or MSI/MSI-X usage. Depending on the interrupt delivery mode
(FLAT or PER_CPU mode), an interrupt will be assigned to a vector for
all the CPUs or a particular CPU.
**Vector: 0xE0-0xFE**
are high priority vectors reserved by HV for
dedicated purposes. For example, 0xEF is used for timer, 0xF0 is used
for IPI.
.. list-table::
:widths: 30 70
:header-rows: 1
:name: interrupt-source
* - Interrupt Source
- Vector
- Description
* - TSC Deadline Timer
- 0xEF
- The TSC deadline timer implements the timer framework in
the hypervisor based on the LAPIC TSC deadline. This interrupt's
target is specific to the CPU to which the LAPIC belongs.
* - CPU Startup IPI
- N/A
- The BSP needs to trigger an INIT-SIPI sequence to wake up the
APs. This interrupt's target is specified by the BSP calling
``start_cpus()``.
* - VCPU Notify IPI
- 0xF0
- When the hypervisor needs to kick the VCPU out of VMX non-root
mode to do requests such as virtual interrupt injection, EPT
flush, etc. This interrupt's target is specified by function
``send_single_ipi()``.
* - IOMMU MSI
- dynamic
- IOMMU device supports an MSI interrupt. The vtd device driver in
the hypervisor will register an interrupt to handle dmar fault.
This interrupt's target is specified by vtd device driver.
* - PTdev INTx
- dynamic
- All native devices are owned by the guest (SOS or UOS), taking
advantage of the pass-thru method. Each pass-thru device connected
with IOAPIC/PIC (PTdev INTx) will register an interrupt when
its attached interrupt controller pin first gets unmasked.
This interrupt's target is defined by an RTE entry in the IOAPIC.
* - PTdev MSI
- dynamic
- All native devices are owned by the guest (SOS or UOS), taking
advantage of pass-thru method. Each pass-thru device with
enabled MSI (PTdev MSI) will register an interrupt when the SOS
does an explicit hypercall. This interrupt's target is defined
by an MSI address entry.
* - Vectors
- Usage
Softirq
=======
* - 0x0-0x13
- Exceptions: NMI, INT3, page fault, GP, debug.
ACRN hypervisor implements a simple bottom-half softirq to execute the
interrupt handler, as shown in :numref:`interrupt-handle-flow`.
The softirq is executed when an interrupt is enabled. Several APIs for softirq
are defined including enable_softirq, disable_softirq, raise_softirq,
and exec_softirq.
* - 0x14-0x1F
- Reserved
.. comment
* - 0x20-0x2F
- Statically allocated for external IRQ (IRQ0-IRQ15)
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/softirq.h
* - 0x30-0xDF
- Dynamically allocated for IOAPIC IRQ from PCI INTx/MSI
Physical Exception Handling
===========================
* - 0xE0-0xFE
- Statically allocated for HV
As mentioned earlier, the ACRN hypervisor does not handle any
physical exceptions. The VMX root mode code path should guarantee no
exceptions are triggered while the hypervisor is running.
* - 0xEF
- Timer
Guest Virtual Interrupt Management
**********************************
* - 0xF0
- IPI
The previous sections describe physical interrupt management in the ACRN
hypervisor. After a physical interrupt happens, a registered action
handler is executed. Usually, the action handler represents a service
for virtual interrupt injection. For example, if an interrupt is
triggered from a pass-thru device, the appropriate virtual interrupt
should be injected into its guest VM.
* - 0xFF
- SPURIOUS_APIC_VECTOR
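The vector ranges listed above can be expressed as a small classifier.
This is only a sketch of the partitioning; the enum and function names
are hypothetical, and the exception range here also covers the reserved
vectors 0x14-0x1F.

```c
#include <assert.h>
#include <stdint.h>

enum demo_vector_class {
    DEMO_VEC_EXCEPTION,    /* 0x00-0x1F: exceptions and reserved vectors */
    DEMO_VEC_LEGACY_IRQ,   /* 0x20-0x2F: statically allocated, IRQ0-15 */
    DEMO_VEC_DYNAMIC,      /* 0x30-0xDF: dynamically allocated (INTx/MSI) */
    DEMO_VEC_HV_RESERVED,  /* 0xE0-0xFE: reserved by HV (timer, IPI, ...) */
    DEMO_VEC_SPURIOUS      /* 0xFF: spurious APIC vector */
};

/* Map a vector number (0-255) to the range it belongs to. */
static enum demo_vector_class demo_classify_vector(uint32_t vector)
{
    assert(vector <= 0xFFU);

    if (vector <= 0x1FU) {
        return DEMO_VEC_EXCEPTION;
    }
    if (vector <= 0x2FU) {
        return DEMO_VEC_LEGACY_IRQ;
    }
    if (vector <= 0xDFU) {
        return DEMO_VEC_DYNAMIC;
    }
    if (vector <= 0xFEU) {
        return DEMO_VEC_HV_RESERVED;
    }
    return DEMO_VEC_SPURIOUS;
}
```

For example, the timer vector 0xEF and the IPI vector 0xF0 both fall in
the range reserved by HV.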
The virtual interrupt injection could also come from an emulated device.
The I/O mediator in the Service OS (SOS) could trigger an interrupt
through a hypercall, and then do the virtual interrupt injection in the
hypervisor.
Interrupts from either IOAPIC or MSI can be delivered to a target CPU.
By default they are configured as Lowest Priority (FLAT mode), i.e. they
are delivered to a CPU core that is currently idle or executing lowest
priority ISR. There is no guarantee a device's interrupt will be
delivered to a specific Guest's CPU. Timer interrupts are an exception -
these are always delivered to the CPU which programs the LAPIC timer.
The following sections give an introduction to the ACRN guest virtual
interrupt management, including VCPU request for virtual interrupt kick
off, vPIC/vIOAPIC/vLAPIC for virtual interrupt injection interfaces,
physical-to-virtual interrupt mapping for a pass-thru device, and the
process of VMX interrupt/exception injection.
There are two interrupt delivery modes: FLAT mode and PER_CPU mode. ACRN
uses FLAT mode, where the interrupt/IRQ-to-vector mapping is the same on all CPUs. Every
CPU receives the same interrupts. The IOAPIC and LAPIC MSI delivery modes are
configured to Lowest Priority.
VCPU Request
============
Vector allocation for CPUs is shown here:
As mentioned in `physical_interrupt_source`_, physical vector 0xF0 is
used to kick the VCPU out of its VMX non-root mode, and make a request
for virtual interrupt injection or other requests such as flush EPT.
The request-make API (vcpu_make_request) and eventid supports virtual interrupt
injection.
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/irq.h
There are requests for exception injection (ACRN_REQUEST_EXCP), vLAPIC
event (ACRN_REQUEST_EVENT), external interrupt from vPIC
(ACRN_REQUEST_EXTINT) and non-maskable-interrupt (ACRN_REQUEST_NMI).
The ``vcpu_make_request`` is necessary for a virtual interrupt
injection. If the target VCPU is running under VMX non-root mode, it
will send an IPI to kick it out and results in an external-interrupt
VM-Exit. The flow of :numref:`interrupt-handle-flow` could be executed
to complete the injection of a virtual interrupt.
There are some cases that do not need to send an IPI when making a
request because the CPU making the request is the target VCPU. For
example, the #GP exception request always happens on the current CPU
when an invalid emulation happens. An external interrupt for a pass-thru
device always happens on the VCPUs the device belongs to, so after it
triggers an external-interrupt VM-Exit, the current CPU is also the
target VCPU.
Virtual PIC
===========
The ACRN hypervisor emulates a vPIC for each VM based on IO ranges
0x20-0x21, 0xa0-0xa1, or 0x4d0-0x4d1.
If an interrupt source from vPIC needs to inject an interrupt,
the vpic_assert_irq, vpic_deassert_irq, or vpic_pulse_irq functions can
be called to make a request for ACRN_REQUEST_EXTINT or
ACRN_REQUEST_EVENT:
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/vpic.h
The vpic_pending_intr and vpic_intr_accepted APIs are used to query the
vector being injected and ACK the service, by moving the interrupt from
request service (IRR) to in service (ISR).
Virtual IOAPIC
==============
ACRN hypervisor emulates a vIOAPIC for each VM based on MMIO
VIOAPIC_BASE.
If an interrupt source from vIOAPIC needs to inject an interrupt, the
vioapic_assert_irq, vioapic_deassert_irq, and vioapic_pulse_irq APIs are
used to make a request for ACRN_REQUEST_EVENT.
As the vIOAPIC is always associated with a vLAPIC, the virtual interrupt
injection from vIOAPIC will finally trigger a request for a vLAPIC
event.
Virtual LAPIC
=============
The ACRN hypervisor emulates a vLAPIC for each VCPU based on MMIO
DEFAULT_APIC_BASE.
If an interrupt source from vLAPIC needs to inject an interrupt (e.g.,
from LVT such as an LAPIC timer, from vIOAPIC for a pass-thru device
interrupt, or from an emulated device for a MSI), vlapic_intr_level,
vlapic_intr_edge, vlapic_set_local_intr, vlapic_intr_msi,
vlapic_deliver_intr APIs need to be called, resulting in a request for
ACRN_REQUEST_EVENT.
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/vlapic.h
The vlapic_pending_intr and vlapic_intr_accepted APIs are used to query
the vector that needs to be injected and ACK
the service that move the interrupt from request service (IRR) to in
service (ISR).
By default, the ACRN hypervisor enables vAPIC to improve the performance of
a vLAPIC emulation.
Virtual Exception
=================
When doing emulation, an exception may need to be triggered in the hypervisor.
For example, if the guest accesses an invalid vMSR register, the
hypervisor needs to inject a #GP; or during instruction emulation, an
instruction fetch may access a nonexistent page from rip_gva, and a #PF
must be injected.
ACRN hypervisor implements virtual exception injection using the
vcpu_queue_exception, vcpu_inject_gp, and vcpu_inject_pf APIs.
.. comment
Need reference to API doc generated from doxygen comments
in hypervisor/include/common/irq.h
The ACRN hypervisor uses vcpu_inject_gp/vcpu_inject_pf functions to
queue exception requests, and follows `Intel Software
Developer Manual, Vol 3. <SDM vol3>`_ - 6.15, Table 6-5
listing conditions for generating a double fault.
.. _SDM vol3: https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html
Interrupt Mapping for a Pass-thru Device
========================================
A VM can control a PCI device directly through pass-thru device
assignment. The pass-thru entry is the major info object, and it is:
- A physical interrupt source, and could be a MSI/MSIX entry, PIC pins, or
IOAPIC pins
- Pass-thru remapping information between physical and virtual interrupt
source, for MSI/MSIX it is identified by a PCI device's BDF. For
PIC/IOAPIC it is identified by the pin number.
.. figure:: images/interrupt-image7.png
.. figure:: images/interrupt-image89.png
:align: center
:width: 600px
:name: interrupt-pass-thru
Pass-thru Device Entry Assignment
FLAT mode vector allocation
As shown in :numref:`interrupt-pass-thru` above, a UOS will assign its
pass-thru device entry by the DM, and it will fill its entry info from:
IRQ Descriptor Table
====================
- vPIC/vIOAPIC interrupt mask/unmask
- MSI IOReq from UOS then MSI hypercall from SOS
ACRN hypervisor maintains a global IRQ Descriptor Table shared among the
physical CPUs. ACRN uses FLAT mode to manage the interrupts, so the
same vector links to the same IRQ number for all CPUs.
The SOS adds its pass-thru device entry at runtime and fills info for:
.. note:: need to reference API doc for irq_desc
- vPIC/vIOAPIC interrupt mask/unmask
- MSI hypercall from SOS
During the pass-thru device entry info filling, the hypervisor builds
native IOAPIC RTE/MSI entry based on vIOAPIC/vPIC/vMSI configuration,
and register the physical interrupt handler for it. Then with the pass-thru
device entry as the handler private data, the physical interrupt can
be linked to a virtual pin of a guest's vPIC/vIOAPIC or virtual vector of
a guest's vMSI. The handler then injects the corresponding virtual
interrupt into the guest, based on vPIC/vIOAPIC/vLAPIC APIs described
earlier.
The *irq_desc[]* array's index represents IRQ number. An *irq_handler*
field could be set to common edge/level/quick handler which will be
called from *interrupt_dispatch*. The *irq_desc* structure also
contains the *dev_list* field to maintain this IRQ's action handler
list.
Interrupt Storm Mitigation
==========================
Another reverse mapping from vector to IRQ is used in addition to the
IRQ descriptor table which maintains the mapping from IRQ to vector.
When the Device Model (DM) launches a User OS (UOS), the ACRN hypervisor
will remap the interrupt for this user OS's pass-through devices. When
an interrupt occurs for a pass-through device, the CPU core assigned
to that User OS gets trapped into the hypervisor. The benefit of such a
mechanism is that, should an interrupt storm happen in a particular UOS,
it will have only a minimal effect on the performance of the Service OS.
On initialization, the descriptors of the legacy IRQs are initialized with
proper vectors and the corresponding reverse mapping is set up.
The descriptors of other IRQs are filled with an invalid
vector which will be updated on IRQ allocation.
Interrupt/Exception Injection Process
=====================================
For example, if the local timer registers an interrupt with IRQ number 271 and
vector 0xEF, then this data will be set up:
As shown in :numref:`interrupt-handle-flow`, the ACRN hypervisor injects
virtual interrupt/exception to the guest before its VM-Entry.
.. code-block:: c
This is done by updating the VMX_ENTRY_INT_INFO_FIELD of the VCPU's
VMCS. As this field is unique, the interrupt/exception injection must
follow a priority rule to handle one-by-one.
irq_desc[271].irq = 271
irq_desc[271].vector = 0xEF
vector_to_irq[0xEF] = 271
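A minimal sketch of how the two tables could be kept consistent is shown
here. The table sizes, the invalid markers, and the ``demo_*`` names are
assumptions for illustration only; the sketch also skips the
pre-binding of legacy IRQs that the hypervisor performs on
initialization.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_IRQS        512U        /* assumed; large enough for IRQ 271 */
#define DEMO_NR_VECTORS     256U        /* 256 vectors per CPU */
#define DEMO_IRQ_INVALID    0xFFFFFFFFU /* hypothetical "no IRQ" marker */
#define DEMO_VECTOR_INVALID 0xFFFFFFFFU /* hypothetical "no vector" marker */

struct demo_irq_desc {
    uint32_t irq;     /* IRQ number, equal to the array index */
    uint32_t vector;  /* vector bound to this IRQ, or DEMO_VECTOR_INVALID */
};

static struct demo_irq_desc demo_irq_desc_tbl[DEMO_NR_IRQS];
static uint32_t demo_vector_to_irq[DEMO_NR_VECTORS];

/* Start with no IRQ bound to any vector. */
static void demo_irq_tables_init(void)
{
    uint32_t i;

    for (i = 0U; i < DEMO_NR_IRQS; i++) {
        demo_irq_desc_tbl[i].irq = i;
        demo_irq_desc_tbl[i].vector = DEMO_VECTOR_INVALID;
    }
    for (i = 0U; i < DEMO_NR_VECTORS; i++) {
        demo_vector_to_irq[i] = DEMO_IRQ_INVALID;
    }
}

/* Record the IRQ-to-vector mapping and its reverse mapping together. */
static void demo_bind_irq_vector(uint32_t irq, uint32_t vector)
{
    assert((irq < DEMO_NR_IRQS) && (vector < DEMO_NR_VECTORS));
    demo_irq_desc_tbl[irq].vector = vector;
    demo_vector_to_irq[vector] = irq;
}
```

Calling ``demo_bind_irq_vector(271U, 0xEFU)`` after initialization
reproduces the state listed in the example above.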
:numref:`interrupt-injection` below shows the rules about how to inject
virtual interrupt/exception one-by-one. If a high priority
interrupt/exception was already injected, the next pending
interrupt/exception will enable an interrupt window where the next
injection will be done by the following VM-Exit, triggered by the
interrupt window.
External Interrupt Handling
***************************
.. figure:: images/interrupt-image6.png
CPU runs under VMX non-root mode and inside Guest VMs.
``MSR_IA32_VMX_PINBASED_CTLS.bit[0]`` and
``MSR_IA32_VMX_EXIT_CTLS.bit[15]`` are set to allow vCPU VM Exit to HV
whenever there are interrupts to that physical CPU under
non-root mode. HV ACKs the interrupts in VMX non-root and saves the
interrupt vector to the relevant VM Exit field for HV IRQ processing.
Note that as discussed above, an external interrupt causing vCPU VM Exit
to HV does not mean that the interrupt belongs to that Guest VM. When
CPU executes VM Exit into root-mode, interrupt handling will be enabled
and the interrupt will be delivered and processed as quickly as possible
inside HV. HV may emulate a virtual interrupt and inject to Guest if
necessary.
When a physical interrupt occurs on a CPU, the CPU could be running
under VMX root mode or non-root mode. If the CPU is running under VMX
root mode, the interrupt is triggered from the standard native IRQ flow:
interrupt gate to IRQ handler. If the CPU is running under VMX non-root
mode, an external interrupt will trigger a VM exit for reason
"external-interrupt".
Interrupt and IRQ processing flow diagrams are shown below:
.. figure:: images/interrupt-image48.png
:align: center
:width: 600px
:name: interrupt-injection
:name: phy-interrupt-processing
ACRN Hypervisor Interrupt/Exception Injection Process
Processing of physical interrupts
.. figure:: images/interrupt-image39.png
:align: center
IRQ processing control flow
When a physical interrupt is raised and delivered to a physical CPU, the
CPU may be running under either VMX root mode or non-root mode.
- If the CPU is running under VMX root mode, the interrupt is handled
following the standard native IRQ flow: interrupt gate to
dispatch_interrupt(), IRQ handler, and finally the registered callback.
- If the CPU is running under VMX non-root mode, an external interrupt
causes a VM exit for reason "external-interrupt", and then the VM
exit processing flow will call dispatch_interrupt() to dispatch and
handle the interrupt.
After an interrupt occurs from either path shown in
:numref:`phy-interrupt-processing`, ACRN hypervisor will jump to
dispatch_interrupt. This function gets the vector of the generated
interrupt from the context, gets IRQ number from vector_to_irq[], and
then gets the corresponding irq_desc.
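The lookup chain described above (vector, then IRQ number, then
descriptor, then registered action) might look roughly like the sketch
below. It is a simplified model, not the ACRN implementation: the
callback type, the descriptor layout, and every ``demo_*`` name are
hypothetical, and interrupt acknowledgement is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_IRQS     64U         /* assumed table size */
#define DEMO_NR_VECTORS  256U
#define DEMO_IRQ_INVALID 0xFFFFFFFFU /* hypothetical "no IRQ" marker */

/* Hypothetical callback type; the real irq_action_t may differ. */
typedef void (*demo_action_t)(uint32_t irq, void *priv_data);

struct demo_irq_desc {
    demo_action_t action; /* registered callback, or NULL */
    void *priv_data;      /* opaque data passed back to the callback */
};

static struct demo_irq_desc demo_descs[DEMO_NR_IRQS];
static uint32_t demo_vector_to_irq[DEMO_NR_VECTORS];

static void demo_dispatch_init(void)
{
    uint32_t v;

    for (v = 0U; v < DEMO_NR_VECTORS; v++) {
        demo_vector_to_irq[v] = DEMO_IRQ_INVALID;
    }
}

static void demo_register(uint32_t irq, uint32_t vector,
                          demo_action_t action, void *priv_data)
{
    assert((irq < DEMO_NR_IRQS) && (vector < DEMO_NR_VECTORS));
    demo_descs[irq].action = action;
    demo_descs[irq].priv_data = priv_data;
    demo_vector_to_irq[vector] = irq;
}

/* Resolve vector -> IRQ -> descriptor; return true if an action ran. */
static bool demo_dispatch(uint32_t vector)
{
    uint32_t irq;

    assert(vector < DEMO_NR_VECTORS);
    irq = demo_vector_to_irq[vector];
    if ((irq == DEMO_IRQ_INVALID) || (demo_descs[irq].action == NULL)) {
        return false; /* nothing registered for this vector */
    }
    demo_descs[irq].action(irq, demo_descs[irq].priv_data);
    return true;
}

/* Sample action: count invocations through priv_data. */
static void demo_count_action(uint32_t irq, void *priv_data)
{
    (void)irq;
    (*(uint32_t *)priv_data)++;
}
```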
Though there is only one generic IRQ handler for registered interrupts,
there are three different handling flows according to flags:
- ``!IRQF_LEVEL``
- ``IRQF_LEVEL && !IRQF_PT``
To avoid continuous interrupt triggers, it masks the IOAPIC pin and
unmasks it only after the IRQ action callback is executed.
- ``IRQF_LEVEL && IRQF_PT``
For pass-thru devices, to avoid continuous interrupt triggers, it masks
the IOAPIC pin and leaves it masked until the corresponding vIOAPIC
pin gets an explicit EOI ACK from the guest.
Since interrupts are not shared among multiple devices, there is only one
IRQ action registered for each interrupt.
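The three flows can be modeled with a mock pin state, as in the sketch
below. The flag names mirror ``IRQF_LEVEL`` and ``IRQF_PT`` from the
text, but their bit values, the descriptor layout, and the ``demo_*``
functions are assumptions; a boolean stands in for the IOAPIC pin mask
and the LAPIC EOI is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_IRQF_LEVEL (1U << 1) /* hypothetical value: level triggered */
#define DEMO_IRQF_PT    (1U << 2) /* hypothetical value: pass-thru device */

typedef void (*demo_action_t)(uint32_t irq, void *priv_data);

struct demo_irq {
    uint32_t irq;
    uint32_t flags;
    bool pin_masked;      /* mock of the IOAPIC pin mask state */
    demo_action_t action; /* the single registered IRQ action */
    void *priv_data;
};

/* Sample action: count invocations through priv_data. */
static void demo_count_action(uint32_t irq, void *priv_data)
{
    (void)irq;
    (*(uint32_t *)priv_data)++;
}

/* Generic handler covering the three flag combinations. */
static void demo_handle_irq(struct demo_irq *desc)
{
    bool level;
    bool pt;

    assert(desc != NULL);
    level = (desc->flags & DEMO_IRQF_LEVEL) != 0U;
    pt = (desc->flags & DEMO_IRQF_PT) != 0U;

    if (level) {
        desc->pin_masked = true;   /* stop the line from re-triggering */
    }
    if (desc->action != NULL) {
        desc->action(desc->irq, desc->priv_data);
    }
    if (level && !pt) {
        desc->pin_masked = false;  /* unmask once the action has run */
    }
    /* level && pt: stay masked until the guest EOIs the vIOAPIC pin. */
}

/* Called when the guest acknowledges the corresponding vIOAPIC pin. */
static void demo_guest_eoi(struct demo_irq *desc)
{
    assert(desc != NULL);
    if (((desc->flags & DEMO_IRQF_LEVEL) != 0U) &&
        ((desc->flags & DEMO_IRQF_PT) != 0U)) {
        desc->pin_masked = false;
    }
}
```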
The IRQ number inside HV is a software concept used to identify GSIs and
vectors. Each GSI is mapped to one IRQ. The GSI number is usually the same
as the IRQ number. IRQ numbers greater than the maximum GSI number (nr_gsi) are dynamically
assigned. For example, when HV allocates an interrupt vector to a PCI device,
an IRQ number is then assigned to that vector. When the vector later
reaches a CPU, the corresponding IRQ routine is located and executed.
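One way to picture the split between GSI-backed and dynamically assigned
IRQ numbers is the allocator sketch below. The value chosen for
``DEMO_NR_GSI``, the table size, the invalid marker, and the ``demo_*``
functions are all hypothetical; the sketch only illustrates that a
requested IRQ is reserved as-is, while an unspecified request is served
from the range at or above the GSI count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_GSI      48U         /* assumed number of GSIs (nr_gsi) */
#define DEMO_NR_IRQS     64U         /* assumed total IRQ numbers */
#define DEMO_IRQ_INVALID 0xFFFFFFFFU /* hypothetical "not specified" marker */

static bool demo_irq_used[DEMO_NR_IRQS];

/*
 * Reserve req_irq if one is given; otherwise pick the first free IRQ
 * number in the dynamic range. Returns DEMO_IRQ_INVALID on failure.
 */
static uint32_t demo_alloc_irq(uint32_t req_irq)
{
    uint32_t irq;

    if (req_irq != DEMO_IRQ_INVALID) {
        assert(req_irq < DEMO_NR_IRQS);
        if (demo_irq_used[req_irq]) {
            return DEMO_IRQ_INVALID; /* interrupts are not shared */
        }
        demo_irq_used[req_irq] = true;
        return req_irq;
    }

    for (irq = DEMO_NR_GSI; irq < DEMO_NR_IRQS; irq++) {
        if (!demo_irq_used[irq]) {
            demo_irq_used[irq] = true;
            return irq;
        }
    }
    return DEMO_IRQ_INVALID;
}

/* Release an IRQ number so it can be allocated again. */
static void demo_free_irq(uint32_t irq)
{
    assert(irq < DEMO_NR_IRQS);
    demo_irq_used[irq] = false;
}
```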
See :numref:`request-irq` for request IRQ control flow for different
conditions:
.. figure:: images/interrupt-image76.png
:align: center
:name: request-irq
Request IRQ for different conditions
IPI Management
**************
The only purpose of IPIs in HV is to kick a vCPU out of non-root mode
and into HV mode. This requires that I/O requests and virtual interrupt
injection be distributed to different IPI vectors. The I/O request uses
IPI vector 0xF4 upcall (refer to Chapter 5.4). The virtual interrupt
injection uses IPI vector 0xF0.
0xF4 upcall
A guest vCPU VM Exit occurs due to an EPT violation or an I/O instruction trap.
It requires the Device Model to emulate the MMIO/PortIO instruction.
However, it could be that the Service OS (SOS) vCPU0 is still in non-root
mode. So an IPI (0xF4 upcall vector) should be sent to the physical CPU0
(with non-root mode as vCPU0 inside SOS) to force vCPU0 to VM Exit due
to the external interrupt. The virtual upcall vector is then injected to
SOS, and the vCPU0 inside SOS then will pick up the I/O request and do
emulation for the other guest.
0xF0 IPI flow
If the Device Model inside SOS needs to inject an interrupt to another guest
vCPU such as vCPU1, it will issue an IPI first to kick CPU1 (assuming vCPU1 is
running on CPU1) to root mode. CPU1 will inject the
interrupt before VM Entry.
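The kick decision in the 0xF0 flow can be sketched as follows. This is a
simplified model under stated assumptions: the vCPU structure, the
pending-request bitmap, and the ``demo_*`` functions are hypothetical,
and a counter replaces the real IPI send. The rule that no IPI is needed
when the target is not running in non-root mode on another CPU is taken
from the VCPU request discussion earlier in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VECTOR_NOTIFY_VCPU 0xF0U /* IPI vector for virtual interrupt injection */
#define DEMO_NR_PCPUS           4U    /* assumed number of physical CPUs */

struct demo_vcpu {
    uint16_t pcpu_id;          /* physical CPU this vCPU runs on */
    bool in_non_root;          /* true while the guest is executing */
    uint32_t pending_requests; /* one bit per pending request type */
};

/* Mock IPI transport: count the IPIs "sent" to each physical CPU. */
static uint32_t demo_ipi_count[DEMO_NR_PCPUS];

static void demo_send_single_ipi(uint16_t pcpu_id, uint32_t vector)
{
    assert(pcpu_id < DEMO_NR_PCPUS);
    (void)vector;
    demo_ipi_count[pcpu_id]++;
}

/*
 * Queue a request for the target vCPU and kick it if needed.
 * Returns true when an IPI was sent.
 */
static bool demo_make_request(struct demo_vcpu *vcpu, uint32_t req_bit,
                              uint16_t cur_pcpu_id)
{
    assert((vcpu != NULL) && (req_bit < 32U));
    vcpu->pending_requests |= (1U << req_bit);

    /* Only a guest running on another CPU has to be forced to VM Exit. */
    if (vcpu->in_non_root && (vcpu->pcpu_id != cur_pcpu_id)) {
        demo_send_single_ipi(vcpu->pcpu_id, DEMO_VECTOR_NOTIFY_VCPU);
        return true;
    }
    /* Otherwise the request is picked up before the next VM Entry. */
    return false;
}
```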
.. _hv_interrupt-data-api:
Data structures and interfaces
******************************
IOAPIC
======
The following APIs are external interfaces for IOAPIC related
operations.
.. code-block:: c
void ioapic_get_rte(uint32_t irq, union ioapic_rte *rte)
/* get the redirection table entry of an irq. */
void ioapic_set_rte(uint32_t irq, union ioapic_rte rte)
/* Set the redirection table entry of an irq. */
uint32_t pin_to_irq(uint8_t pin)
/* Get irq num from physical irq pin num */
void suspend_ioapic(void)
/* Suspend ioapic, mainly save the RTEs. */
void resume_ioapic(void)
/* Resume ioapic, mainly restore the RTEs. */
int get_ioapic_info(char *str_arg, int str_max_len)
/* Dump information of ioapic for debug, such as irq num, pin,
* RTE, vector, trigger mode etc. For debugging only.
*/
LAPIC
=====
The following APIs are external interfaces for LAPIC related operations.
.. code-block:: c
void write_lapic_reg32(uint32_t offset, uint32_t value)
/* Write to lapic register. */
void early_init_lapic(void)
/* To get the local apic base addr, map lapic registers and check the
* xAPIC/x2APIC capability.
*/
void save_lapic(struct lapic_regs *regs)
/* Save context of lapic before entering s3. */
void restore_lapic(struct lapic_regs *regs)
/* Restore context of lapic when resume from s3. */
void resume_lapic(void)
/* Resume lapic by setting the apic base addr and restore the registers. */
uint8_t get_cur_lapic_id(void)
/* Get the lapic id. */
IPI
===
The following APIs are external interfaces for IPI related operations.
.. code-block:: c
void send_startup_ipi(enum intr_cpu_startup_shorthand cpu_startup_shorthand,
uint16_t dest_pcpu_id, uint64_t cpu_startup_start_address)
/* Send an SIPI to a specific cpu, to notify the cpu to start booting. */
void send_dest_ipi(uint32_t dest, uint32_t vector, uint32_t dest_mode)
/* Send an IPI to a specific cpu with dest mode specified. */
void send_single_ipi(uint16_t pcpu_id, uint32_t vector)
/* Send an IPI to a specific cpu with physical dest mode. */
Physical Interrupt
==================
The following APIs are external interfaces for physical interrupt
related operations.
.. code-block:: c
int32_t request_irq(uint32_t req_irq, irq_action_t action_fn, void *priv_data,
uint32_t flags)
/* Request interrupt num if not specified, and register irq action for the
* specified/allocated irq.
*/
void free_irq(uint32_t irq)
/* Free irq num and unregister the irq action. */
void set_irq_trigger_mode(uint32_t irq, bool is_level_trigger)
/* Set the irq trigger mode: edge-triggered or level-triggered */
uint32_t irq_to_vector(uint32_t irq)
/* Convert irq num to vector */
void get_cpu_interrupt_info(char *str_arg, int str_max)
/* To dump interrupt statistics info, such as irq num, vector,
* irq count on each physical cpu.
*/
void dispatch_interrupt(struct intr_excp_ctx *ctx)
/* To dispatch an interrupt, an action callback will be called if registered. */
void interrupt_init(uint16_t pcpu_id)
/* To do interrupt initialization for a cpu, will be called for
* each physical cpu.
*/