mirror of
https://github.com/projectacrn/acrn-hypervisor.git
synced 2025-06-21 13:08:42 +00:00
doc: update hv device passthrough document
Fixed misspellings and rst formatting issues.
Added ptdev.h to the list of include file for doxygen

Tracked-On: #3882
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
parent
b05c1afa0b
commit
76f2e28e13
@@ -808,6 +808,7 @@ INPUT = custom-doxygen/mainpage.md \
 ../hypervisor/include/arch/x86/guest/vmx_io.h \
 ../hypervisor/include/arch/x86/guest/assign.h \
 ../hypervisor/include/common/hypercall.h \
+../hypervisor/include/common/ptdev.h \
 ../hypervisor/include/public/acrn_common.h \
 ../hypervisor/include/public/acrn_hv_defs.h \
 ../hypervisor/include/arch/x86/guest/vcpu.h \
@@ -7,9 +7,9 @@ A critical part of virtualization is virtualizing devices: exposing all
 aspects of a device including its I/O, interrupts, DMA, and configuration.
 There are three typical device
 virtualization methods: emulation, para-virtualization, and passthrough.
-Both emulation and passthrough are used in ACRN project. Device
-emulation is discussed in :ref:`hld-io-emulation` and
-device passthrough will be discussed here.
+All emulation, para-virtualization and passthrough are used in ACRN project. Device
+emulation is discussed in :ref:`hld-io-emulation`, para-virtualization is discussed
+in :ref:`hld-virtio-devices` and device passthrough will be discussed here.

 In the ACRN project, device emulation means emulating all existing hardware
 resource through a software component device model running in the
@@ -34,15 +34,18 @@ can't support device sharing.
 Passthrough in the hypervisor provides the following functionalities to
 allow VM to access PCI devices directly:

-- DMA Remapping by VT-d for PCI device: hypervisor will setup DMA
+- VT-d DMA Remapping for PCI devices: hypervisor will setup DMA
 remapping during VM initialization phase.
+- VT-d Interrupt-remapping for PCI devices: hypervisor will enable
+VT-d interrupt-remapping for PCI devices for security considerations.
 - MMIO Remapping between virtual and physical BAR
 - Device configuration Emulation
-- Remapping interrupts for PCI device
+- Remapping interrupts for PCI devices
 - ACPI configuration Virtualization
 - GSI sharing violation check

-The following diagram details passthrough initialization control flow in ACRN:
+The following diagram details passthrough initialization control flow in ACRN
+for post-launched VM:

 .. figure:: images/passthru-image22.png
 :align: center
@@ -60,14 +63,39 @@ passthrough, as detailed here:

 Passthrough Device Status

-DMA Remapping
-*************
+Owner of Passthrough Devices
+****************************
+
+ACRN hypervisor will do PCI enumeration to discover the PCI devices on the platform.
+According to the hypervisor/VM configurations, the owner of these PCI devices can be
+one the following 4 cases:
+
+- **Hypervisor**: hypervisor uses UART device as the console in debug version for
+debug purpose, so the UART device is owned by hypervisor and is not visible
+to any VM. For now, UART is the only pci device could be owned by hypervisor.
+- **Pre-launched VM**: The passthrough devices will be used in a pre-launched VM is
+pre-defined in VM configuration. These passthrough devices are owned by the
+pre-launched VM after the VM is created. These devices will not be removed
+from the pre-launched VM. There could be pre-launched VM(s) in logical partition
+mode and hybrid mode.
+- **Service VM**: All the passthrough devices except these described above (owned by
+hypervisor or pre-launched VM(s)) are assigned to Service VM. And some of these devices
+can be assigned to a post-launched VM according to the passthrough device list
+specified in the parameters of the ACRN DM.
+- **Post-launched VM**: A list of passthrough devices can be specified in the parameters of
+the ACRN DM. When creating a post-launched VM, these specified devices will be moved
+from Service VM domain to the post-launched VM domain. After the post-launched VM is
+powered-off, these devices will be moved back to Service VM domain.
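
Reviewer note: the four ownership cases added above can be read as a simple priority order. The sketch below is only an illustration of that reading; every type, field, and function name in it is hypothetical and none of them is an ACRN identifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not ACRN identifiers. */
enum pt_owner {
    PT_OWNER_HYPERVISOR,
    PT_OWNER_PRE_LAUNCHED_VM,
    PT_OWNER_SERVICE_VM,
    PT_OWNER_POST_LAUNCHED_VM
};

struct pt_dev_cfg {
    bool is_hv_console_uart;   /* UART used as the console of a debug build */
    bool in_pre_launched_cfg;  /* listed in a pre-launched VM configuration */
    bool assigned_by_dm;       /* currently passed to a post-launched VM by the DM */
};

/* Return the current owner of a device, checking the cases in priority order. */
enum pt_owner pt_dev_owner(const struct pt_dev_cfg *cfg)
{
    if (cfg->is_hv_console_uart) {
        return PT_OWNER_HYPERVISOR;
    }
    if (cfg->in_pre_launched_cfg) {
        return PT_OWNER_PRE_LAUNCHED_VM;
    }
    if (cfg->assigned_by_dm) {
        return PT_OWNER_POST_LAUNCHED_VM;
    }
    /* Everything else stays with the Service VM. */
    return PT_OWNER_SERVICE_VM;
}

/* A device moves to a post-launched VM and returns to the Service VM at power-off. */
void pt_dev_owner_example(void)
{
    struct pt_dev_cfg dev = { false, false, true };

    assert(pt_dev_owner(&dev) == PT_OWNER_POST_LAUNCHED_VM);
    dev.assigned_by_dm = false; /* post-launched VM powered off */
    assert(pt_dev_owner(&dev) == PT_OWNER_SERVICE_VM);
}
```

The ordering reflects the text: the hypervisor and pre-launched VM cases are fixed by configuration, while the Service VM and post-launched VM cases change at run time.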
+
+
+VT-d DMA Remapping
+******************

 To enable passthrough, for VM DMA access the VM can only
 support GPA, while physical DMA requires HPA. One work-around
 is building identity mapping so that GPA is equal to HPA, but this
 is not recommended as some VM don't support relocation well. To
-address this issue, Intel introduces VT-d in chipset to add one
+address this issue, Intel introduces VT-d in the chipset to add one
 remapping engine to translate GPA to HPA for DMA operations.

 Each VT-d engine (DMAR Unit), maintains a remapping structure
@@ -76,21 +104,16 @@ page table for GPA/HPA translation as output. The GPA/HPA translation
 page table is similar to a normal multi-level page table.

 VM DMA depends on Intel VT-d to do the translation from GPA to HPA, so we
-need to enable VT-d IOMMU engine in ACRN before we can passthrough any device. SOS
+need to enable VT-d IOMMU engine in ACRN before we can passthrough any device. Service VM
 in ACRN is a VM running in non-root mode which also depends
-on VT-d to access a device. In SOS DMA remapping
+on VT-d to access a device. In Service VM DMA remapping
 engine settings, GPA is equal to HPA.

 ACRN hypervisor checks DMA-Remapping Hardware unit Definition (DRHD) in
 host DMAR ACPI table to get basic info, then sets up each DMAR unit. For
 simplicity, ACRN reuses EPT table as the translation table in DMAR
-unit for each passthrough device. The control flow is shown in the
-following figures:
-
-.. figure:: images/passthru-image72.png
-:align: center
-
-DMA Remapping control flow during HV init
+unit for each passthrough device. The control flow of assigning and de-assigning
+a passthrough device to/from a post-launched VM is shown in the following figures:

 .. figure:: images/passthru-image86.png
 :align: center
@@ -102,25 +125,45 @@ following figures:

 ptdev de-assignment control flow

+VT-d Interrupt-remapping
+************************
+
+The VT-d interrupt-remapping architecture enables system software to
+control and censor external interrupt requests generated by all sources
+including those from interrupt controllers (I/OxAPICs), MSI/MSI-X capable
+devices including endpoints, root-ports and Root-Complex integrated
+end-points.
+ACRN forces to enabled VT-d interrupt-remapping feature for security reasons.
+If the VT-d hardware doesn't support interrupt-remapping, then ACRN will
+refuse to boot VMs.
+VT-d Interrupt-remapping is NOT related to the translation from physical
+interrupt to virtual interrupt or vice versa. The term VT-d interrupt-remapping
+remaps the interrupt index in the VT-d interrupt-remapping table to the physical
+interrupt vector after checking the external interrupt request is valid. Translation
+physical vector to virtual vector is still needed to be done by hypervisor, which is
+also described in the below section :ref:`_interrupt-remapping`.

 MMIO Remapping
 **************

 For PCI MMIO BAR, hypervisor builds EPT mapping between virtual BAR and
 physical BAR, then VM can access MMIO directly.
+There is one exception, MSI-X table is also in a MMIO BAR. Hypervisor needs to trap the
+accesses to MSI-X table. So the page(s) having MSI-X table should not be accessed by guest
+directly. EPT mapping is not built for these pages having MSI-X table.
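
Reviewer note: the MSI-X exception added above amounts to punching a page-aligned hole in the BAR mapping. The following sketch shows one way to compute the sub-ranges that may still be mapped through EPT. It assumes 4KB pages, a page-aligned BAR base, and an MSI-X table that lies inside the BAR; the function and type names are hypothetical, not ACRN code.

```c
#include <assert.h>
#include <stdint.h>

#define PT_PAGE_SIZE 4096ULL

/* Half-open address range [start, end). Hypothetical helper type. */
struct pt_range {
    uint64_t start;
    uint64_t end;
};

static uint64_t pt_align_down(uint64_t v)
{
    return v & ~(PT_PAGE_SIZE - 1ULL);
}

static uint64_t pt_align_up(uint64_t v)
{
    return (v + PT_PAGE_SIZE - 1ULL) & ~(PT_PAGE_SIZE - 1ULL);
}

/*
 * Compute the parts of a BAR that can be mapped directly, leaving out
 * every 4KB page that holds part of the MSI-X table.
 * Returns the number of ranges written to out (0, 1 or 2).
 */
int pt_bar_mappable_ranges(uint64_t bar_base, uint64_t bar_size,
                           uint64_t table_off, uint64_t table_size,
                           struct pt_range out[2])
{
    uint64_t bar_end = bar_base + bar_size;
    uint64_t hole_start = pt_align_down(bar_base + table_off);
    uint64_t hole_end = pt_align_up(bar_base + table_off + table_size);
    int n = 0;

    if (hole_end > bar_end) {
        hole_end = bar_end;
    }
    if (hole_start > bar_base) {
        out[n++] = (struct pt_range){ bar_base, hole_start };
    }
    if (hole_end < bar_end) {
        out[n++] = (struct pt_range){ hole_end, bar_end };
    }
    return n;
}

/* 16KB BAR with a 128-byte MSI-X table at offset 0x2000: one page is skipped. */
void pt_bar_mappable_ranges_example(void)
{
    struct pt_range r[2];
    int n = pt_bar_mappable_ranges(0xF0000000ULL, 0x4000ULL, 0x2000ULL, 0x80ULL, r);

    assert(n == 2);
    assert(r[0].start == 0xF0000000ULL && r[0].end == 0xF0002000ULL);
    assert(r[1].start == 0xF0003000ULL && r[1].end == 0xF0004000ULL);
}
```

Accesses that fall into the skipped page are the ones the hypervisor traps.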

 Device configuration emulation
 ******************************

 PCI configuration is based on access of port 0xCF8/CFC. ACRN
 implements PCI configuration emulation to handle 0xCF8/CFC to control
-PCI device through two paths: implemented in hypervisor or in SOS device
+PCI device through two paths: implemented in hypervisor or in Service VM device
 model.

 - When configuration emulation is in the hypervisor, the interception of
 0xCF8/CFC port and emulation of PCI configuration space access are
 tricky and unclean. Therefore the final solution is to reuse the
-PCI emulation infrastructure of SOS device model. The hypervisor
+PCI emulation infrastructure of Service VM device model. The hypervisor
 routes the UOS 0xCF8/CFC access to device model, and keeps blind to the
 physical PCI devices. Upon receiving UOS PCI configuration space access
 request, device model needs to emulate some critical space, for instance,
@@ -131,6 +174,25 @@ model.
 this, device model is linked with lib pci access to access physical PCI
 device.
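
Reviewer note: for readers unfamiliar with the 0xCF8/CFC mechanism mentioned in this section, the sketch below decodes a value written to port 0xCF8 using the standard PCI configuration address layout (enable bit 31, bus in bits 23:16, device in bits 15:11, function in bits 10:8, dword-aligned register in bits 7:2). The struct and function names are hypothetical and are not taken from the hypervisor or the device model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded form of a 0xCF8 write. Hypothetical type for illustration. */
struct pci_cfg_addr {
    bool enable;   /* bit 31: configuration cycle enabled */
    uint8_t bus;   /* bits 23:16 */
    uint8_t dev;   /* bits 15:11 */
    uint8_t func;  /* bits 10:8 */
    uint8_t reg;   /* bits 7:2, kept as a dword-aligned byte offset */
};

struct pci_cfg_addr pci_decode_cf8(uint32_t cf8)
{
    struct pci_cfg_addr a;

    a.enable = (cf8 & 0x80000000U) != 0U;
    a.bus = (uint8_t)((cf8 >> 16) & 0xFFU);
    a.dev = (uint8_t)((cf8 >> 11) & 0x1FU);
    a.func = (uint8_t)((cf8 >> 8) & 0x7U);
    a.reg = (uint8_t)(cf8 & 0xFCU);
    return a;
}

/* 0x80001810 selects bus 0, device 3, function 0, register 0x10 (BAR0). */
void pci_decode_cf8_example(void)
{
    struct pci_cfg_addr a = pci_decode_cf8(0x80001810U);

    assert(a.enable);
    assert(a.bus == 0U && a.dev == 3U && a.func == 0U && a.reg == 0x10U);
}
```

The data itself is then read or written through port 0xCFC; an emulator would use the decoded fields to pick the virtual device and register.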

+MSI-X table emulation
+*********************
+
+VM accesses to MSI-X table should be trapped so that hypervisor has the
+information to map the virtual vector and physical vector. EPT mapping should
+be skipped for the 4KB pages having MSI-X table.
+
+There are three situations for the emulation of MSI-X table:
+
+- **Service VM**: accesses to MSI-X table are handled by HV MMIO handler (4KB adjusted up
+and down). HV will remap interrupts.
+- **Post-launched VM**: accesses to MSI-X Tables are handled by DM MMIO handler
+(4KB adjusted up and down) and when DM (Service VM) writes to the table, it will be
+intercepted by HV MMIO handler and HV will remap interrupts.
+- **Pre-launched VM**: Writes to MMIO region in MSI-X Table BAR handled by HV MMIO
+handler. If the offset falls within the MSI-X table (offset, offset+tables_size),
+HV remaps interrupts.
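
Reviewer note: the pre-launched VM case above hinges on testing whether a trapped offset lands inside the MSI-X table. A minimal sketch of that test follows, assuming the 16-byte MSI-X table entry layout from the PCI specification (message address low, message address high, message data, vector control). The names are hypothetical and the sketch does not reproduce the actual handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16U /* bytes per MSI-X table entry */

/* Which entry and which byte within it a trapped access refers to. */
struct msix_hit {
    uint32_t index; /* table entry number */
    uint32_t field; /* byte offset inside the entry: 0, 4, 8 or 12 for dword accesses */
};

/*
 * Return true when bar_off falls inside [table_off, table_off + table size)
 * and fill in the entry index and field offset.
 */
bool msix_table_lookup(uint64_t bar_off, uint64_t table_off,
                       uint32_t entry_count, struct msix_hit *hit)
{
    uint64_t table_size = (uint64_t)entry_count * MSIX_ENTRY_SIZE;

    if ((bar_off < table_off) || (bar_off >= (table_off + table_size))) {
        return false; /* same BAR page, but not the table itself */
    }
    hit->index = (uint32_t)((bar_off - table_off) / MSIX_ENTRY_SIZE);
    hit->field = (uint32_t)((bar_off - table_off) % MSIX_ENTRY_SIZE);
    return true;
}

/* Offset 0x2018 with the table at 0x2000 is the data field of entry 1. */
void msix_table_lookup_example(void)
{
    struct msix_hit hit;

    assert(msix_table_lookup(0x2018U, 0x2000U, 8U, &hit));
    assert(hit.index == 1U && hit.field == 8U);
}
```

Only accesses for which the lookup succeeds would need interrupt remapping; other offsets in the trapped page can be forwarded unchanged.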
+
+
 .. _interrupt-remapping:

 Interrupt Remapping
@@ -152,17 +214,17 @@ The hypervisor will record different information for interrupt
 distribution: physical and virtual IOAPIC pin for IOAPIC source,
 physical and virtual BDF and other info for MSI source.

-SOS passthrough is also in the scope of interrupt remapping which is
+Service VM passthrough is also in the scope of interrupt remapping which is
 done on-demand rather than on hypervisor initialization.

 .. figure:: images/passthru-image102.png
 :align: center
 :name: init-remapping

-Initialization of remapping of virtual IOAPIC interrupts for SOS
+Initialization of remapping of virtual IOAPIC interrupts for Service VM

 :numref:`init-remapping` above illustrates how remapping of (virtual) IOAPIC
-interrupts are remapped for SOS. VM exit occurs whenever SOS tries to
+interrupts are remapped for Service VM. VM exit occurs whenever Service VM tries to
 unmask an interrupt in (virtual) IOAPIC by writing to the Redirection
 Table Entry (or RTE). The hypervisor then invokes the IOAPIC emulation
 handler (refer to :ref:`hld-io-emulation` for details on I/O emulation) which
@@ -173,13 +235,13 @@ Remapping of (virtual) PIC interrupts are set up in a similar sequence:
 .. figure:: images/passthru-image98.png
 :align: center

-Initialization of remapping of virtual MSI for SOS
+Initialization of remapping of virtual MSI for Service VM

 This figure illustrates how mappings of MSI or MSIX are set up for
-SOS. SOS is responsible for issuing an hypercall to notify the
+Service VM. Service VM is responsible for issuing a hypercall to notify the
 hypervisor before it configures the PCI configuration space to enable an
 MSI. The hypervisor takes this opportunity to set up a remapping for the
-given MSI or MSIX before it is actually enabled by SOS.
+given MSI or MSIX before it is actually enabled by Service VM.

 When the UOS needs to access the physical device by passthrough, it uses
 the following steps:
@@ -191,15 +253,15 @@ the following steps:
 according to ptirq_remapping_info.
 - Hypervisor delivers the interrupt to UOS.

-When the SOS needs to use the physical device, the passthrough is also
-active because the SOS is the first VM. The detail steps are:
+When the Service VM needs to use the physical device, the passthrough is also
+active because the Service VM is the first VM. The detail steps are:

-- SOS get all physical interrupts. It assigns different interrupts for
+- Service VM get all physical interrupts. It assigns different interrupts for
 different VMs during initialization and reassign when a VM is created or
 deleted.
 - When physical interrupt is trapped, an exception will happen after VMCS
 has been set.
-- Hypervisor will handle the vm exit issue according to
+- Hypervisor will handle the VM exit issue according to
 ptirq_remapping_info and translates the vector.
 - The interrupt will be injected the same as a virtual interrupt.

@@ -209,32 +271,40 @@ ACPI Virtualization
 ACPI virtualization is designed in ACRN with these assumptions:

 - HV has no knowledge of ACPI,
-- SOS owns all physical ACPI resources,
+- Service VM owns all physical ACPI resources,
 - UOS sees virtual ACPI resources emulated by device model.

 Some passthrough devices require physical ACPI table entry for
 initialization. The device model will create such device entry based on
 the physical one according to vendor ID and device ID. Virtualization is
-implemented in SOS device model and not in scope of the hypervisor.
+implemented in Service VM device model and not in scope of the hypervisor.
+For pre-launched VM, ACRN hypervisor doesn't support the ACPI virtualization,
+so devices relying on ACPI table are not supported.

 GSI Sharing Violation Check
 ***************************

 All the PCI devices that are sharing the same GSI should be assigned to
-the same VM to avoid physical GSI sharing between multiple VMs. For
-devices that don't support MSI, ACRN DM
-shares the same GSI pin to a GSI
+the same VM to avoid physical GSI sharing between multiple VMs.
+In logical partition mode or hybrid mode, the PCI devices assigned to
+pre-launched VM is statically pre-defined. Developers should take care not to
+violate the rule.
+For post-launched VM, devices that don't support MSI, ACRN DM puts the devices
+sharing the same GSI pin to a GSI
 sharing group. The devices in the same group should be assigned together to
 the current VM, otherwise, none of them should be assigned to the
 current VM. A device that violates the rule will be rejected to be
-passthrough. The checking logic is implemented in Device Mode and not
+passed-through. The checking logic is implemented in Device Model and not
 in scope of hypervisor.
+The platform GSI information is in devicemodel/hw/pci/platform_gsi_info.c
+for limited platform (currently, only APL MRB). For other platforms, the platform
+specific GSI information should be added to activate the checking of GSI sharing violation.
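
Reviewer note: the "all or none" rule for a GSI sharing group described above can be stated compactly in code. The sketch below is an illustration of the rule only, not the Device Model implementation; devices are identified by a 16-bit BDF value, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true when bdf appears in list. */
static bool bdf_in_list(const uint16_t *list, size_t len, uint16_t bdf)
{
    for (size_t i = 0; i < len; i++) {
        if (list[i] == bdf) {
            return true;
        }
    }
    return false;
}

/*
 * A GSI sharing group may be assigned to the current VM only as a whole:
 * either every member is in the requested passthrough list or none is.
 */
bool gsi_group_assignable(const uint16_t *group, size_t group_len,
                          const uint16_t *requested, size_t requested_len)
{
    size_t hits = 0;

    for (size_t i = 0; i < group_len; i++) {
        if (bdf_in_list(requested, requested_len, group[i])) {
            hits++;
        }
    }
    return (hits == 0) || (hits == group_len);
}

/* Two devices share one GSI; requesting only one of them violates the rule. */
void gsi_group_assignable_example(void)
{
    const uint16_t group[] = { 0x00A0U, 0x00A8U };
    const uint16_t only_one[] = { 0x00A0U };
    const uint16_t both[] = { 0x00A8U, 0x00A0U };

    assert(!gsi_group_assignable(group, 2, only_one, 1));
    assert(gsi_group_assignable(group, 2, both, 2));
}
```

A request that contains no member of the group also passes, since it leaves the whole group with its current owner.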

 Data structures and interfaces
 ******************************

-The following APIs are provided to initialize interrupt remapping for
-SOS:
+The following APIs are common APIs provided to initialize interrupt remapping for
+VMs:

 .. doxygenfunction:: ptirq_intx_pin_remap
 :project: Project ACRN
@@ -242,8 +312,9 @@ SOS:
 .. doxygenfunction:: ptirq_prepare_msix_remap
 :project: Project ACRN

-The following APIs are provided to manipulate the interrupt remapping
-for UOS.
+Post-launched VM needs to pre-allocate interrupt entries during VM initialization.
+Post-launched VM needs to free interrupt entries during VM de-initialization.
+The following APIs are provided to pre-allocate/free interrupt entries for post-launched VM:

 .. doxygenfunction:: ptirq_add_intx_remapping
 :project: Project ACRN
@@ -258,3 +329,32 @@ The following APIs are provided to acknowledge a virtual interrupt.

 .. doxygenfunction:: ptirq_intx_ack
 :project: Project ACRN
+
+The following APIs are provided to handle ptdev interrupt:
+
+.. doxygenfunction:: ptdev_init
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_softirq
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_alloc_entry
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_release_entry
+:project: Project ACRN
+
+.. doxygenfunction:: ptdev_release_all_entries
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_activate_entry
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_deactivate_entry
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_dequeue_softirq
+:project: Project ACRN
+
+.. doxygenfunction:: ptirq_get_intr_data
+:project: Project ACRN
Binary file not shown. Size before: 40 KiB. Size after: 25 KiB.
Binary file not shown. Size before: 4.2 KiB.
@@ -156,14 +156,14 @@ extern spinlock_t ptdev_lock;
 /**
 * @brief Handler of softirq for passthrough device.
 *
-* When hypervisor receive a physcial interrupt from passthrough device, it
+* When hypervisor receive a physical interrupt from passthrough device, it
 * will enqueue a ptirq entry and raise softirq SOFTIRQ_PTDEV. This function
 * is the handler of the softirq, it handles the interrupt and injects the
 * virtual into VM.
-* The handler is reigstered by calling @ref ptdev_init during hypervisor
-* intialization.
+* The handler is registered by calling @ref ptdev_init during hypervisor
+* initialization.
 *
-* @param[in] pcpu_id physcial cpu id of the soft irq
+* @param[in] pcpu_id physical cpu id of the soft irq
 *
 */
 void ptirq_softirq(uint16_t pcpu_id);
@@ -178,14 +178,14 @@ void ptirq_softirq(uint16_t pcpu_id);
 */
 void ptdev_init(void);
 /**
-* @brief Deactiveate and release all ptirq entries for a VM.
+* @brief Deactivate and release all ptirq entries for a VM.
 *
-* This function deactiveates and releases all ptirq entries for a VM. The function
+* This function deactivates and releases all ptirq entries for a VM. The function
 * should only be called after the VM is already down.
 *
 * @param[in] vm acrn_vm on which the ptirq entries will be released
 *
-* @pre VM is realdy down
+* @pre VM is already down
 *
 */
 void ptdev_release_all_entries(const struct acrn_vm *vm);
@@ -193,9 +193,9 @@ void ptdev_release_all_entries(const struct acrn_vm *vm);
 /**
 * @brief Dequeue an entry from per cpu ptdev softirq queue.
 *
-* Dequeue an entry from the ptdev softirq queue on the specific physcial cpu.
+* Dequeue an entry from the ptdev softirq queue on the specific physical cpu.
 *
-* @param[in] pcpu_id physcial cpu id
+* @param[in] pcpu_id physical cpu id
 *
 * @retval NULL when \p when the queue is empty
 * @retval !NULL when \p there is available ptirq_remapping_info entry in the queue
@@ -227,11 +227,11 @@ void ptirq_release_entry(struct ptirq_remapping_info *entry);
 /**
 * @brief Activate a irq for the associated passthrough device.
 *
-* After activating the ptirq entry, the physcial interrupt irq of passthrough device will be handled
+* After activating the ptirq entry, the physical interrupt irq of passthrough device will be handled
 * by the handler ptirq_interrupt_handler.
 *
-* @param[in] entry the ptirq_remapping_info entry that will be associated with the physcial irq.
-* @param[in] phys_irq physcial interrupt irq for the entry
+* @param[in] entry the ptirq_remapping_info entry that will be associated with the physical irq.
+* @param[in] phys_irq physical interrupt irq for the entry
 *
 * @retval success when \p return value >=0
 * @retval success when \p return <0
@@ -252,7 +252,7 @@ void ptirq_deactivate_entry(struct ptirq_remapping_info *entry);
 * @param[out] buffer the buffer to interrupt information stored to.
 * @param[in] buffer_cnt the size of the buffer.
 *
-* @retval the actural size the buffer filled with the interrupt information
+* @retval the actual size the buffer filled with the interrupt information
 *
 */
 uint32_t ptirq_get_intr_data(const struct acrn_vm *target_vm, uint64_t *buffer, uint32_t buffer_cnt);