doc: update hv device passthrough document

Fixed misspellings and rst formatting issues.
Added ptdev.h to the list of include files for doxygen

Tracked-On: #3882
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
Binbin Wu 2019-10-18 16:21:23 +08:00 committed by deb-intel
parent b05c1afa0b
commit 76f2e28e13
5 changed files with 154 additions and 53 deletions

@ -808,6 +808,7 @@ INPUT = custom-doxygen/mainpage.md \
../hypervisor/include/arch/x86/guest/vmx_io.h \
../hypervisor/include/arch/x86/guest/assign.h \
../hypervisor/include/common/hypercall.h \
../hypervisor/include/common/ptdev.h \
../hypervisor/include/public/acrn_common.h \
../hypervisor/include/public/acrn_hv_defs.h \
../hypervisor/include/arch/x86/guest/vcpu.h \

@ -7,9 +7,9 @@ A critical part of virtualization is virtualizing devices: exposing all
aspects of a device including its I/O, interrupts, DMA, and configuration.
There are three typical device
virtualization methods: emulation, para-virtualization, and passthrough.
Both emulation and passthrough are used in ACRN project. Device
emulation is discussed in :ref:`hld-io-emulation` and
device passthrough will be discussed here.
Emulation, para-virtualization, and passthrough are all used in the ACRN project. Device
emulation is discussed in :ref:`hld-io-emulation`, para-virtualization is discussed
in :ref:`hld-virtio-devices`, and device passthrough is discussed here.
In the ACRN project, device emulation means emulating all existing hardware
resource through a software component device model running in the
@ -34,15 +34,18 @@ can't support device sharing.
Passthrough in the hypervisor provides the following functionalities to
allow VM to access PCI devices directly:
- DMA Remapping by VT-d for PCI device: hypervisor will setup DMA
- VT-d DMA Remapping for PCI devices: hypervisor will set up DMA
remapping during VM initialization phase.
- VT-d Interrupt-remapping for PCI devices: hypervisor will enable
VT-d interrupt-remapping for PCI devices for security considerations.
- MMIO Remapping between virtual and physical BAR
- Device configuration Emulation
- Remapping interrupts for PCI device
- Remapping interrupts for PCI devices
- ACPI configuration Virtualization
- GSI sharing violation check
The following diagram details passthrough initialization control flow in ACRN:
The following diagram details passthrough initialization control flow in ACRN
for post-launched VM:
.. figure:: images/passthru-image22.png
:align: center
@ -60,14 +63,39 @@ passthrough, as detailed here:
Passthrough Device Status
DMA Remapping
*************
Owner of Passthrough Devices
****************************
ACRN hypervisor will do PCI enumeration to discover the PCI devices on the platform.
According to the hypervisor/VM configurations, the owner of these PCI devices falls into
one of the following four cases:
- **Hypervisor**: the hypervisor uses the UART device as the console in the debug version,
so the UART device is owned by the hypervisor and is not visible
to any VM. For now, UART is the only PCI device that can be owned by the hypervisor.
- **Pre-launched VM**: The passthrough devices to be used in a pre-launched VM are
predefined in the VM configuration. These passthrough devices are owned by the
pre-launched VM after the VM is created. These devices will not be removed
from the pre-launched VM. There could be pre-launched VM(s) in logical partition
mode and hybrid mode.
- **Service VM**: All the passthrough devices except those described above (owned by
the hypervisor or pre-launched VM(s)) are assigned to the Service VM. Some of these devices
can be assigned to a post-launched VM according to the passthrough device list
specified in the parameters of the ACRN DM.
- **Post-launched VM**: A list of passthrough devices can be specified in the parameters of
the ACRN DM. When creating a post-launched VM, these specified devices will be moved
from Service VM domain to the post-launched VM domain. After the post-launched VM is
powered-off, these devices will be moved back to Service VM domain.
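The movement of a device between the Service VM domain and a post-launched VM domain can be pictured as a small state transition. The following is a minimal sketch only, assuming hypothetical names (``pt_owner``, ``pt_dev``, ``pt_dev_assign``, ``pt_dev_deassign``) that are not part of the ACRN code base:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owner domains, mirroring the four cases listed above. */
enum pt_owner {
    PT_OWNER_HYPERVISOR,
    PT_OWNER_PRE_LAUNCHED_VM,
    PT_OWNER_SERVICE_VM,
    PT_OWNER_POST_LAUNCHED_VM
};

/* Hypothetical per-device record; this is not an ACRN structure. */
struct pt_dev {
    enum pt_owner owner;
};

/* Move a device from the Service VM domain to a post-launched VM domain.
 * Only devices currently owned by the Service VM are eligible. */
bool pt_dev_assign(struct pt_dev *dev)
{
    assert(dev != NULL);
    if (dev->owner != PT_OWNER_SERVICE_VM) {
        return false;
    }
    dev->owner = PT_OWNER_POST_LAUNCHED_VM;
    return true;
}

/* Move a device back to the Service VM domain, as happens after the
 * post-launched VM is powered off. */
bool pt_dev_deassign(struct pt_dev *dev)
{
    assert(dev != NULL);
    if (dev->owner != PT_OWNER_POST_LAUNCHED_VM) {
        return false;
    }
    dev->owner = PT_OWNER_SERVICE_VM;
    return true;
}
```

In this sketch, devices owned by the hypervisor or by a pre-launched VM are never eligible for assignment, which matches the statement that those devices are not removed from their owner.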
VT-d DMA Remapping
******************
To enable passthrough, for VM DMA access the VM can only
support GPA, while physical DMA requires HPA. One work-around
is building identity mapping so that GPA is equal to HPA, but this
is not recommended as some VM don't support relocation well. To
address this issue, Intel introduces VT-d in chipset to add one
address this issue, Intel introduces VT-d in the chipset to add one
remapping engine to translate GPA to HPA for DMA operations.
Each VT-d engine (DMAR Unit), maintains a remapping structure
@ -76,21 +104,16 @@ page table for GPA/HPA translation as output. The GPA/HPA translation
page table is similar to a normal multi-level page table.
VM DMA depends on Intel VT-d to do the translation from GPA to HPA, so we
need to enable VT-d IOMMU engine in ACRN before we can passthrough any device. SOS
need to enable VT-d IOMMU engine in ACRN before we can passthrough any device. Service VM
in ACRN is a VM running in non-root mode which also depends
on VT-d to access a device. In SOS DMA remapping
on VT-d to access a device. In Service VM DMA remapping
engine settings, GPA is equal to HPA.
ACRN hypervisor checks DMA-Remapping Hardware unit Definition (DRHD) in
host DMAR ACPI table to get basic info, then sets up each DMAR unit. For
simplicity, ACRN reuses EPT table as the translation table in DMAR
unit for each passthrough device. The control flow is shown in the
following figures:
.. figure:: images/passthru-image72.png
:align: center
DMA Remapping control flow during HV init
unit for each passthrough device. The control flow of assigning and de-assigning
a passthrough device to/from a post-launched VM is shown in the following figures:
.. figure:: images/passthru-image86.png
:align: center
@ -102,25 +125,45 @@ following figures:
ptdev de-assignment control flow
VT-d Interrupt-remapping
************************
The VT-d interrupt-remapping architecture enables system software to
control and censor external interrupt requests generated by all sources
including those from interrupt controllers (I/OxAPICs), MSI/MSI-X capable
devices including endpoints, root-ports and Root-Complex integrated
end-points.
ACRN enforces the use of the VT-d interrupt-remapping feature for security reasons.
If the VT-d hardware doesn't support interrupt-remapping, then ACRN will
refuse to boot VMs.
VT-d interrupt-remapping is NOT related to the translation from physical
interrupt to virtual interrupt or vice versa. VT-d interrupt-remapping
maps the interrupt index in the VT-d interrupt-remapping table to the physical
interrupt vector after checking that the external interrupt request is valid. Translation
of the physical vector to the virtual vector still needs to be done by the hypervisor, which is
described in the section :ref:`interrupt-remapping` below.
MMIO Remapping
**************
For PCI MMIO BAR, hypervisor builds EPT mapping between virtual BAR and
physical BAR, then VM can access MMIO directly.
There is one exception: the MSI-X table is also in an MMIO BAR. The hypervisor needs to trap
accesses to the MSI-X table, so the page(s) containing the MSI-X table must not be accessed by the guest
directly. EPT mapping is not built for the pages containing the MSI-X table.
Device configuration emulation
******************************
PCI configuration is based on access of port 0xCF8/CFC. ACRN
implements PCI configuration emulation to handle 0xCF8/CFC to control
PCI device through two paths: implemented in hypervisor or in SOS device
PCI device through two paths: implemented in hypervisor or in Service VM device
model.
- When configuration emulation is in the hypervisor, the interception of
0xCF8/CFC port and emulation of PCI configuration space access are
tricky and unclean. Therefore the final solution is to reuse the
PCI emulation infrastructure of SOS device model. The hypervisor
PCI emulation infrastructure of Service VM device model. The hypervisor
routes the UOS 0xCF8/CFC access to device model, and keeps blind to the
physical PCI devices. Upon receiving UOS PCI configuration space access
request, device model needs to emulate some critical space, for instance,
@ -131,6 +174,25 @@ model.
this, device model is linked with lib pci access to access physical PCI
device.
MSI-X table emulation
*********************
VM accesses to the MSI-X table should be trapped so that the hypervisor has the
information to map the virtual vector to the physical vector. EPT mapping should
be skipped for the 4KB pages containing the MSI-X table.
There are three situations for the emulation of MSI-X table:
- **Service VM**: accesses to MSI-X table are handled by HV MMIO handler (4KB adjusted up
and down). HV will remap interrupts.
- **Post-launched VM**: accesses to MSI-X Tables are handled by DM MMIO handler
(4KB adjusted up and down) and when DM (Service VM) writes to the table, it will be
intercepted by HV MMIO handler and HV will remap interrupts.
- **Pre-launched VM**: Writes to the MMIO region in the MSI-X table BAR are handled by the HV MMIO
handler. If the offset falls within the MSI-X table (offset, offset+tables_size),
HV remaps interrupts.
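The "4KB adjusted up and down" range and the offset check mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the ACRN implementation: ``msix_table_region`` and the three helper functions are hypothetical names, and the table range is treated here as half-open, i.e. [offset, offset + size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K UINT64_C(0x1000)

/* Hypothetical description of where the MSI-X table sits inside its BAR. */
struct msix_table_region {
    uint64_t bar_base;     /* base address of the BAR holding the table */
    uint64_t table_offset; /* offset of the table from bar_base */
    uint64_t table_size;   /* size of the table in bytes */
};

/* Start of the trapped range: table start rounded down to a 4KB boundary. */
uint64_t msix_trap_start(const struct msix_table_region *r)
{
    assert(r != NULL);
    return (r->bar_base + r->table_offset) & ~(PAGE_SIZE_4K - 1U);
}

/* End (exclusive) of the trapped range: table end rounded up to a 4KB boundary. */
uint64_t msix_trap_end(const struct msix_table_region *r)
{
    assert(r != NULL);
    uint64_t end = r->bar_base + r->table_offset + r->table_size;
    return (end + PAGE_SIZE_4K - 1U) & ~(PAGE_SIZE_4K - 1U);
}

/* True when a BAR-relative offset falls inside the MSI-X table itself. */
bool msix_offset_in_table(const struct msix_table_region *r, uint64_t offset)
{
    assert(r != NULL);
    return (offset >= r->table_offset) &&
           (offset < (r->table_offset + r->table_size));
}
```

Pages inside the range returned by the first two helpers would be left out of the EPT mapping so that accesses trap, while the third helper distinguishes table accesses from other registers that happen to share those pages.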
.. _interrupt-remapping:
Interrupt Remapping
@ -152,17 +214,17 @@ The hypervisor will record different information for interrupt
distribution: physical and virtual IOAPIC pin for IOAPIC source,
physical and virtual BDF and other info for MSI source.
SOS passthrough is also in the scope of interrupt remapping which is
Service VM passthrough is also in the scope of interrupt remapping which is
done on-demand rather than on hypervisor initialization.
.. figure:: images/passthru-image102.png
:align: center
:name: init-remapping
Initialization of remapping of virtual IOAPIC interrupts for SOS
Initialization of remapping of virtual IOAPIC interrupts for Service VM
:numref:`init-remapping` above illustrates how (virtual) IOAPIC
interrupts are remapped for SOS. VM exit occurs whenever SOS tries to
interrupts are remapped for Service VM. VM exit occurs whenever Service VM tries to
unmask an interrupt in (virtual) IOAPIC by writing to the Redirection
Table Entry (or RTE). The hypervisor then invokes the IOAPIC emulation
handler (refer to :ref:`hld-io-emulation` for details on I/O emulation) which
@ -173,13 +235,13 @@ Remapping of (virtual) PIC interrupts are set up in a similar sequence:
.. figure:: images/passthru-image98.png
:align: center
Initialization of remapping of virtual MSI for SOS
Initialization of remapping of virtual MSI for Service VM
This figure illustrates how mappings of MSI or MSIX are set up for
SOS. SOS is responsible for issuing an hypercall to notify the
Service VM. Service VM is responsible for issuing a hypercall to notify the
hypervisor before it configures the PCI configuration space to enable an
MSI. The hypervisor takes this opportunity to set up a remapping for the
given MSI or MSIX before it is actually enabled by SOS.
given MSI or MSIX before it is actually enabled by Service VM.
When the UOS needs to access the physical device by passthrough, it uses
the following steps:
@ -191,15 +253,15 @@ the following steps:
according to ptirq_remapping_info.
- Hypervisor delivers the interrupt to UOS.
When the SOS needs to use the physical device, the passthrough is also
active because the SOS is the first VM. The detail steps are:
When the Service VM needs to use the physical device, the passthrough is also
active because the Service VM is the first VM. The detailed steps are:
- SOS get all physical interrupts. It assigns different interrupts for
- Service VM gets all physical interrupts. It assigns different interrupts for
different VMs during initialization and reassigns them when a VM is created or
deleted.
- When physical interrupt is trapped, an exception will happen after VMCS
has been set.
- Hypervisor will handle the vm exit issue according to
- Hypervisor handles the VM exit according to
ptirq_remapping_info and translates the vector.
- The interrupt will be injected the same as a virtual interrupt.
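The vector translation step in the lists above amounts to a lookup from a physical vector to the virtual vector and target VM recorded for it. The following is a minimal sketch, assuming a hypothetical and heavily simplified ``remap_entry`` record; it does not reflect the actual layout of ``ptirq_remapping_info``:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified remapping record used only for illustration. */
struct remap_entry {
    uint32_t phys_vector; /* vector seen on the physical platform */
    uint32_t virt_vector; /* vector to inject into the guest */
    uint16_t vm_id;       /* VM that owns the interrupt source */
};

/* Find the remapping record for a physical vector.
 * Returns NULL when no record exists for that vector. */
const struct remap_entry *remap_lookup(const struct remap_entry *table,
                                       size_t count, uint32_t phys_vector)
{
    assert((table != NULL) || (count == 0U));
    for (size_t i = 0U; i < count; i++) {
        if (table[i].phys_vector == phys_vector) {
            return &table[i];
        }
    }
    return NULL;
}
```

A caller would use the returned record to pick the target VM and inject ``virt_vector`` as a virtual interrupt; a ``NULL`` result means the physical interrupt has no passthrough mapping.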
@ -209,32 +271,40 @@ ACPI Virtualization
ACPI virtualization is designed in ACRN with these assumptions:
- HV has no knowledge of ACPI,
- SOS owns all physical ACPI resources,
- Service VM owns all physical ACPI resources,
- UOS sees virtual ACPI resources emulated by device model.
Some passthrough devices require physical ACPI table entry for
initialization. The device model will create such device entry based on
the physical one according to vendor ID and device ID. Virtualization is
implemented in SOS device model and not in scope of the hypervisor.
implemented in Service VM device model and not in scope of the hypervisor.
For pre-launched VM, ACRN hypervisor doesn't support the ACPI virtualization,
so devices relying on ACPI table are not supported.
GSI Sharing Violation Check
***************************
All the PCI devices that are sharing the same GSI should be assigned to
the same VM to avoid physical GSI sharing between multiple VMs. For
devices that don't support MSI, ACRN DM
shares the same GSI pin to a GSI
the same VM to avoid physical GSI sharing between multiple VMs.
In logical partition mode or hybrid mode, the PCI devices assigned to
a pre-launched VM are statically predefined. Developers should take care not to
violate the rule.
For a post-launched VM, ACRN DM puts devices that don't support MSI and
share the same GSI pin into a GSI
sharing group. The devices in the same group should be assigned together to
the current VM, otherwise, none of them should be assigned to the
current VM. A device that violates the rule will be rejected to be
passthrough. The checking logic is implemented in Device Mode and not
passed-through. The checking logic is implemented in Device Model and not
in scope of hypervisor.
The platform GSI information is in devicemodel/hw/pci/platform_gsi_info.c
for a limited set of platforms (currently, only APL MRB). For other platforms, platform-specific
GSI information must be added to activate the GSI sharing violation check.
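The rule that devices in one GSI sharing group must be assigned together or not at all can be expressed as a pairwise check. The sketch below is an assumption-laden illustration, not the Device Model code: ``gsi_dev`` and ``gsi_sharing_ok`` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record used only for illustration. */
struct gsi_dev {
    uint8_t gsi;   /* GSI pin used by the device */
    bool assigned; /* true if requested for passthrough to the current VM */
};

/* Returns true when, for every GSI, the devices sharing it are either
 * all assigned to the current VM or all left unassigned. */
bool gsi_sharing_ok(const struct gsi_dev *devs, size_t count)
{
    assert((devs != NULL) || (count == 0U));
    for (size_t i = 0U; i < count; i++) {
        for (size_t j = i + 1U; j < count; j++) {
            if ((devs[i].gsi == devs[j].gsi) &&
                (devs[i].assigned != devs[j].assigned)) {
                return false; /* the group is split across owners */
            }
        }
    }
    return true;
}
```

The quadratic loop is chosen for brevity; with the small device counts of a single platform this is unlikely to matter, and a real implementation may group devices differently.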
Data structures and interfaces
******************************
The following APIs are provided to initialize interrupt remapping for
SOS:
The following common APIs are provided to initialize interrupt remapping for
VMs:
.. doxygenfunction:: ptirq_intx_pin_remap
:project: Project ACRN
@ -242,8 +312,9 @@ SOS:
.. doxygenfunction:: ptirq_prepare_msix_remap
:project: Project ACRN
The following APIs are provided to manipulate the interrupt remapping
for UOS.
A post-launched VM needs to pre-allocate interrupt entries during VM initialization
and free them during VM de-initialization.
The following APIs are provided to pre-allocate/free interrupt entries for a post-launched VM:
.. doxygenfunction:: ptirq_add_intx_remapping
:project: Project ACRN
@ -258,3 +329,32 @@ The following APIs are provided to acknowledge a virtual interrupt.
.. doxygenfunction:: ptirq_intx_ack
:project: Project ACRN
The following APIs are provided to handle ptdev interrupts:
.. doxygenfunction:: ptdev_init
:project: Project ACRN
.. doxygenfunction:: ptirq_softirq
:project: Project ACRN
.. doxygenfunction:: ptirq_alloc_entry
:project: Project ACRN
.. doxygenfunction:: ptirq_release_entry
:project: Project ACRN
.. doxygenfunction:: ptdev_release_all_entries
:project: Project ACRN
.. doxygenfunction:: ptirq_activate_entry
:project: Project ACRN
.. doxygenfunction:: ptirq_deactivate_entry
:project: Project ACRN
.. doxygenfunction:: ptirq_dequeue_softirq
:project: Project ACRN
.. doxygenfunction:: ptirq_get_intr_data
:project: Project ACRN

Binary file not shown.

Binary file not shown.

@ -156,14 +156,14 @@ extern spinlock_t ptdev_lock;
/**
* @brief Handler of softirq for passthrough device.
*
* When hypervisor receive a physcial interrupt from passthrough device, it
* When hypervisor receives a physical interrupt from passthrough device, it
* will enqueue a ptirq entry and raise softirq SOFTIRQ_PTDEV. This function
* is the handler of the softirq, it handles the interrupt and injects the
* virtual interrupt into VM.
* The handler is reigstered by calling @ref ptdev_init during hypervisor
* intialization.
* The handler is registered by calling @ref ptdev_init during hypervisor
* initialization.
*
* @param[in] pcpu_id physcial cpu id of the soft irq
* @param[in] pcpu_id physical cpu id of the soft irq
*
*/
void ptirq_softirq(uint16_t pcpu_id);
@ -178,14 +178,14 @@ void ptirq_softirq(uint16_t pcpu_id);
*/
void ptdev_init(void);
/**
* @brief Deactiveate and release all ptirq entries for a VM.
* @brief Deactivate and release all ptirq entries for a VM.
*
* This function deactiveates and releases all ptirq entries for a VM. The function
* This function deactivates and releases all ptirq entries for a VM. The function
* should only be called after the VM is already down.
*
* @param[in] vm acrn_vm on which the ptirq entries will be released
*
* @pre VM is realdy down
* @pre VM is already down
*
*/
void ptdev_release_all_entries(const struct acrn_vm *vm);
@ -193,9 +193,9 @@ void ptdev_release_all_entries(const struct acrn_vm *vm);
/**
* @brief Dequeue an entry from per cpu ptdev softirq queue.
*
* Dequeue an entry from the ptdev softirq queue on the specific physcial cpu.
* Dequeue an entry from the ptdev softirq queue on the specific physical cpu.
*
* @param[in] pcpu_id physcial cpu id
* @param[in] pcpu_id physical cpu id
*
* @retval NULL when the queue is empty
* @retval !NULL when there is an available ptirq_remapping_info entry in the queue
@ -227,11 +227,11 @@ void ptirq_release_entry(struct ptirq_remapping_info *entry);
/**
* @brief Activate an irq for the associated passthrough device.
*
* After activating the ptirq entry, the physcial interrupt irq of passthrough device will be handled
* After activating the ptirq entry, the physical interrupt irq of passthrough device will be handled
* by the handler ptirq_interrupt_handler.
*
* @param[in] entry the ptirq_remapping_info entry that will be associated with the physcial irq.
* @param[in] phys_irq physcial interrupt irq for the entry
* @param[in] entry the ptirq_remapping_info entry that will be associated with the physical irq.
* @param[in] phys_irq physical interrupt irq for the entry
*
* @retval success when \p return value >=0
* @retval failure when \p return value <0
@ -252,7 +252,7 @@ void ptirq_deactivate_entry(struct ptirq_remapping_info *entry);
* @param[out] buffer the buffer that the interrupt information is stored to.
* @param[in] buffer_cnt the size of the buffer.
*
* @retval the actural size the buffer filled with the interrupt information
* @retval the actual size of the buffer filled with the interrupt information
*
*/
uint32_t ptirq_get_intr_data(const struct acrn_vm *target_vm, uint64_t *buffer, uint32_t buffer_cnt);