mirror of
https://github.com/projectacrn/acrn-hypervisor.git
synced 2025-06-19 20:22:46 +00:00
doc: update HLD VT-d
transcode, edit, and upload HLD 0.7 section 3.8 (VT-d) Tracked-on: #1643 Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
parent
e141150e4c
commit
7c192db1ba
@ -13,4 +13,5 @@ Hypervisor high-level design
|
||||
I/O Emulation <hv-io-emulation>
|
||||
Physical Interrupt <hv-interrupt>
|
||||
Timer <hv-timer>
|
||||
Virtual Interrupt <hv-virt-interrupt.rst>
|
||||
Virtual Interrupt <hv-virt-interrupt>
|
||||
VT-d <hv-vt-d>
|
||||
|
372
doc/developer-guides/hld/hv-vt-d.rst
Normal file
@ -0,0 +1,372 @@
|
||||
.. _vt-d-hld:
|
||||
|
||||
VT-d
|
||||
####
|
||||
|
||||
VT-d stands for Intel Virtualization Technology for Directed I/O. It provides
hardware capabilities to assign I/O devices to VMs and to extend the
protection and isolation properties of VMs for I/O operations.
|
||||
|
||||
VT-d provides the following main functions:
|
||||
|
||||
- **DMA remapping**: for supporting address translations for DMA from
  devices.

- **Interrupt remapping**: for supporting isolation and routing of
  interrupts from devices and external interrupt controllers to
  appropriate VMs.

- **Interrupt posting**: for supporting direct delivery of virtual
  interrupts from devices and external controllers to virtual
  processors.
|
||||
|
||||
The ACRN hypervisor supports DMA remapping to provide address translation
for PCI pass-through devices. It supports second-level translation,
which applies to requests-without-PASID. ACRN does not support
first-level or nested translation.
|
||||
|
||||
DMAR Engines Discovery
|
||||
**********************
|
||||
|
||||
DMA Remapping Report ACPI table
|
||||
===============================
|
||||
|
||||
For generic platforms, ACRN hypervisor retrieves DMAR information from
|
||||
the ACPI table, and parses the DMAR reporting structure to discover the
|
||||
number of DMA-remapping hardware units present in the platform as well as
|
||||
the devices under the scope of a remapping hardware unit, as shown in
|
||||
:numref:`dma-remap-report`:
|
||||
|
||||
.. figure:: images/vt-d-image90.png
   :align: center
   :name: dma-remap-report

   DMA Remapping Reporting Structure
|
||||
|
||||
Pre-parsed DMAR information
|
||||
===========================
|
||||
|
||||
For specific platforms, ACRN hypervisor uses pre-parsed DMA remapping
|
||||
reporting information directly to save time for hypervisor boot-up.
|
||||
|
||||
DMA remapping unit for integrated graphics device
|
||||
=================================================
|
||||
|
||||
Generally, there is a dedicated remapping hardware unit for the Intel
integrated graphics device. ACRN implements GVT-g for graphics, but
GVT-g is not compatible with VT-d, so the remapping hardware unit for the
graphics device is disabled on ACRN if GVT-g is enabled. If the graphics
device needs to be passed through to a VM, then the remapping hardware unit
must be enabled.
|
||||
|
||||
DMA Remapping
|
||||
*************
|
||||
|
||||
DMA remapping hardware is used to isolate device access to memory,
|
||||
enabling each device in the system to be assigned to a specific domain
|
||||
through a distinct set of paging structures.
|
||||
|
||||
Domains
|
||||
=======
|
||||
|
||||
A domain is abstractly defined as an isolated environment in the
|
||||
platform, to which a subset of the host physical memory is allocated.
|
||||
The memory resource of a domain is specified by the address translation
|
||||
tables.
|
||||
|
||||
Device to Domain Mapping Structure
|
||||
==================================
|
||||
|
||||
VT-d hardware uses root-table and context-tables to build the mapping
|
||||
between devices and domains as shown in :numref:`vt-d-mapping`.
|
||||
|
||||
.. figure:: images/vt-d-image44.png
   :align: center
   :name: vt-d-mapping

   Device to Domain Mapping structures
|
||||
|
||||
The root-table is 4-KByte in size and contains 256 root-entries to cover
|
||||
the PCI bus number space (0-255). Each root-entry contains a
|
||||
context-table pointer to reference the context-table for devices on the
|
||||
bus identified by the root-entry, if the present flag of the root-entry
|
||||
is set.
|
||||
|
||||
Each context-table contains 256 entries, with each entry corresponding
|
||||
to a PCI device function on the bus. For a PCI device, the device and
|
||||
function numbers (8-bits) are used to index into the context-table. Each
|
||||
context-entry contains a Second-level Page-table Pointer, which provides
|
||||
the host physical address of the address translation structure in system
|
||||
memory to be used for remapping requests-without-PASID processed through
|
||||
the context-entry.
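
The indexing described above can be sketched in C. This is a minimal
illustration, not ACRN code: the helper names (``make_devfun``,
``root_entry_offset``, ``source_id``) are hypothetical, and the 16-byte
root-entry size is only derived from a 4-KByte table holding 256 entries.

```c
#include <stdint.h>

#define ROOT_TABLE_SIZE   4096U  /* 4-KByte root-table */
#define ROOT_ENTRY_COUNT  256U   /* one root-entry per PCI bus (0-255) */
#define ROOT_ENTRY_SIZE   (ROOT_TABLE_SIZE / ROOT_ENTRY_COUNT)  /* 16 bytes */

/* Pack a 5-bit device number and a 3-bit function number into the
 * 8-bit value used to index the context-table. */
static inline uint8_t make_devfun(uint8_t dev, uint8_t func)
{
    return (uint8_t)(((dev & 0x1FU) << 3U) | (func & 0x07U));
}

/* Byte offset of the root-entry for a bus within the root-table. */
static inline uint32_t root_entry_offset(uint8_t bus)
{
    return (uint32_t)bus * ROOT_ENTRY_SIZE;
}

/* 16-bit identifier: bus in the upper byte, devfun in the lower byte. */
static inline uint16_t source_id(uint8_t bus, uint8_t devfun)
{
    return (uint16_t)(((uint16_t)bus << 8U) | devfun);
}
```

For example, device 2, function 0 on bus 0 gives a ``devfun`` of 0x10,
which selects context-entry 16 in the context-table referenced by
root-entry 0.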
|
||||
|
||||
For a given Bus, Device, and Function combination as shown in
|
||||
:numref:`bdf-passthru`, a pass-through device can be associated with
|
||||
address translation structures for a domain.
|
||||
|
||||
.. figure:: images/vt-d-image19.png
   :align: center
   :name: bdf-passthru

   BDF Format of Pass-through Device
|
||||
|
||||
Refer to the `VT-d spec`_ for more details on the device to domain
mapping structures.
|
||||
|
||||
.. _VT-d spec:
   https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf
|
||||
|
||||
Address Translation Structures
|
||||
==============================
|
||||
|
||||
On ACRN, the EPT table of a domain is used as the address translation
structure for the devices assigned to the domain, as shown in
:numref:`vt-d-DMA`.
|
||||
|
||||
.. figure:: images/vt-d-image40.png
   :align: center
   :name: vt-d-DMA

   DMA Remapping Diagram
|
||||
|
||||
When the device attempts to access system memory, the DMA
|
||||
remapping hardware intercepts the access, utilizes the EPT table of the
|
||||
domain to determine whether the access is allowed, and translates the DMA
|
||||
address according to the EPT table from guest physical address (GPA) to
|
||||
host physical address (HPA).
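
The check-then-translate behavior can be illustrated with a toy model. The
sketch below is an assumption-laden simplification, not how EPT is laid out:
it replaces the multi-level EPT with a flat, four-entry table of 4-KByte
pages, and every name in it (``toy_domain``, ``toy_dma_translate``) is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12U                          /* 4-KByte pages */
#define PAGE_MASK  ((1ULL << PAGE_SHIFT) - 1ULL)
#define TOY_PAGES  4U                           /* toy guest address space */

/* One mapping per guest page: present flag and host page base address. */
struct toy_entry {
    bool present;
    uint64_t hpa_page;
};

/* A domain is identified by its translation table. */
struct toy_domain {
    struct toy_entry map[TOY_PAGES];
};

/* Translate a DMA address (GPA) to an HPA. Returns false, leaving *hpa
 * untouched, when the domain has no mapping for the page, which models
 * a blocked DMA access. */
static inline bool toy_dma_translate(const struct toy_domain *d,
                                     uint64_t gpa, uint64_t *hpa)
{
    uint64_t gfn = gpa >> PAGE_SHIFT;

    if ((gfn >= TOY_PAGES) || !d->map[gfn].present) {
        return false;
    }
    *hpa = d->map[gfn].hpa_page | (gpa & PAGE_MASK);
    return true;
}
```

A device can therefore only reach host memory that its domain's table maps;
any other address fails translation.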
|
||||
|
||||
Domains and Memory Isolation
|
||||
============================
|
||||
|
||||
There are no DMA operations inside the hypervisor, so ACRN doesn’t
|
||||
create a domain for the hypervisor. No DMA operations from pass-through
|
||||
devices can access the hypervisor memory.
|
||||
|
||||
ACRN treats each virtual machine (VM) as a separate domain. For a VM,
there is an EPT table for the Normal World, and there may be an EPT table
for the Secure World. The Secure World can access the Normal World's
memory, but the Normal World cannot access the Secure World's memory.
|
||||
|
||||
VM0 domain
   The VM0 domain is created when the hypervisor creates VM0 for the
   Service OS.

   The IOMMU uses the Normal World EPT table of VM0 as the address
   translation structure for the devices in the VM0 domain. This EPT
   table doesn’t include the memory of the hypervisor or of any Secure
   World, so the devices in the VM0 domain can’t access memory
   belonging to the hypervisor or to a Secure World.
|
||||
|
||||
Other domains
   Other VM domains are created when the hypervisor creates a User OS,
   one domain for each User OS.

   The IOMMU uses the Normal World EPT table of a VM as the address
   translation structure for the devices in the domain. This EPT table
   only allows devices to access the memory allocated for the Normal
   World of the VM.
|
||||
|
||||
Page-walk coherency
|
||||
===================
|
||||
|
||||
For VT-d hardware that doesn’t support page-walk coherency, the
hypervisor must make sure that updates to the following VT-d tables are
synced to memory:

- Device to Domain Mapping Structures, including Root-entries and
  Context-entries

- EPT table of a VM.
|
||||
|
||||
ACRN flushes the related cache lines after updating these structures
if the VT-d hardware doesn’t support page-walk coherency.
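
A minimal sketch of this conditional flush follows. It rests on several
assumptions that are not stated in this document: that page-walk coherency is
reported in bit 0 of the VT-d Extended Capability register, that cache lines
are 64 bytes, and that the per-line flush is supplied by the caller (for
example, a wrapper around ``clflush``). The names ``needs_cache_flush`` and
``flush_updated_range`` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64U          /* assumed cache-line size */
#define ECAP_PWC        (1ULL << 0)  /* assumed page-walk coherency bit */

/* Caller-supplied routine that flushes one cache line. */
typedef void (*flush_line_fn)(uintptr_t line_addr);

/* A flush is needed only when the hardware does not snoop CPU caches
 * while walking the remapping structures. */
static inline int needs_cache_flush(uint64_t ecap)
{
    return (ecap & ECAP_PWC) == 0ULL;
}

/* Flush every cache line covering [addr, addr + size) if required.
 * Returns the number of lines visited (0 when no flush is needed). */
static inline size_t flush_updated_range(uint64_t ecap, uintptr_t addr,
                                         size_t size, flush_line_fn flush)
{
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1U);
    size_t lines = 0U;
    uintptr_t line;
    uintptr_t last;

    if (!needs_cache_flush(ecap) || (size == 0U)) {
        return 0U;
    }
    line = addr & mask;
    last = (addr + size - 1U) & mask;
    for (;;) {
        if (flush != NULL) {
            flush(line);
        }
        lines++;
        if (line == last) {
            break;
        }
        line += CACHE_LINE_SIZE;
    }
    return lines;
}
```

Under these assumptions, updating a 16-byte entry that straddles a cache-line
boundary visits two lines, and nothing is flushed on coherent hardware.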
|
||||
|
||||
Super-page support
|
||||
==================
|
||||
|
||||
Because ACRN VT-d reuses the EPT table as the address translation table,
the VT-d super-page capability should match the super-page usage of the
EPT table.
|
||||
|
||||
Snoop control
|
||||
=============
|
||||
|
||||
If VT-d hardware supports snoop control, VT-d can be set to ignore the
“no-snoop attribute” in PCIe transactions.
|
||||
|
||||
The following table shows the snoop behavior of a DMA operation, as
controlled by the combination of:

- Snoop Control capability of the VT-d DMAR unit
- The setting of the SNP field in the leaf PTE
- No-snoop attribute in the PCIe request
|
||||
|
||||
.. list-table::
   :widths: 25 25 25 25
   :header-rows: 1

   * - SC cap of VT-d
     - SNP field in leaf PTE
     - No-snoop attribute in request
     - Snoop behavior

   * - 0
     - 0 (must be 0)
     - no snoop
     - No snoop

   * - 0
     - 0 (must be 0)
     - snoop
     - Snoop

   * - 1
     - 1
     - snoop / no snoop
     - Snoop

   * - 1
     - 0
     - no snoop
     - No snoop

   * - 1
     - 0
     - snoop
     - Snoop
|
||||
|
||||
ACRN enables Snoop Control by default if all enabled VT-d DMAR units
support Snoop Control, by setting bit 11 of the leaf PTEs of the EPT
table. Bit 11 of an EPT leaf PTE is ignored by the MMU, so there is no
side effect for the MMU.
|
||||
|
||||
If any of the enabled VT-d DMAR units doesn’t support Snoop Control,
then bit 11 of the EPT leaf PTEs is not set, since the field is treated
as reserved (0) by VT-d hardware implementations that don’t support
Snoop Control.
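
The table and the bit 11 policy above can be expressed as two small C
helpers. This is only a sketch of the rules as stated in this section; the
names ``dma_is_snooped``, ``apply_snoop_control``, and ``EPT_SNOOP_CTRL`` are
hypothetical and are not taken from the ACRN sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPT_SNOOP_CTRL (1ULL << 11)  /* SNP field: bit 11 of a leaf PTE */

/* Snoop behavior of a DMA request, following the table above.
 * sc_cap:        DMAR unit supports Snoop Control
 * snp:           SNP field of the leaf PTE (must be 0 if !sc_cap)
 * no_snoop_attr: request carries the no-snoop attribute */
static inline bool dma_is_snooped(bool sc_cap, bool snp, bool no_snoop_attr)
{
    if (sc_cap && snp) {
        return true;        /* SNP overrides the no-snoop attribute */
    }
    return !no_snoop_attr;  /* otherwise the request decides */
}

/* Set bit 11 only when every enabled DMAR unit supports Snoop Control;
 * otherwise keep it clear because the hardware treats it as reserved. */
static inline uint64_t apply_snoop_control(uint64_t leaf_pte,
                                           bool all_units_support_sc)
{
    if (all_units_support_sc) {
        return leaf_pte | EPT_SNOOP_CTRL;
    }
    return leaf_pte & ~EPT_SNOOP_CTRL;
}
```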
|
||||
|
||||
Initialization
|
||||
**************
|
||||
|
||||
During initialization, the hypervisor registers the DMAR units on the
platform according to the pre-parsed information or the DMAR table. There
may be multiple DMAR units on the platform, and ACRN allows some of them
to be ignored. DMAR units marked as ignored are not enabled.
|
||||
|
||||
When creating VM0 as the Service OS, the hypervisor creates the VM0
domain, using the Normal World EPT table of VM0 as the address
translation table. All PCI devices on the platform are added to the VM0
domain. The hypervisor then enables DMAR translation for the DMAR
unit(s) that are not marked as ignored.
|
||||
|
||||
Device assignment
|
||||
*****************
|
||||
|
||||
All devices are initially added to the VM0 domain.
To assign a device means to assign the device to a User OS. The device
is removed from the VM0 domain and added to the VM domain of the User
OS, which changes the address translation table for the device from the
EPT of VM0 to the EPT of the User OS.
|
||||
|
||||
To un-assign a device means to un-assign the device from a User OS. The
device is removed from the VM domain of the User OS, then added
back to the VM0 domain, which changes the address translation table for
the device from the EPT of the User OS to the EPT of VM0.
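
The ownership rules above can be modeled with a small table keyed by bus and
devfun. The following is a toy model under stated assumptions, not the ACRN
implementation: domains are reduced to numeric IDs, ID 0 stands for the VM0
domain, and the ``toy_`` functions are hypothetical stand-ins that do not
program any hardware.

```c
#include <stdint.h>

#define TOY_VM0_DOMAIN 0U  /* hypothetical ID for the VM0 domain */

/* Owning domain of each device; zero-initialized, so every device
 * starts in the VM0 domain. */
static uint16_t toy_owner_table[256][256];

static inline uint16_t toy_owner(uint8_t bus, uint8_t devfun)
{
    return toy_owner_table[bus][devfun];
}

/* Move a device from the VM0 domain to a User OS domain.
 * Fails if the device is not currently owned by VM0. */
static inline int toy_assign(uint16_t domain, uint8_t bus, uint8_t devfun)
{
    if ((domain == TOY_VM0_DOMAIN) ||
        (toy_owner_table[bus][devfun] != TOY_VM0_DOMAIN)) {
        return -1;
    }
    toy_owner_table[bus][devfun] = domain;
    return 0;
}

/* Return a device from a User OS domain to the VM0 domain.
 * Fails if the device is not owned by the given domain. */
static inline int toy_unassign(uint16_t domain, uint8_t bus, uint8_t devfun)
{
    if ((domain == TOY_VM0_DOMAIN) ||
        (toy_owner_table[bus][devfun] != domain)) {
        return -1;
    }
    toy_owner_table[bus][devfun] = TOY_VM0_DOMAIN;
    return 0;
}
```

In this model a device always belongs to exactly one domain, so it is always
covered by exactly one translation table.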
|
||||
|
||||
Power Management support for S3
|
||||
*******************************
|
||||
|
||||
During platform S3 suspend and resume, the VT-d register values are
lost. ACRN VT-d provides APIs to be called during S3 suspend and resume.

During S3 suspend, some register values are saved in memory, and
DMAR translation is disabled. During S3 resume, the saved register
values are restored, the root table address register is set, and DMAR
translation is enabled.
|
||||
|
||||
All the operations for S3 suspend and resume are performed on all DMAR
|
||||
units on the platform, except for the DMAR units marked ignored.
|
||||
|
||||
Error Handling
|
||||
**************
|
||||
|
||||
ACRN VT-d supports DMA remapping error reporting. ACRN VT-d requests an
IRQ / vector for DMAR error reporting and registers a DMAR fault handler
for the IRQ. A DMAR unit can report fault events via MSI: when a fault
event occurs, an MSI is generated, and the DMAR fault handler is called
to report the error event.
|
||||
|
||||
Data structures and interfaces
|
||||
******************************
|
||||
|
||||
.. note:: Needs API reference to include/arch/x86/vtd.h
|
||||
|
||||
Initialization and deinitialization
|
||||
===================================
|
||||
|
||||
The following APIs are provided during initialization and
|
||||
deinitialization:
|
||||
|
||||
- void init_iommu(void)
|
||||
|
||||
Register DMAR units on the platform according to the pre-parsed
information or DMAR table.
|
||||
|
||||
- void init_iommu_vm0_domain(struct vm \*vm0)
|
||||
|
||||
Create VM0 domain using the Normal World’s EPT table of VM0 as address
|
||||
translation table. Add all PCI devices on the platform to VM0 domain. Then enable
|
||||
DMAR translation.
|
||||
|
||||
- void destroy_iommu_domain(struct iommu_domain \*domain)
|
||||
|
||||
Destroy the iommu domain.
|
||||
|
||||
VT-d
|
||||
====
|
||||
|
||||
The following APIs are provided during runtime:
|
||||
|
||||
- void suspend_iommu(void)
|
||||
|
||||
Suspend IOMMU.
|
||||
|
||||
- void resume_iommu(void)
|
||||
|
||||
Resume IOMMU.
|
||||
|
||||
- struct iommu_domain \*create_iommu_domain(uint16_t vm_id,
|
||||
uint64_t translation_table, uint32_t addr_width)
|
||||
|
||||
Create an iommu domain for the VM specified by vm_id.
translation_table must be the physical address of the EPT table of the VM
specified by vm_id; the value cannot be NULL.
Return the iommu_domain created for the VM, or NULL on error.
|
||||
|
||||
- int assign_iommu_device(struct iommu_domain \*domain, uint8_t
|
||||
bus, uint8_t devfun)
|
||||
|
||||
Assign a device specified by bus & devfun to an iommu domain. The device
is removed from the VM0 domain and added to the specified domain.

domain: specifies the domain the device should be assigned to.

bus: the 8-bit bus value of the pass-through device.

devfun: the 8-bit device function value of the pass-through device.

return: 0 on success, another value on error.
|
||||
|
||||
- int unassign_iommu_device(struct iommu_domain \*domain,
uint8_t bus, uint8_t devfun)
|
||||
|
||||
Unassign a device specified by bus & devfun from an iommu domain.

domain: specifies the domain the device should be removed from.

bus: the 8-bit bus value of the pass-through device.

devfun: the 8-bit device function value of the pass-through device.

return: 0 on success, another value on error.
|
||||
|
BIN
doc/developer-guides/hld/images/vt-d-image19.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 1.9 KiB |
BIN
doc/developer-guides/hld/images/vt-d-image40.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 7.8 KiB |
BIN
doc/developer-guides/hld/images/vt-d-image44.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 48 KiB |
BIN
doc/developer-guides/hld/images/vt-d-image90.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 7.5 KiB |