Doc: Edits to MBA and CAT documentation.
Signed-off-by: Deb Taylor <deb.taylor@intel.com>
@@ -64,7 +64,7 @@ for its RT VM:
- LAPIC pass-thru
- Polling mode driver
- ART (always running timer)
- other TCC features like split lock detection, pseudo-locking for cache

Hardware Requirements

@@ -112,7 +112,7 @@ provides I/O mediation to VMs. Some of the PCIe devices function as a
pass-through mode to User VMs according to VM configuration. In addition,
the Service VM could run the IC applications and HV helper applications such
as the Device Model, VM manager, etc., where the VM manager is responsible
for VM start/stop/pause, virtual CPU pause/resume, etc.

.. figure:: images/over-image34.png
   :align: center

@@ -130,7 +130,7 @@ and Real-Time (RT) VM.
compared to ACRN 1.0 is that:

- a pre-launched VM is supported in ACRN 2.0, with isolated resources, including
  CPU, memory, and HW devices, etc.

- ACRN 2.0 adds a few necessary device emulations in the hypervisor, like vPCI
  and vUART, to avoid interference between different VMs

@@ -236,7 +236,7 @@ Hypervisor

ACRN takes advantage of Intel Virtualization Technology (Intel VT).
The ACRN HV runs in Virtual Machine Extension (VMX) root operation,
host mode, or VMM mode, while the Service and User VM guests run
in VMX non-root operation, or guest mode. (We'll use "root mode"
and "non-root mode" for simplicity).

@@ -266,7 +266,7 @@ used by commercial OS).
managing physical resources at runtime. Examples include handling
physical interrupts and low power state changes.

- A layer sitting on top of hardware management enables virtual
  CPUs (or vCPUs), leveraging Intel VT. A vCPU loop runs a vCPU in
  non-root mode and handles VM exit events triggered by the vCPU.
  This layer handles CPU and memory-related VM

@@ -365,7 +365,7 @@ User VM

Currently, ACRN can boot Linux and Android guest OSes. For an Android guest OS, ACRN
provides a VM environment with two worlds: normal world and trusty
world. The Android OS runs in the normal world. The trusty OS and
security-sensitive applications run in the trusty world. The trusty
world can see the memory of the normal world, but the normal world cannot see
the trusty world.

@@ -436,10 +436,10 @@ to boot Linux or Android guest OS.

The vSBL image is released as a part of the Service OS root
filesystem (rootfs). The vSBL is copied to the User VM memory by the VM manager
in the Service VM while creating the virtual BSP of the User VM. The Service VM passes the
start of vSBL and related information to HV. HV sets the guest RIP of the User VM's
virtual BSP as the start of vSBL and related guest registers, and
launches the User VM virtual BSP. The vSBL starts running in virtual
real mode within the User VM. Conceptually, vSBL is part of the User VM runtime.

In the current design, the vSBL supports booting Android guest OS or

@@ -458,8 +458,8 @@ the EFI boot of the User VM on the ACRN hypervisor platform.
The OVMF is copied to the User VM memory by the VM manager in the Service VM while creating
the virtual BSP of the User VM. The Service VM passes the start of OVMF and related
information to HV. HV sets the guest RIP of the User VM virtual BSP as the start of OVMF
and related guest registers, and launches the User VM virtual BSP. The OVMF starts
running in virtual real mode within the User VM. Conceptually, OVMF is part of the User VM runtime.

Freedom From Interference
*************************

@@ -495,7 +495,7 @@ the following mechanisms:

2. The User VM cannot access the memory of the Service VM and the hypervisor

3. The hypervisor does not unintentionally access the memory of the Service or User VM.

- Destinations of external interrupts are set to the physical core
  where the VM that handles them is running.

@@ -94,7 +94,7 @@ CPU management in the Service VM under flexing CPU sharing
==========================================================

As all Service VM CPUs can be shared with different UOSs, ACRN can still pass-thru
the MADT to the Service VM, and the Service VM is still able to see all physical CPUs.

But under CPU sharing, the Service VM does not need to offline/release the physical
CPUs intended for UOS use.

@@ -102,8 +102,8 @@ CPUs intended for UOS use.
CPU management in UOS
=====================

From the UOS point of view, CPU management is very simple: when the DM makes
a hypercall to create a VM, the hypervisor creates all of its virtual CPUs
based on the configuration in this UOS VM's ``vm config``, as sketched below.

As mentioned in the previous description, ``vcpu_affinity`` in ``vm config``

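A minimal sketch of such a configuration entry follows; the structure and
field names are illustrative assumptions in the spirit of ACRN's
``vm config``, not the exact definitions in the sources.

.. code-block:: c

   /* Hypothetical UOS vm config entry: two vCPUs, pinned by affinity. */
   #define AFFINITY_CPU(n)   (1UL << (n))

   struct vm_config_sketch {
       unsigned int vcpu_num;          /* vCPUs the hypervisor creates */
       unsigned long vcpu_affinity[8]; /* one pCPU bitmap per vCPU */
   };

   static struct vm_config_sketch uos_vm_config = {
       .vcpu_num      = 2U,
       .vcpu_affinity = { AFFINITY_CPU(2), AFFINITY_CPU(3) },
   };
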
@@ -150,7 +150,7 @@ the major states are:
- **VCPU_ZOMBIE**: vCPU is being offlined, and its vCPU thread is not
  running on its associated CPU

- **VCPU_OFFLINE**: vCPU is offline

.. figure:: images/hld-image17.png
   :align: center

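The state machine above can be summarized in a small enum; this is an
illustrative sketch, and the first two states are assumptions inferred from
the surrounding text rather than quotes from the sources.

.. code-block:: c

   enum vcpu_state_sketch {
       VCPU_INIT,     /* created but not yet launched (assumed) */
       VCPU_RUNNING,  /* vCPU thread running on its pCPU (assumed) */
       VCPU_ZOMBIE,   /* being offlined; thread no longer runs */
       VCPU_OFFLINE,  /* fully offline */
   };
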
@@ -3,8 +3,8 @@
Partition mode
##############

ACRN is a type-1 hypervisor that supports running multiple guest operating
systems (OS). Typically, the platform BIOS/bootloader boots ACRN, and
ACRN loads one or more guest OSes. Refer to :ref:`hv-startup` for
details on the start-up flow of the ACRN hypervisor.

@@ -21,12 +21,12 @@ Introduction

In partition mode, ACRN provides guests with exclusive access to cores,
memory, cache, and peripheral devices. Partition mode enables developers
to dedicate resources exclusively among the guests. However, there is no
support today in x86 hardware or in ACRN to partition resources such as
peripheral buses (e.g., PCI). On x86 platforms that support Cache
Allocation Technology (CAT) and Memory Bandwidth Allocation (MBA), resources
such as cache and memory bandwidth can be used by developers to partition
L2, Last Level Cache (LLC), and memory bandwidth among the guests. Refer to
:ref:`hv_rdt` for more details on the ACRN RDT high-level design and
:ref:`rdt_configuration` for RDT configuration.

@@ -34,7 +34,7 @@ L2, Last Level Cache (LLC) and memory bandwidth among the guests. Refer to
ACRN expects static partitioning of resources either by code
modification for guest configuration or through compile-time config
options. All the devices exposed to the guests are either physical
resources or are emulated in the hypervisor. So, there is no need for a
device-model and Service OS. :numref:`pmode2vms` shows a partition mode
example of two VMs with exclusive access to physical resources.

@@ -47,7 +47,7 @@ example of two VMs with exclusive access to physical resources.
Guest info
**********

ACRN uses multi-boot info passed from the platform bootloader to know
the location of each guest kernel in memory. ACRN creates a copy of each
guest kernel into each of the guests' memory. The current implementation of
ACRN requires developers to specify kernel parameters for the guests as

@@ -64,7 +64,7 @@ Cores
=====

ACRN requires the developer to specify the number of guests and the
cores dedicated for each guest. Also, the developer needs to specify
the physical core used as the Boot Strap Processor (BSP) for each guest. As
the processors are brought to life in the hypervisor, it checks if they are
configured as the BSP for any of the guests. If a processor is the BSP of any of

@@ -90,7 +90,7 @@ for assigning host memory to the guests:
1) Sum of the guest PCI hole and guest "System RAM" is less than 4GB.

2) Pick the starting address in the host physical address space and the
   size so that it does not overlap with any reserved regions in
   the host E820.

ACRN creates EPT mapping for the guest between GPA (0, memory size) and

@@ -127,7 +127,7 @@ Platform info - mptable
=======================

ACRN, in partition mode, uses mptable to convey platform info to each
guest. Using this platform information (the number of cores used by each
guest and whether the guest needs devices with INTx), ACRN builds the
mptable and copies it to the guest memory. In partition mode, ACRN
passes physical APIC IDs to the guests.

@@ -137,7 +137,7 @@ I/O - Virtual devices

Port I/O is supported for PCI device config space 0xcfc and 0xcf8, vUART
0x3f8, vRTC 0x70 and 0x71, and vPIC ranges 0x20/21, 0xa0/a1, and
0x4d0/4d1. MMIO is supported for vIOAPIC. ACRN exposes a virtual
host-bridge at BDF (Bus Device Function) 0.0:0 to each guest. Access to
256 bytes of config space for the virtual host bridge is emulated.

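For reference, the legacy configuration mechanism behind ports 0xcf8/0xcfc
works as sketched below; this is an illustrative helper, not code from the
ACRN sources.

.. code-block:: c

   #include <stdint.h>

   #define PCI_CONFIG_ADDR 0xcf8U   /* address port, trapped by ACRN */
   #define PCI_CONFIG_DATA 0xcfcU   /* data port, trapped by ACRN */

   /* Build the dword a guest writes to 0xcf8: enable bit 31, then
    * bus/device/function and the aligned register offset. A following
    * read of 0xcfc returns the (emulated) config space register. */
   static inline uint32_t pci_cfg_addr(uint8_t bus, uint8_t dev,
                                       uint8_t func, uint8_t reg)
   {
       return 0x80000000U | ((uint32_t)bus << 16) |
              ((uint32_t)dev << 11) | ((uint32_t)func << 8) |
              ((uint32_t)reg & 0xfcU);
   }
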
@@ -150,11 +150,11 @@ the virtual host bridge. ACRN does not support either passing thru
bridges or emulating virtual bridges. Pass-thru devices should be
statically allocated to each guest using the guest configuration. ACRN
expects the developer to provide the mapping of virtual BDF to the BDF of the
physical device for all the pass-thru devices as part of each guest
configuration.

Runtime ACRN support for guests
*******************************

ACRN, in partition mode, supports an option to pass-thru the LAPIC of the
physical CPUs to the guest. ACRN expects developers to specify if the

@@ -185,20 +185,20 @@ Guests w/o LAPIC pass-thru
--------------------------

For guests without LAPIC pass-thru, IPIs between guest CPUs are handled in
the same way as sharing mode in ACRN. Refer to :ref:`virtual-interrupt-hld`
for more details.

Guests w/ LAPIC pass-thru
-------------------------

ACRN supports pass-thru if and only if the guest is using x2APIC mode
for the vLAPIC. In LAPIC pass-thru mode, writes to the Interrupt Command
Register (ICR) x2APIC MSR are intercepted. The guest writes the IPI info,
including the vector and destination APIC IDs, to the ICR. Upon an IPI request
from the guest, ACRN does a sanity check on the destination processors
programmed into the ICR. If the destination is a valid target for the guest,
ACRN sends an IPI with the same vector from the ICR to the physical CPUs
corresponding to the destination processor info in the ICR.

.. figure:: images/partition-image14.png
   :align: center

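The intercept path just described can be pictured with a short sketch;
``vm_is_valid_dest``, ``vcpu_to_pcpu``, and ``send_phys_ipi`` are invented
helper names for illustration only.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define MSR_IA32_EXT_APIC_ICR 0x830U  /* x2APIC ICR MSR */

   struct acrn_vm;  /* opaque in this sketch */
   extern bool vm_is_valid_dest(struct acrn_vm *vm, uint32_t dest);
   extern uint16_t vcpu_to_pcpu(struct acrn_vm *vm, uint32_t dest);
   extern void send_phys_ipi(uint16_t pcpu, uint32_t vector);

   /* Handle a trapped guest WRMSR to the x2APIC ICR. */
   static int handle_icr_write(struct acrn_vm *vm, uint64_t val)
   {
       uint32_t vector = (uint32_t)(val & 0xffU);
       uint32_t dest   = (uint32_t)(val >> 32);  /* destination APIC ID */

       /* Sanity check: the destination must belong to this guest. */
       if (!vm_is_valid_dest(vm, dest))
           return -1;

       /* Forward the IPI, same vector, to the backing physical CPU. */
       send_phys_ipi(vcpu_to_pcpu(vm, dest), vector);
       return 0;
   }
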
@@ -217,7 +217,7 @@ Address registers (BAR), offsets starting from 0x10H to 0x24H, provide
the information about the resources (I/O and MMIO) used by the PCI
device. ACRN virtualizes the BAR registers and, for the rest of the
config space, forwards reads and writes to the physical config space of
pass-thru devices. Refer to the `I/O`_ section below for more details.

.. figure:: images/partition-image1.png
   :align: center

@@ -237,14 +237,14 @@ I/O

ACRN supports I/O for pass-thru devices with two restrictions.

1) Supports only MMIO. Thus, this requires developers to expose I/O BARs as
   not present in the guest configuration.

2) Supports only 32-bit MMIO BAR type.

As the guest PCI sub-system scans the PCI bus and assigns a Guest Physical
Address (GPA) to the MMIO BAR, ACRN maps the GPA to the address in the
physical BAR of the pass-thru device using EPT. The following timeline chart
explains how PCI devices are assigned to the guest and BARs are mapped upon
guest initialization.

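A hedged sketch of that mapping step follows; ``ept_map_mr`` and the
attribute macro stand in for the hypervisor's real EPT interfaces.

.. code-block:: c

   #include <stdint.h>

   struct acrn_vm;  /* opaque in this sketch */
   extern void ept_map_mr(struct acrn_vm *vm, uint64_t gpa, uint64_t hpa,
                          uint64_t size, uint32_t prot);

   #define PROT_RW_UNCACHED 0x3U  /* illustrative attribute bits */

   /* When the guest programs a 32-bit MMIO BAR with a GPA, map that GPA
    * onto the host physical address held in the physical BAR via EPT. */
   static void remap_passthru_bar(struct acrn_vm *vm, uint32_t guest_bar,
                                  uint32_t phys_bar, uint32_t size)
   {
       uint64_t gpa = (uint64_t)guest_bar & ~0xfUL;  /* strip flag bits */
       uint64_t hpa = (uint64_t)phys_bar & ~0xfUL;

       ept_map_mr(vm, gpa, hpa, size, PROT_RW_UNCACHED);
   }
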
@@ -265,7 +265,7 @@ ACRN expects developers to identify the interrupt line info (0x3CH) from
the physical BAR of the pass-thru device and build an interrupt entry in
the mptable for the corresponding guest. As the guest configures the vIOAPIC
for the interrupt RTE, ACRN writes the info from the guest RTE into the
physical IOAPIC RTE. Upon the guest kernel request to mask the interrupt,
ACRN writes to the physical RTE to mask the interrupt at the physical
IOAPIC. When the guest masks the RTE in the vIOAPIC, ACRN masks the interrupt
RTE in the physical IOAPIC. Level-triggered interrupts are not

@@ -275,9 +275,9 @@ MSI support
~~~~~~~~~~~

The guest reads/writes the PCI configuration space to configure MSI
interrupts using an address. Data and control registers are pass-thru to
the physical BAR of the pass-thru device. Refer to `Configuration
space access`_ for details on how the PCI configuration space is emulated.

Virtual device support
======================

@@ -328,7 +328,7 @@ Hypervisor IPIs work the same way as in sharing mode.
Guests w/ LAPIC pass-thru
-------------------------

Since external interrupts are pass-thru to the guest IDT, IPIs do not
trigger a vmexit. ACRN uses NMI delivery mode, and NMI exiting is
chosen for vCPUs. At the time of an NMI interrupt on the target processor,
if the processor is in non-root mode, a vmexit happens on the processor

@@ -341,8 +341,8 @@ For details on how hypervisor console works, refer to
:ref:`hv-console`.

For a guest console in partition mode, ACRN provides an option to pass
``vmid`` as an argument to ``vm_console``. The ``vmid`` is the same as the one
developers use in the guest configuration.

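For example, assuming a guest that was configured with vmid 1, its console
would be reached from the hypervisor shell as in this hypothetical session:

.. code-block:: none

   ACRN:\>vm_console 1
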
Guests w/o LAPIC pass-thru
--------------------------

@@ -352,18 +352,18 @@ Works the same way as sharing mode.

Hypervisor Console
==================

ACRN uses the TSC deadline timer to provide a timer service. The hypervisor
console uses a timer on CPU0 to poll characters on the serial device. To
support LAPIC pass-thru, the TSC deadline MSR is pass-thru and the local
timer interrupt is also delivered to the guest IDT. Instead of the TSC
deadline timer, ACRN uses the VMX preemption timer to poll the serial device.

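A minimal sketch of that substitution is shown below; the VMCS field
encoding follows the SDM, while ``exec_vmwrite32`` stands in for the
hypervisor's VMCS write accessor.

.. code-block:: c

   #include <stdint.h>

   /* "VMX-preemption timer value" guest-state VMCS field (SDM encoding). */
   #define VMX_PREEMPTION_TIMER_VALUE 0x482eU

   extern void exec_vmwrite32(uint32_t field, uint32_t value);

   /* Arm the preemption timer so a VM exit fires after roughly `ticks`
    * (TSC units scaled per IA32_VMX_MISC), letting the hypervisor poll
    * the serial device without owning the TSC deadline timer. */
   static void arm_console_poll_timer(uint32_t ticks)
   {
       exec_vmwrite32(VMX_PREEMPTION_TIMER_VALUE, ticks);
   }
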
Guest Console
=============

ACRN exposes a vUART to partition mode guests. The vUART uses the vPIC to
inject interrupts into the guest BSP. If the guest has more than one core,
during runtime the vUART might need to inject an interrupt into the guest BSP
from another core (other than the BSP). As mentioned in section <Hypervisor IPI
service>, ACRN uses NMI delivery mode for notifying the CPU running the BSP
of the guest.

@@ -3,23 +3,44 @@
RDT Allocation Feature Supported by Hypervisor
##############################################

The ACRN hypervisor uses RDT (Resource Director Technology) allocation features
such as CAT (Cache Allocation Technology) and MBA (Memory Bandwidth
Allocation) to control VMs which may be over-utilizing cache resources or
memory bandwidth relative to their priorities. By setting limits on critical
resources, ACRN can optimize RTVM performance over regular VMs. In ACRN,
CAT and MBA are configured via the "VM-Configuration". The resources
allocated for VMs are determined in the VM configuration (:ref:`rdt_vm_configuration`).

For further details on Intel RDT, refer to `Intel 64 and IA-32 Architectures Software Developer's Manual, (Section 17.19 Intel Resource Director Technology Allocation Features) <https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-3a-3b-3c-and-3d-system-programming-guide>`_.


Objective of CAT
****************
The CAT feature in the hypervisor can isolate the cache for a VM from other
VMs. It can also isolate cache usage between VMX root and non-root
modes. Generally, certain cache resources are allocated for the
RT VMs in order to reduce performance interference through the shared
cache access from the neighbor VMs.

The figure below shows that with CAT, the cache ways can be isolated vs
the default, where high priority VMs can be impacted by a noisy neighbor.

.. figure:: images/cat-objective.png
   :align: center

CAT Support in ACRN
===================
On x86 platforms that support CAT, the ACRN hypervisor automatically enables
support and by default shares the cache ways equally between all VMs.
This is done by setting the max cache mask in the MSR_IA32_type_MASK_n (where
type: L2 or L3) MSR that corresponds to each CLOS and then setting the
IA32_PQR_ASSOC MSR to CLOS 0. (Note that CLOS, or Class of Service, is the
ID used to select a resource allocation.) The user can check the cache
capabilities such as the cache mask and max supported CLOS as described in
:ref:`rdt_detection_capabilities` and then program the IA32_type_MASK_n and
IA32_PQR_ASSOC MSR with a CLOS ID, to select a cache mask to take effect.
ACRN uses VMCS MSR loads on every VM Entry/VM Exit for non-root and root
modes to enforce the settings.

.. code-block:: none
   :emphasize-lines: 3,7,11,15

@@ -63,14 +84,26 @@ On x86 platforms that support CAT, ACRN hypervisor automatically enables the sup
};

.. note::
   ACRN takes the lowest common CLOS max value between the supported
   resources and sets MAX_PLATFORM_CLOS_NUM. For example, if the max CLOS
   supported by L3 is 16 and by L2 is 8, ACRN programs MAX_PLATFORM_CLOS_NUM
   to 8. ACRN recommends consistent capabilities across all RDT
   resources by using the common subset CLOS. This is done in order to
   minimize misconfiguration errors.

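To make the MSR sequence concrete, here is a hedged sketch; ``msr_write``
and the wrapper functions are assumptions, while the MSR numbers follow
the SDM (IA32_L3_MASK_0 starts at 0xc90, IA32_PQR_ASSOC is 0xc8f).

.. code-block:: c

   #include <stdint.h>

   #define MSR_IA32_PQR_ASSOC    0xc8fU
   #define MSR_IA32_L3_MASK_BASE 0xc90U  /* IA32_L3_MASK_0 */

   extern void msr_write(uint32_t msr, uint64_t val);

   /* Give a CLOS its L3 capacity bitmask (one bit per cache way). */
   static void cat_set_l3_mask(uint16_t clos, uint32_t ways_mask)
   {
       msr_write(MSR_IA32_L3_MASK_BASE + clos, ways_mask);
   }

   /* Run the current core under `clos`; the CLOS ID lives in bits 63:32
    * of IA32_PQR_ASSOC. ACRN's default is CLOS 0 with the full mask. */
   static void cat_assoc_clos(uint16_t clos)
   {
       msr_write(MSR_IA32_PQR_ASSOC, (uint64_t)clos << 32);
   }
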

Objective of MBA
****************
The Memory Bandwidth Allocation (MBA) feature provides indirect and
approximate control over the memory bandwidth available per core. It
provides a method to control VMs which may be over-utilizing bandwidth
relative to their priorities and thus improves the performance of high
priority VMs. MBA introduces a programmable request rate controller (PRRC)
between the cores and the high-speed interconnect. Throttling values can be
programmed via MSRs to the PRRC to limit bandwidth availability.

The following figure shows the memory bandwidth impact without MBA, which
causes bottlenecks for high priority VMs, vs. with MBA support:

.. figure:: images/no_mba_objective.png
   :align: center

@@ -87,7 +120,16 @@ The following figure shows memory bandwidth impact without MBA which cause bottl

MBA Support in ACRN
===================
On x86 platforms that support MBA, the ACRN hypervisor automatically enables
support and by default sets no limits on the memory bandwidth access by VMs.
This is done by setting a 0 mba delay value in the MSR_IA32_MBA_MASK_n MSR
that corresponds to each CLOS and then setting the IA32_PQR_ASSOC MSR to CLOS
0. To select a delay to take effect for restricting memory bandwidth,
users can check the MBA capabilities such as the mba delay values and
max supported CLOS as described in :ref:`rdt_detection_capabilities` and
then program the IA32_MBA_MASK_n and IA32_PQR_ASSOC MSR with the CLOS ID.
ACRN uses VMCS MSR loads on every VM Entry/VM Exit for non-root and root
modes to enforce the settings.

.. code-block:: none
   :emphasize-lines: 3,7,11,15

@@ -131,7 +173,12 @@ On x86 platforms that support MBA, ACRN hypervisor automatically enables the sup
};

.. note::
   ACRN takes the lowest common CLOS max value between the supported
   resources and sets MAX_PLATFORM_CLOS_NUM. For example, if the max CLOS
   supported by L3 is 16 and by MBA is 8, ACRN programs MAX_PLATFORM_CLOS_NUM
   to 8. ACRN recommends consistent capabilities across all RDT
   resources by using a common subset CLOS. This is done in order to minimize
   misconfiguration errors.

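A matching hedged sketch for the MBA step; the wrapper is an assumption,
and the MSR base follows the SDM (IA32_L2_QoS_Ext_BW_Thrtl_0 at 0xd50).

.. code-block:: c

   #include <stdint.h>

   #define MSR_IA32_MBA_MASK_BASE 0xd50U  /* IA32_L2_QoS_Ext_BW_Thrtl_0 */

   extern void msr_write(uint32_t msr, uint64_t val);

   /* Program the throttling (delay) value for a CLOS; a value of 0
    * means no limit, which is ACRN's boot-time default. */
   static void mba_set_delay(uint16_t clos, uint32_t delay)
   {
       msr_write(MSR_IA32_MBA_MASK_BASE + clos, delay);
   }
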

CAT and MBA high-level design in ACRN

@@ -139,7 +186,7 @@ CAT and MBA high-level design in ACRN

Data structures
===============
The figure below shows the RDT data structure used to store the enumerated resources.

.. figure:: images/mba_data_structures.png
   :align: center

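The per-resource record suggested by the figure can be approximated as
follows; only ``res_cap_info`` itself is named in the text, so the field
names here are illustrative assumptions.

.. code-block:: c

   #include <stdint.h>

   /* One RDT allocation capability record per enumerated resource
    * (L2, L3/LLC, MBA), filled in during BSP enumeration. */
   struct rdt_info_sketch {
       uint32_t clos_max;  /* CLOS IDs supported by this resource */
       uint32_t msr_base;  /* mask/delay MSR for CLOS 0 */
       uint64_t fit_mask;  /* widest valid capacity mask or max delay */
   };

   static struct rdt_info_sketch res_cap_info[3];
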
@@ -147,14 +194,28 @@ The below figure shows the RDT data structure to store the enumerated resources.
Enabling CAT, MBA software flow
===============================

The hypervisor enumerates RDT capabilities and sets up mask arrays; it also
sets up CLOS for VMs and the hypervisor itself per the "vm configuration"
(:ref:`rdt_vm_configuration`).

- The RDT capabilities are enumerated on the bootstrap processor (BSP) during
  the pCPU pre-initialize stage. The global data structure ``res_cap_info``
  stores the capabilities of the supported resources.

- If CAT and/or MBA is supported, then the mask arrays are set up on all APs
  at the pCPU post-initialize stage. The mask values are written to
  IA32_type_MASK_n. Refer to :ref:`rdt_detection_capabilities` for details
  on identifying values to program the mask/delay MSRs and the max CLOS.

- If CAT and/or MBA is supported, the CLOS of a **VM** will be stored into
  its vCPU ``msr_store_area`` data structure guest part. It will be loaded
  to MSR IA32_PQR_ASSOC at each VM entry.

- If CAT and/or MBA is supported, the CLOS of the **hypervisor** is stored for
  all VMs, in their vCPU ``msr_store_area`` data structure host part. It will
  be loaded to MSR IA32_PQR_ASSOC at each VM exit (see the sketch after this
  list).

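The entry/exit loading described in the last two bullets can be pictured
with a minimal sketch; the structure and helper names below are
illustrative assumptions, not the exact ACRN definitions.

.. code-block:: c

   #include <stdint.h>

   #define MSR_IA32_PQR_ASSOC 0xc8fU

   /* One VMX MSR-load entry: the CPU loads `value` into `msr_index`
    * automatically at VM entry (guest part) or VM exit (host part). */
   struct msr_store_entry_sketch {
       uint32_t msr_index;
       uint32_t reserved;
       uint64_t value;
   };

   struct msr_store_area_sketch {
       struct msr_store_entry_sketch guest[1]; /* loaded at VM entry */
       struct msr_store_entry_sketch host[1];  /* loaded at VM exit */
   };

   /* Program the vCPU's automatic CLOS switch. */
   static void setup_clos_switch(struct msr_store_area_sketch *a,
                                 uint16_t vm_clos, uint16_t hv_clos)
   {
       a->guest[0].msr_index = MSR_IA32_PQR_ASSOC;
       a->guest[0].value = ((uint64_t)vm_clos << 32); /* CLOS: bits 63:32 */
       a->host[0].msr_index = MSR_IA32_PQR_ASSOC;
       a->host[0].value = ((uint64_t)hv_clos << 32);
   }
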
The figure below shows a high-level overview of the RDT resource flow in the
ACRN hypervisor.

.. figure:: images/cat_mba_software_flow.png
   :align: center