doc: update Memory management HLD

Update HLD documentation with HLD 0.7 section 3.3 (Memory Management).
Add a referenced target link to hv-cpu-virt.rst

Tracked-on: #1590
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>

This commit is contained in commit bc7b06ae2f (parent 2f8c31f6b4).

@@ -1207,6 +1207,7 @@ APIs to register its IO/MMIO range:

- unregister an MMIO emulation handler for a hypervisor emulated device
  by specific MMIO range

.. _instruction-emulation:

Instruction Emulation
*********************

BIN  doc/developer-guides/hld/images/mem-image18.png  (new file, 52 KiB)
BIN  doc/developer-guides/hld/images/mem-image45.png  (new file, 33 KiB)
BIN  doc/developer-guides/hld/images/mem-image69.png  (new file, 19 KiB)
BIN  doc/developer-guides/hld/images/mem-image8.png   (new file, 56 KiB)
BIN  doc/developer-guides/hld/images/mem-image84.png  (new file, 161 KiB)

@@ -8,21 +8,30 @@ This document describes memory management for the ACRN hypervisor.

Overview
********

The hypervisor (HV) virtualizes real physical memory so that an unmodified OS
(such as Linux or Android) running in a virtual machine has the view of
managing its own contiguous physical memory. The HV uses virtual-processor
identifiers (VPIDs) and the extended page-table mechanism (EPT) to
translate guest-physical addresses into host-physical addresses. The HV
enables the EPT and VPID hardware virtualization features, establishes EPT
page tables for the SOS/UOS, and provides EPT page table operation
interfaces to other units.

In the ACRN hypervisor system, there are a few different memory spaces to
consider. From the hypervisor's point of view there are:

- **Host Physical Address (HPA)**: the native physical address space, and
- **Host Virtual Address (HVA)**: the native virtual address space based on
  an MMU. A page table is used to translate between the HPA and HVA
  spaces.

From the Guest OS running on a hypervisor there are:

- **Guest Physical Address (GPA)**: the guest physical address space from a
  virtual machine. GPA to HPA translation is usually based on an
  MMU-like hardware module (EPT in X86), and associated with a page
  table.
- **Guest Virtual Address (GVA)**: the guest virtual address space from a
  virtual machine based on a vMMU.

.. figure:: images/mem-image2.png

@@ -47,19 +56,25 @@ inside the hypervisor and from a VM:

- How the ACRN hypervisor manages SOS guest memory (HPA/GPA)
- How the ACRN hypervisor & SOS DM manage UOS guest memory (HPA/GPA)

Hypervisor Physical Memory Management
*************************************

In ACRN, the HV initializes MMU page tables to manage all physical
memory and then switches to the new MMU page tables. After the MMU page
tables are initialized at the platform initialization stage, no further
updates are made to them.

Hypervisor Physical Memory Layout - E820
========================================

The ACRN hypervisor is the primary owner and manager of system memory.
Typically the boot firmware (e.g., EFI) passes the platform physical
memory layout - the E820 table - to the hypervisor. The ACRN hypervisor
does its memory management based on this table using 4-level paging.

The BIOS/bootloader firmware (e.g., EFI) passes the E820 table through the
multiboot protocol. This table contains the original memory layout for
the platform.

.. figure:: images/mem-image1.png
   :align: center

@@ -69,38 +84,511 @@ This table contains the original memory layout for the platform.

   Physical Memory Layout Example

:numref:`mem-layout` is an example of the physical memory layout based on a
simple platform E820 table.

Hypervisor Memory Initialization
================================

The ACRN hypervisor runs in paging mode. After the bootstrap processor
(BSP) gets the platform E820 table, the BSP creates its MMU page tables
based on it. This is done by the functions *init_paging()* and
*enable_smep()*. After an application processor (AP) receives the IPI CPU
startup interrupt, it uses the MMU page tables created by the BSP and
enables SMEP. :numref:`hv-mem-init` describes the hypervisor memory
initialization for the BSP and APs.

.. figure:: images/mem-image8.png
   :align: center
   :name: hv-mem-init

   Hypervisor Memory Initialization
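
The flow above can be pictured with the minimal sketch below. The
``cpu_is_bsp()`` helper is a hypothetical placeholder; only
*init_paging()*, *enable_paging()*, *get_paging_pml4()*, and
*enable_smep()* correspond to the MMU initialization interfaces listed
later in this section.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Interfaces described later in this document. */
   void init_paging(void);
   void enable_paging(uint64_t pml4_base_addr);
   uint64_t get_paging_pml4(void);
   void enable_smep(void);

   /* Hypothetical helper: true only on the bootstrap processor. */
   extern bool cpu_is_bsp(void);

   /* Sketch of per-CPU paging bring-up: the BSP builds the shared MMU page
    * tables from the E820 table; every CPU (BSP and APs) then loads the
    * same PML4 and turns on SMEP. */
   static void cpu_paging_init(void)
   {
           if (cpu_is_bsp()) {
                   init_paging();             /* create MMU page tables from E820 */
           }
           enable_paging(get_paging_pml4());  /* switch to the shared page tables */
           enable_smep();                     /* supervisor-mode execution prevention */
   }
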
The memory mapping policy used is:

- Identical mapping (ACRN hypervisor memory could be relocatable in
  the future)
- Map all memory regions with UNCACHED type
- Remap RAM regions to WRITE-BACK type

.. figure:: images/mem-image69.png
   :align: center
   :name: hv-mem-vm-init

   Hypervisor Virtual Memory Layout

:numref:`hv-mem-vm-init` above shows:

- Hypervisor has a view of and can access all system memory
- Hypervisor has an UNCACHED MMIO/PCI hole reserved for devices such as
  LAPIC/IOAPIC access
- Hypervisor has its own memory with WRITE-BACK cache type for its
  code/data (the part below 1MB is for the secondary CPU reset code)

The hypervisor should use the minimum number of memory pages to map from
the virtual address space into the physical address space (see the sketch
after this list):

- If a 1GB hugepage can be used for the virtual address space mapping,
  the corresponding PDPT entry shall be set for this 1GB hugepage.
- If a 1GB hugepage can't be used for the virtual address space mapping
  and a 2MB hugepage can be used, the corresponding PDT entry shall be
  set for this 2MB hugepage.
- If neither a 1GB hugepage nor a 2MB hugepage can be used for the
  virtual address space mapping, the corresponding PT entry shall be set.

If the memory type or access rights of a page are updated, or some virtual
address space is deleted, the corresponding page will be split. The
hypervisor will still keep using the minimum number of memory pages to map
from the virtual address space into the physical address space.
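
A minimal sketch of that granularity selection follows. The helper name
and the alignment-only test are illustrative assumptions; a real
implementation must also check that the hardware supports 1GB pages and
that attributes are uniform across the whole hugepage.

.. code-block:: c

   #include <stdint.h>

   #define PTE_SIZE    0x1000UL        /* 4KB */
   #define PDE_SIZE    0x200000UL      /* 2MB */
   #define PDPTE_SIZE  0x40000000UL    /* 1GB */

   /* Pick the largest paging structure that can describe the next chunk of
    * a virtual-to-physical mapping (illustrative sketch only). */
   static uint64_t map_granularity(uint64_t hva, uint64_t hpa, uint64_t remaining)
   {
           if ((((hva | hpa) % PDPTE_SIZE) == 0UL) && (remaining >= PDPTE_SIZE)) {
                   return PDPTE_SIZE;   /* set a 1GB PDPT entry */
           }
           if ((((hva | hpa) % PDE_SIZE) == 0UL) && (remaining >= PDE_SIZE)) {
                   return PDE_SIZE;     /* set a 2MB PDT entry */
           }
           return PTE_SIZE;             /* fall back to a 4KB PT entry */
   }
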
Memory Pages Pool Functions
===========================

Memory pages pool functions provide dynamic management of multiple
4KB page-size memory blocks, used by the hypervisor to store internal
data. Through these functions, the hypervisor can allocate and
deallocate pages.
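
One possible shape of such a pool is sketched below, using a statically
sized array and a simple bitmap; the pool size and all names here are
illustrative, not the hypervisor's actual implementation.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>
   #include <string.h>

   #define PAGE_SIZE   4096U
   #define POOL_PAGES  1024U   /* illustrative pool size */

   static uint8_t page_pool[POOL_PAGES][PAGE_SIZE]
           __attribute__((aligned(PAGE_SIZE)));
   static uint64_t page_bitmap[POOL_PAGES / 64U];  /* 1 bit per page: 1 = in use */

   /* Allocate one zeroed 4KB page from the pool, or NULL when exhausted. */
   static void *pool_alloc_page(void)
   {
           for (uint32_t i = 0U; i < POOL_PAGES; i++) {
                   if ((page_bitmap[i / 64U] & (1UL << (i % 64U))) == 0UL) {
                           page_bitmap[i / 64U] |= (1UL << (i % 64U));
                           (void)memset(page_pool[i], 0, PAGE_SIZE);
                           return page_pool[i];
                   }
           }
           return NULL;
   }

   /* Return a page previously handed out by pool_alloc_page(). */
   static void pool_free_page(void *page)
   {
           uint32_t i = (uint32_t)(((uint8_t (*)[PAGE_SIZE])page) - page_pool);

           page_bitmap[i / 64U] &= ~(1UL << (i % 64U));
   }
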
Data Flow Design
================

The physical memory management unit provides services to other units for
creating and updating the MMU 4-level page tables, switching MMU page
tables, enabling SMEP, and translating between HPA and HVA.
:numref:`mem-data-flow-physical` shows the data flow diagram
of physical memory management.

.. figure:: images/mem-image45.png
   :align: center
   :name: mem-data-flow-physical

   Data Flow of Hypervisor Physical Memory Management

Data Structure Design
=====================

The page tables operation type:

.. code-block:: c

   enum _page_table_type {
        PTT_HOST = 0,            /* Operations for MMU page tables */
        PTT_EPT = 1,             /* Operations for EPT page tables */
        PAGETABLE_TYPE_UNKNOWN,  /* Page tables operation type is unknown */
   };

Interfaces Design
=================

MMU Initialization
------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - void enable_smep(void)
     - Enable supervisor-mode execution prevention (SMEP)
   * - void enable_paging(uint64_t pml4_base_addr)
     - Enable MMU paging
   * - void init_paging(void)
     - Initialize the MMU page tables
   * - uint64_t get_paging_pml4(void)
     - Get the page map level 4 (PML4) table start address

Page Allocation
---------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - void \*alloc_paging_struct(void)
     - Allocate one page from the memory page pool

Address Space Translation
-------------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - HPA2HVA(x)
     - Translate a host-physical address to a host-virtual address
   * - HVA2HPA(x)
     - Translate a host-virtual address to a host-physical address
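
Because the hypervisor maps itself with an identical mapping (as described
earlier in this section), these two macros can reduce to simple casts. The
definitions below are an illustrative sketch under that identity-mapping
assumption, not necessarily the hypervisor's actual definitions.

.. code-block:: c

   #include <stdint.h>

   /* Under the identical-mapping policy, an HVA and its HPA have the same
    * numeric value, so the translations are plain conversions (sketch). */
   #define HPA2HVA(x)  ((void *)(uint64_t)(x))
   #define HVA2HPA(x)  ((uint64_t)(x))
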
Hypervisor Memory Virtualization
********************************

The hypervisor provides a contiguous region of physical memory for the SOS
and for each UOS. It also guarantees that the SOS and UOS can not access
code and internal data in the hypervisor, and that each UOS can not access
code and internal data of the SOS and of other UOSs.

The hypervisor:

- enables the EPT and VPID hardware virtualization features,
- establishes EPT page tables for the SOS/UOS,
- provides EPT page table operation services,
- virtualizes MTRR for the SOS/UOS,
- provides VPID operation services,
- provides services for address space translation between GPA and HPA, and
- provides services for data transfer between the hypervisor and a virtual machine.

Memory Virtualization Capability Checking
=========================================

In the hypervisor, memory virtualization provides an EPT/VPID capability
checking service and an EPT hugepage support checking service. Before the
HV enables memory virtualization and uses EPT hugepages, these services
need to be invoked by other units.

Data Transfer between Different Address Spaces
==============================================

In ACRN, different memory space management is used in the hypervisor,
Service OS, and User OS to achieve spatial isolation. Between these memory
spaces there are different kinds of data transfer: for example, a SOS/UOS
may issue a hypercall to request hypervisor services that include data
transfer, or, when the hypervisor does instruction emulation, the HV
needs to access the guest instruction pointer register to fetch guest
instruction data.

Access GPA from Hypervisor
--------------------------

When the hypervisor needs to access a GPA range for data transfer, the
caller from the guest must make sure this memory range's GPA is
continuous. But its HPA in the hypervisor could be discontinuous
(especially for a UOS under the hugetlb allocation mechanism). For
example, a 4MB GPA range may map to two different 2MB huge host-physical
pages. The ACRN hypervisor must take care of this kind of data transfer
by doing EPT page walking based on its HPA.
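
The sketch below illustrates the chunking this implies: copy at most up to
the next page boundary, re-translate, and continue. ``gpa2hpa()`` is the
translation service described later in this section; ``hpa2hva()`` stands
in for the HPA-to-HVA conversion, and the fixed 4KB chunking granularity
is a simplifying assumption.

.. code-block:: c

   #include <stdint.h>
   #include <string.h>

   #define PAGE_SIZE 4096UL

   struct vm;                                            /* opaque VM handle      */
   uint64_t gpa2hpa(const struct vm *vm, uint64_t gpa);  /* EPT walk, see below   */
   void *hpa2hva(uint64_t hpa);                          /* HPA-to-HVA conversion */

   /* Copy 'size' bytes from a GPA-continuous guest buffer into the
    * hypervisor. The HPA behind the buffer may change at every page
    * boundary, so the GPA is re-translated chunk by chunk (sketch). */
   static void copy_from_gpa_sketch(const struct vm *vm, void *h_ptr,
                                    uint64_t gpa, uint64_t size)
   {
           uint8_t *dst = h_ptr;

           while (size > 0UL) {
                   uint64_t offset = gpa & (PAGE_SIZE - 1UL);
                   uint64_t chunk = PAGE_SIZE - offset;

                   if (chunk > size) {
                           chunk = size;
                   }
                   (void)memcpy(dst, hpa2hva(gpa2hpa(vm, gpa)), chunk);
                   gpa += chunk;
                   dst += chunk;
                   size -= chunk;
           }
   }
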
Access GVA from Hypervisor
--------------------------

When the hypervisor needs to access a GVA range for data transfer, it is
likely that both the GPA and the HPA could be discontinuous. The ACRN
hypervisor must watch for this kind of data transfer, and handle it by
doing page walking based on both its GPA and HPA.

EPT Page Tables Operations
==========================

The hypervisor should use the minimum number of memory pages to map from
guest-physical address (GPA) space into host-physical address (HPA)
space:

- If a 1GB hugepage can be used for the GPA space mapping, the
  corresponding EPT PDPT entry shall be set for this 1GB hugepage.
- If a 1GB hugepage can't be used for the GPA space mapping and a 2MB
  hugepage can be used, the corresponding EPT PDT entry shall be set for
  this 2MB hugepage.
- If neither a 1GB hugepage nor a 2MB hugepage can be used for the GPA
  space mapping, the corresponding EPT PT entry shall be set.

If the memory type or access rights of a page are updated, or some GPA
space is deleted, the corresponding EPT pages will be split. The
hypervisor should still keep using the minimum number of EPT pages to map
from GPA space into HPA space.

The hypervisor provides services to add, modify, delete, and invalidate
EPT guest-physical mappings, and to deallocate EPT page tables.

Virtual MTRR
************

In ACRN, the hypervisor only virtualizes the MTRR fixed range (0~1MB).
The HV sets the fixed-range MTRRs as Write-Back for a UOS, while the SOS
reads the native fixed-range MTRRs set by the BIOS.

If the guest physical address is not in the fixed range (0~1MB), the
hypervisor uses the default memory type in the MTRR (Write-Back).

When the guest disables MTRRs, the HV sets the guest address memory type
as UC.

If the guest physical address is in the fixed range (0~1MB), the HV sets
the memory type according to the fixed virtual MTRRs.

When the guest enables MTRRs, the guest MTRRs themselves have no direct
effect on the memory type used for accesses to a GPA; EPT is in control.
The HV therefore intercepts MTRR MSR accesses through MSR-access VM exits
and updates the memory type field in the EPT PTEs according to the memory
type selected by the MTRRs. This combines with the PAT entry in the PAT
MSR (which is determined by the PAT, PCD, and PWT bits from the guest
paging structures) to determine the effective memory type.
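
A condensed sketch of those type-selection rules follows. The per-64KB
fixed-range table is a deliberate simplification (real fixed-range MTRRs
use finer granularity below 1MB), and the structure and function names
are illustrative, not ACRN's.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   #define MEM_TYPE_UC  0x00U   /* uncacheable */
   #define MEM_TYPE_WB  0x06U   /* write-back  */

   /* Simplified virtual MTRR state for one vCPU: one memory type per 64KB
    * slice of the first 1MB plus a default type for everything else. */
   struct vmtrr_state {
           bool enabled;              /* guest MTRR enable bit            */
           uint8_t fixed_type[16];    /* memory type per 64KB, 0..1MB     */
           uint8_t default_type;      /* default type, Write-Back in ACRN */
   };

   /* Memory type the hypervisor would program into the EPT entry for 'gpa'. */
   static uint8_t vmtrr_mem_type(const struct vmtrr_state *st, uint64_t gpa)
   {
           if (!st->enabled) {
                   return MEM_TYPE_UC;                /* MTRRs disabled: UC */
           }
           if (gpa < 0x100000UL) {
                   return st->fixed_type[gpa >> 16];  /* fixed range 0~1MB  */
           }
           return st->default_type;                   /* default type (WB)  */
   }
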
VPID operations
===============

Virtual-processor identifier (VPID) is a hardware feature that optimizes
TLB management. When VPID is enabled, the hardware adds a tag to the TLB
entries of a logical processor and can cache information for multiple
linear-address spaces. VMX transitions may then retain cached information
when the logical processor switches to a different address space,
avoiding unnecessary TLB flushes.

In ACRN, a unique VPID must be allocated for each virtual CPU
when a virtual CPU is created. The logical processor invalidates linear
mappings and combined mappings associated with all VPIDs (except VPID
0000H), and with all PCIDs, when the logical processor launches the
virtual CPU. The logical processor invalidates all linear mappings and
combined mappings associated with the specified VPID when the interrupt
pending request handling needs to invalidate cached mappings of the
specified VPID.
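
A minimal allocation sketch consistent with the description above: VPID
0000H is associated with VMX root operation and is never handed out, so
allocation starts at 1. The counter-based scheme and the names are
assumptions for illustration; a real implementation would allocate
atomically.

.. code-block:: c

   #include <stdint.h>

   #define VPID_LIMIT 0xFFFFU

   /* Next VPID to hand out; VPID 0000H is reserved and never allocated. */
   static uint16_t next_vpid = 1U;

   /* Allocate a unique VPID for a newly created virtual CPU.
    * Returns 0 when the VPID space is exhausted, which the caller must
    * treat as "no VPID available". */
   static uint16_t allocate_vpid_sketch(void)
   {
           if (next_vpid == VPID_LIMIT) {
                   return 0U;
           }
           return next_vpid++;
   }
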
Data Flow Design
================

The memory virtualization unit includes address space translation
functions, data transfer functions, VM EPT operation functions,
VPID operation functions, VM exit handling for EPT violation and EPT
misconfiguration, and MTRR virtualization functions. This unit handles
guest-physical mapping updates by creating or updating the related EPT
page tables. It virtualizes MTRR for the guest OS by updating the related
EPT page tables. It handles address translation from GPA to HPA by
walking the EPT page tables. It copies data from a VM into the HV, or
from the HV to a VM, by walking the guest MMU page tables and the EPT
page tables. It provides services to allocate a VPID for each virtual CPU
and to invalidate TLB entries associated with a VPID. It handles VM exits
caused by EPT violation and EPT misconfiguration.
:numref:`mem-flow-mem-virt` describes the data flow diagram of
the memory virtualization unit.

.. figure:: images/mem-image84.png
   :align: center
   :name: mem-flow-mem-virt

   Data Flow of Hypervisor Memory Virtualization

Data Structure Design
=====================

EPT Memory Type Data Definition:

.. code-block:: c

   /* EPT memory type is specified in bits 5:3 of the last EPT
    * paging-structure entry */
   #define EPT_MT_SHIFT    3U

   /* EPT memory type is uncacheable */
   #define EPT_UNCACHED    (0UL << EPT_MT_SHIFT)

   /* EPT memory type is write combining */
   #define EPT_WC          (1UL << EPT_MT_SHIFT)

   /* EPT memory type is write through */
   #define EPT_WT          (4UL << EPT_MT_SHIFT)

   /* EPT memory type is write protected */
   #define EPT_WP          (5UL << EPT_MT_SHIFT)

   /* EPT memory type is write back */
   #define EPT_WB          (6UL << EPT_MT_SHIFT)

EPT Memory Access Right Definition:

.. code-block:: c

   /* EPT memory access right is read-only */
   #define EPT_RD          (1UL << 0U)

   /* EPT memory access right is read/write */
   #define EPT_WR          (1UL << 1U)

   /* EPT memory access right is executable */
   #define EPT_EXE         (1UL << 2U)

   /* EPT memory access right is read/write and executable */
   #define EPT_RWX         (EPT_RD | EPT_WR | EPT_EXE)
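
For illustration only, an EPT entry attribute for ordinary guest RAM would
combine one memory type with the access bits defined above; the macro name
below is an example of how the definitions compose, not an ACRN constant.

.. code-block:: c

   /* Illustrative composition: write-back memory type plus
    * read/write/execute access rights for a normal RAM mapping. */
   #define EPT_RAM_ATTR  (EPT_WB | EPT_RWX)
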
Interfaces Design
=================

The memory virtualization unit interacts with external units through VM
exits and APIs.

VM Exit about EPT
=================

There are two VM exit handlers in the hypervisor, one for EPT violation
and one for EPT misconfiguration. EPT page tables are always configured
correctly for the SOS and UOS. If an EPT misconfiguration is detected, a
fatal error is reported by the HV. The hypervisor uses EPT violations to
intercept MMIO accesses for device emulation. The EPT violation handling
data flow is described in :ref:`instruction-emulation`.

Memory Virtualization APIs
==========================

Here is a list of the major memory-related APIs in the HV:

EPT/VPID Capability Checking
----------------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - int check_vmx_mmu_cap(void)
     - Check the EPT and VPID capabilities

1GB Hugepage Support Checking
-----------------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - bool check_mmu_1gb_support(enum _page_table_type page_table_type)
     - Check the 1GB page support capability

Data Transfer between Hypervisor and VM
---------------------------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - int copy_from_gpa(const struct vm \*vm, void \*h_ptr, uint64_t gpa, uint32_t size)
     - Copy data from VM GPA space to HV address space
   * - int copy_to_gpa(const struct vm \*vm, void \*h_ptr, uint64_t gpa, uint32_t size)
     - Copy data from HV address space to VM GPA space
   * - int copy_from_gva(struct vcpu \*vcpu, void \*h_ptr, uint64_t gva,
       uint32_t size, uint32_t \*err_code, uint64_t \*fault_addr)
     - Copy data from VM GVA space to HV address space
   * - int copy_to_gva(struct vcpu \*vcpu, void \*h_ptr, uint64_t gva,
       uint32_t size, uint32_t \*err_code, uint64_t \*fault_addr)
     - Copy data from HV address space to VM GVA space
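
As an illustration of how these copy services might be used, the sketch
below fetches a hypercall parameter block that a guest passed by GPA. The
``struct hc_params`` layout and the function name are assumptions; only
``copy_from_gpa()`` is taken from the table above.

.. code-block:: c

   #include <stdint.h>

   struct vm;   /* opaque VM handle */

   /* Service from the table above. */
   int copy_from_gpa(const struct vm *vm, void *h_ptr, uint64_t gpa, uint32_t size);

   /* Hypothetical hypercall parameter block passed by the guest via GPA. */
   struct hc_params {
           uint64_t req_gpa;
           uint32_t req_len;
   };

   /* Fetch the guest's parameter block into hypervisor memory before using
    * it (illustrative sketch). */
   static int fetch_hc_params(const struct vm *vm, uint64_t param_gpa,
                              struct hc_params *out)
   {
           return copy_from_gpa(vm, out, param_gpa, (uint32_t)sizeof(*out));
   }
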
Address Space Translation
-------------------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - uint64_t gpa2hpa(const struct vm \*vm, uint64_t gpa)
     - Translate a guest-physical address to a host-physical address
   * - uint64_t hpa2gpa(const struct vm \*vm, uint64_t hpa)
     - Translate a host-physical address to a guest-physical address
   * - bool check_continuous_hpa(struct vm \*vm, uint64_t gpa_arg, uint64_t size_arg)
     - Check whether the host-physical addresses backing a GPA range are continuous

EPT
---

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - int ept_mr_add(const struct vm \*vm, uint64_t hpa_arg, uint64_t gpa_arg,
       uint64_t size, uint32_t prot_arg)
     - Map a guest-physical memory region
   * - int ept_mr_del(const struct vm \*vm, uint64_t \*pml4_page, uint64_t gpa,
       uint64_t size)
     - Unmap a guest-physical memory region
   * - int ept_mr_modify(const struct vm \*vm, uint64_t \*pml4_page, uint64_t gpa,
       uint64_t size, uint64_t prot_set, uint64_t prot_clr)
     - Update the access rights or memory type of guest-physical memory pages
   * - void destroy_ept(struct vm \*vm)
     - Destroy the EPT page tables
   * - void free_ept_mem(void \*pml4_addr)
     - Free the EPT page tables
   * - void invept(struct vcpu \*vcpu)
     - Invalidate guest-physical mappings and combined mappings
   * - int ept_violation_vmexit_handler(struct vcpu \*vcpu)
     - Handle EPT violation VM exits
   * - int ept_misconfig_vmexit_handler(__unused struct vcpu \*vcpu)
     - Handle EPT misconfiguration VM exits
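
As a usage illustration, the sketch below maps a guest RAM region as
write-back with full access and later removes it again. The addresses and
the ``vm_ept_pml4()`` accessor are made-up assumptions; the API names and
the EPT attribute bits come from this document.

.. code-block:: c

   #include <stdint.h>

   struct vm;   /* opaque VM handle */

   /* APIs from the table above. */
   int ept_mr_add(const struct vm *vm, uint64_t hpa_arg, uint64_t gpa_arg,
                  uint64_t size, uint32_t prot_arg);
   int ept_mr_del(const struct vm *vm, uint64_t *pml4_page, uint64_t gpa,
                  uint64_t size);

   /* Hypothetical accessor for the VM's EPT PML4 page. */
   extern uint64_t *vm_ept_pml4(const struct vm *vm);

   /* Attribute bits defined earlier in this section. */
   #define EPT_MT_SHIFT 3U
   #define EPT_WB       (6UL << EPT_MT_SHIFT)
   #define EPT_RD       (1UL << 0U)
   #define EPT_WR       (1UL << 1U)
   #define EPT_EXE      (1UL << 2U)

   /* Map 2MB of guest RAM at GPA 0x100000 onto HPA 0x40000000 as write-back
    * RWX, then unmap it again (illustrative sketch with made-up addresses). */
   static int map_and_unmap_example(const struct vm *vm)
   {
           uint32_t prot = (uint32_t)(EPT_WB | EPT_RD | EPT_WR | EPT_EXE);
           int err = ept_mr_add(vm, 0x40000000UL, 0x100000UL, 0x200000UL, prot);

           if (err == 0) {
                   err = ept_mr_del(vm, vm_ept_pml4(vm), 0x100000UL, 0x200000UL);
           }
           return err;
   }
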
Virtual MTRR
------------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - void init_mtrr(struct vcpu \*vcpu)
     - Initialize the virtual MTRRs
   * - void mtrr_wrmsr(struct vcpu \*vcpu, uint32_t msr, uint64_t value)
     - Virtual MTRR MSR write
   * - uint64_t mtrr_rdmsr(struct vcpu \*vcpu, uint32_t msr)
     - Virtual MTRR MSR read

VPID
----

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - APIs
     - Description
   * - uint16_t allocate_vpid(void)
     - Allocate a VPID
   * - void flush_vpid_single(uint16_t vpid)
     - Flush cached mappings for the specified VPID
   * - void flush_vpid_global(void)
     - Flush cached mappings for all VPIDs

Service OS Memory Management
****************************

@@ -132,9 +620,8 @@ Host to Guest Mapping

Host to Guest Mapping
=====================

The ACRN hypervisor creates the Service OS's host (HPA) to guest (GPA)
mapping (EPT mapping) through the function
``prepare_vm0_memmap_and_e820()`` when it creates the SOS VM. It follows
these rules:

- Identical mapping
- Map all memory ranges with UNCACHED type

@@ -148,101 +635,14 @@ can access its MMIO through this static mapping. EPT violation is only

serving for vLAPIC/vIOAPIC's emulation in the hypervisor for Service OS
VM.

User OS Memory Management
*************************

A User OS VM is created by the DM (Device Model) application running in
the Service OS. The DM is responsible for the memory allocation for a
User or Guest OS VM.

Guest Physical Memory Layout - E820
===================================

The DM creates the E820 table for a User OS VM based on these simple
rules (see the sketch after this list):

- If the requested VM memory size is below the low memory limitation
  (defined in the DM as 2GB), then low memory range = [0, requested VM
  memory size]
- If the requested VM memory size is above the low memory limitation
  (defined in the DM as 2GB), then low memory range = [0, 2GB], high
  memory range = [4GB, 4GB + requested VM memory size - 2GB]
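
The two rules above can be written out as the sketch below;
``UOS_LOWMEM_LIMIT``, the structure, and the function name are
illustrative stand-ins, not the DM's actual identifiers.

.. code-block:: c

   #include <stdint.h>

   #define GB(x)             ((uint64_t)(x) << 30)
   #define UOS_LOWMEM_LIMIT  GB(2)   /* "low memory limitation" defined in DM */
   #define UOS_HIGHMEM_BASE  GB(4)

   struct uos_mem_ranges {
           uint64_t lowmem_size;     /* RAM starting at GPA 0                 */
           uint64_t highmem_size;    /* RAM starting at GPA 4GB (0 if unused) */
   };

   /* Split a requested UOS memory size into the low/high ranges used to
    * build the guest E820 table (illustrative sketch of the rules above). */
   static struct uos_mem_ranges uos_e820_split(uint64_t requested_size)
   {
           struct uos_mem_ranges r;

           if (requested_size <= UOS_LOWMEM_LIMIT) {
                   r.lowmem_size = requested_size;
                   r.highmem_size = 0UL;
           } else {
                   r.lowmem_size = UOS_LOWMEM_LIMIT;
                   r.highmem_size = requested_size - UOS_LOWMEM_LIMIT;
           }
           return r;
   }
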
.. figure:: images/mem-image6.png
   :align: center
   :width: 900px
   :name: uos-mem-layout

   UOS Physical Memory Layout

The DM does the UOS memory allocation based on the hugeTLB mechanism by
default. The real memory mapping may be scattered in the SOS physical
memory space, as shown below:

.. figure:: images/mem-image5.png
   :align: center
   :width: 900px
   :name: uos-mem-layout-hugetlb

   UOS Physical Memory Layout Based on Hugetlb

Host to Guest Mapping
=====================

A User OS VM's memory is allocated by the Service OS DM application, and
may come from different huge pages in the Service OS as shown in
:ref:`uos-mem-layout-hugetlb`.

As the Service OS has the full information about these huge pages (size,
SOS-GPA, and UOS-GPA), it works with the hypervisor to complete the UOS's
host-to-guest mapping using this pseudo code:

.. code-block:: c

   for x in allocated huge pages do
      x.hpa = gpa2hpa_for_sos(x.sos_gpa)
      host2guest_map_for_uos(x.hpa, x.uos_gpa, x.size)
   end

Trusty
******

For an Android User OS, there is a secure world named trusty world
support, whose memory must be secured by the ACRN hypervisor and
must not be accessible by the SOS and the UOS normal world.

.. figure:: images/mem-image18.png
   :align: center

   UOS Physical Memory Layout with Trusty