The statically configured vm_configs[].cpu_affinity_bitmap should remain
intact during the life cycle of the SOS, otherwise user can't destroy
and create the same VM with different CPU affinity. For example:
- Initially vm_configs[1].cpu_affinity_bitmap is set to 0xF: pCPU 0/1/2/3.
- VM1 is created on pCPU1 and pCPU2 and vm_configs[1].cpu_affinity_bitmap
is overwritten as 0x6.
- VM1 is destroyed.
- Now VM1 can't be launched again on pCPU0 or pCPU3.
This patch fixes this by saving the static VM configuration before the
create_vm hypercall and restore it when the post-launched VM is shutting
down.
Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Either INVALID_GPA or NULL returned from local_gpa2hpa means the
page walk failure. But current code only take care of NULL and
leave INVALID_GPA not detected.
It could trigger ACRN crash in root mode when guest have a invalid
gva.
We add INVALID_GPA check as well.
Tracked-On: #4721
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Also clear all accessed bit in pagetables when the VM do wbinvd. Then
the next wbinvd will benefit from it.
Tracked-On: #4703
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping.
While the new cpu_affinity_bitmap doesn't explicitly sepcify this
mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU
with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and
so on.
This makes it possible for post-launched VM to run vCPUs on a subset of
these pCPUs only, and not all of them.
acrn-dm may launch post-launched VMs with the current approach: indicate
VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked
in cpu_affinity_bitmap.
Also acrn-dm can choose to launch the VM on a subset of PCPUs that is
defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the
subset of PCPUs in the CREATE_VM hypercall.
Additionally, with this change, a guest's vcpu_num can be easily calculated
from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c.
Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The current logic puts hpa2 above GPA 4G always, which is incorrect. Need
to set gpa start of hpa2 right after hpa1 when hpa1 size is less then 2G;
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]),
one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property
and use the containing VM's vm_id as the index to the vCPU array:
In create_vcpu(), we simply do:
per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu;
In offline_vcpu():
per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL;
so basically we use the containing VM's vm_id as the index to the vCPU array,
as well as the index of posted interrupt IRQ/vector pair that are assigned
to this vCPU:
0: first vCPU and first posted interrupt IRQs/vector pair
(POSTED_INTR_IRQ/POSTED_INTR_VECTOR)
...
CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair
((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U)
In the posted interrupt handler, it will do the following:
Translate the IRQ into a zero based index of where the vCPU
is located in the vCPU list for current pCPU. Once the
vCPU is found, we wake up the waiting thread and record
this request as ACRN_REQUEST_EVENT
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
This is a preparation patch for adding support for VT-d PI
related vCPU scheduling.
ACRN does not support vCPU migration, one vCPU always runs on
the same pCPU, so PI's ndst is never changed after startup.
VCPUs of a VM won’t share same pCPU. So the maximum possible number
of VCPUs that can run on a pCPU is CONFIG_MAX_VM_NUM.
Allocate unique Activation Notification Vectors (ANV) for each vCPU
that belongs to the same pCPU, the ANVs need only be unique within each
pCPU, not across all vCPUs. This reduces # of pre-allocated ANVs for
posted interrupts to CONFIG_MAX_VM_NUM, and enables ACRN to avoid
switching between active and wake-up vector values in the posted
interrupt descriptor on vCPU scheduling state changes.
A total of CONFIG_MAX_VM_NUM consecutive IRQs/vectors are reserved
for posted interrupts use.
The code first initializes vcpu->arch.pid.control.bits.nv dynamically
(will be added in subsequent patch), the other code shall use
vcpu->arch.pid.control.bits.nv instead of the hard-coded notification vectors.
Rename some functions:
apicv_post_intr --> apicv_trigger_pi_anv
posted_intr_notification --> handle_pi_notification
setup_posted_intr_notification --> setup_pi_notification
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Given the vcpumask, check if the IRQ is single destination
and return the destination vCPU if so, the address of associated PI
descriptor for this vCPU can then be passed to dmar_assign_irte() to
set up the posted interrupt IRTE for this device.
For fixed mode interrupt delivery, all vCPUs listed in vcpumask should
service the interrupt requested. But VT-d PI cannot support multicast/broadcast
IRQs, it only supports single CPU destination. So the number of vCPUs
shall be 1 in order to handle IRQ in posted mode for this device.
Add pid_paddr to struct intr_source. If platform_caps.pi is true and
the IRQ is single-destination, pass the physical address of the destination
vCPU's PID to ptirq_build_physical_msi and dmar_assign_irte
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Add platform_caps.c to maintain platform related information
Set platform_caps.pi to true if all iommus are posted interrupt capable, false
otherwise
If lapic passthru is not configured and platform_caps.pi is true, the vm
may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is
single-destination
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Exend union dmar_ir_entry to support VT-d posted interrupts.
Rename some fields of union dmar_ir_entry:
entry --> value
sw_bits --> avail
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Pass intr_src and dmar_ir_entry irte as pointers to dmar_assign_irte(),
which fixes the "Attempt to change parameter passed by value" MISRA C violation.
A few coding style fixes
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
For CPU side posted interrupts, it only uses bit 0 (ON) of the PI's 64-bit control
, other bits are don't care. This is not the case for VT-d posted
interrupts, define more bit fields for the PI's 64-bit control.
Use bitmap functions to manipulate the bit fields atomically.
Some MISRA-C violation and coding style fixes
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
The posted interrupt descriptor is more of a vmx/vmcs concept than a vlapic
concept. struct acrn_vcpu_arch stores the vmx/vmcs info, so put struct pi_desc
in struct acrn_vcpu_arch.
Remove the function apicv_get_pir_desc_paddr()
A few coding style/typo fixes
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Rename struct vlapic_pir_desc to pi_desc
Rename struct member and local variable pir_desc to pid
pir=posted interrupt request, pi=posted interrupt
pid=posted interrupt descriptor
pir is part of pi descriptor, so it is better to use pi instead of pir
struct pi_desc will be moved to vmx.h in subsequent commit.
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Waag will send NMIs to all its cores during reboot. But currently,
NMI cannot be injected to vcpu which is in HLT state.
To fix the problem, need to wakeup target vcpu, and inject NMI through
interrupt-window.
Tracked-On: #4620
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
EPT table can be changed concurrently by more than one vcpus.
This patch add a lock to protect the add/modify/delete operations
from different vcpus concurrently.
Tracked-On: #4253
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
As ACRN prepares to support servers with large amounts of memory
current logic to allocate space for 4K pages of EPT at compile time
will increase the size of .bss section of ACRN binary.
Bootloaders could run into a situation where they cannot
find enough contiguous space to load ACRN binary under 4GB,
which is typically heavily fragmented with E820 types Reserved,
ACPI data, 32-bit PCI hole etc.
This patch does the following
1) Works only for "direct" mode of vboot
2) reserves space for 4K pages of EPT, after boot by parsing
platform E820 table, for all types of VMs.
Size comparison:
w/o patch
Size of DRAM Size of .bss
48 GB 0xe1bbc98 (~226 MB)
128 GB 0x222abc98 (~548 MB)
w/ patch
Size of DRAM Size of .bss
48 GB 0x1991c98 (~26 MB)
128 GB 0x1a81c98 (~28 MB)
Tracked-On: #4563
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For SOS VM, when the target platform has multiple IO-APICs, there
should be equal number of virtual IO-APICs.
This patch adds support for emulating multiple vIOAPICs per VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
As ACRN prepares to support platforms with multiple IO-APICs,
GSI is a better way to represent physical and virtual INTx interrupt
source.
1) This patch replaces usage of "pin" with "gsi" whereever applicable
across the modules.
2) PIC pin to gsi is trickier and needs to consider the usage of
"Interrupt Source Override" structure in ACPI for the corresponding VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Reverts 538ba08c: hv:Add vpin to ptdev entry mapping for vpic/vioapic
ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin
and an array of size per VM to store ptirq entries against the vPIC pin.
This is done to speed up "ptirq entry" lookup at runtime for Level triggered
interrupts in API ptirq_intx_ack used on EOI.
This patch switches the lookup API for INTx interrupts to the API,
ptirq_lookup_entry_by_sid
This could add delay to processing EOI for Level triggered interrupts.
Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes
per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for
SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around
10 such arrays per VM.
Removes the need of "pic_pin" except for the APIs facing the hypercalls
hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
This commit allows hypervisor to allocate cache to vcpu by assigning different clos
to vcpus of a same VM.
For example, we could allocate different cache to housekeeping core and real-time core
of an RTVM in order to isolate the interference of housekeeping core via cache hierarchy.
Tracked-On: #4566
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Chen, Zide <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ve820.c is a common file in arch/x86/guest/ now, so move function of
create_sos_vm_e820() to this file to make code structure clear;
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched
VM specific e820 entries according to memory configuration of customer.
It should be a scenario based configurations but we had to put it in per
board foler because of different board memory settings. This brings concerns
to customer on configuration orgnization.
Currently the file provides same e820 layout for all pre-launched VMs, but
they should have different e820 when their memory are configured differently.
Although we have acrn-config tool to generate ve802.c automatically, it
is not friendly to modify hardcoded ve820 layout manually, so the patch
changes the entries initialization method by calculating each entry item
in C code.
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can
be IOAPIC or PIC
2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt
controller
2. Changes the type of src member of source_id.intx_id from uint32_t to
enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC
Tracked-On: #4447
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors
assembler to individual functions.
Tracked-On: #4436
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The init value for XCR0 and XSS should be the same with spec:
In SDM Vol1 13.3:
XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is
always 1. The other bits in XCR0 are all 0 coming out of RESET.
The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET.
The previous code try to fix the xsave area leak to other VMs during init
phase, but bring the error to linux. Besides, it cannot avoid the
possible leak in running phase. Need find a better solution.
Tracked-On: #4430
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
There can be times when user unknowinlgy enables
CONFIG_CAT_ENBALED SW flag, but the hardware might
not support L3 or L2 CAT. In such case software can
end up writing to the CAT MSRs which can cause
undefined results. The patch fixes the issue by
enabling CAT only when both HW as well software
via the CONFIG_CAT_ENABLED supports CAT.
The patch also address typo with "clos2prq_msr"
function name. It should be "clos2pqr_msr" instead.
PQR stands for platform qos register.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
As part of rdt cat refactoring, goal is to combine all rdt
specific features such as CAT under one module. So renaming
rdt resouce specific files such as cat.c/.h to generic rdt.c/.h
files.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Rename BOOT_CPU_ID to BSP_CPU_ID
2. Repace hardcoded value with BSP_CPU_ID when
ID of BSP is referenced.
Tracked-On: #4420
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Now we split passthrough PCI device from DM to HV, we could remove all the passthrough
PCI device unused code.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In this case, we could handle all the passthrough PCI devices in ACRN hypervisor.
But we still need DM to initialize BAR resources and Intx for passthrough PCI
device for post-launched VM since these informations should been filled into
ACPI tables. So
1. we add a HC vm_assign_pcidev to pass the extra informations to replace the old
vm_assign_ptdev.
2. we saso remove HC vm_set_ptdev_msix_info since it could been setted by the post-launched
VM now same as SOS.
3. remove vm_map_ptdev_mmio call for PTDev in DM since ACRN hypervisor will handle these
BAR access.
4. the most important thing is to trap PCI configure space access for PTDev in HV for
post-launched VM and bypass the virtual PCI device configure space access to DM.
This patch doesn't do the clean work. Will do it in the next patch.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Add assign/deassign PCI device hypercall APIs to assign a PCI device from SOS to
post-launched VM or deassign a PCI device from post-launched VM to SOS. This patch
is prepared for spliting passthrough PCI device from DM to HV.
The old assign/deassign ptdev APIs will be discarded.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
On UEFI UP2 board, APs might execute HLT before SOS kernel INIT them.
After SOS kernel take over and will re-init the APs directly. The flows
from HV perspective is like:
HLT trap:
wait_event(VCPU_EVENT_VIRTUAL_INTERRUPT) -> sleep_thread
SOS kernel INIT, SIPI APs:
pause_vcpu(ZOMBIE) -> sleep_thread
-> reset_vcpu
-> launch_vcpu -> wake_vcpu
However, the last wake_vcpu will fail because the cpu event
VCPU_EVENT_VIRTUAL_INTERRUPT had not got signaled.
This patch will reset all vcpu events in reset_vcpu. If the thread was
previously waiting for a event, its waiting status will be cleared and
launch_vcpu will wake it to running.
Tracked-On: #4402
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Align the coding style for these MACROs
2. Align the values of fixed VECTORs
Tracked-On: #4348
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
In lapic passthrough mode, it should passthrough HLT/PAUSE execution
too. This patch disable their emulation when switch to lapic passthrough mode.
Tracked-On: #4329
Tested-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
is_polling_ioreq is more straightforward. Rename it.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
SOS will use PCIe ECAM access PCIe external configuration space. HV should trap this
access for security(Now pre-launched VM doesn't want to support PCI ECAM; post-launched
VM trap PCIe ECAM access in DM).
Besides, update PCIe MMCONFIG region to be owned by hypervisor and expose and pass through
platform hide PCI devices by BIOS to SOS.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
HLT emulation is import to CPU resource maximum utilization. vcpu
doing HLT means it is idle and can give up CPU proactively. Thus, we
pause the vcpu thread in HLT emulation and resume it while event happens.
When vcpu enter HLT, its vcpu thread will sleep, but the vcpu state is
still 'Running'.
VM ID PCPU ID VCPU ID VCPU ROLE VCPU STATE
===== ======= ======= ========= ==========
0 0 0 PRIMARY Running
0 1 1 SECONDARY Running
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Sometimes HV wants to know if there are pending interrupts of one vcpu.
Add .has_pending_intr interface in acrn_apicv_ops and return the pending
interrupts status by check IRRs of apicv.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Introduce two kinds of events for each vcpu,
VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion
VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events
vcpu can wait for such events, and resume to run when the
event get signalled.
This patch also change IO request waiting/notifying to this way.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
As we enabled cpu sharing, PAUSE-loop exiting can help vcpu
to release its pcpu proactively. It's good for performance.
VMX_PLE_GAP: upper bound on the amount of time between two successive
executions of PAUSE in a loop.
VMX_PLE_WINDOW: upper bound on the amount of time a guest is allowed to
execute in a PAUSE loop
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In current code, wait_pcpus_offline() and make_pcpu_offline() are called by
both shutdown_vm() and reset_vm(), but this is not needed when lapic_pt is
not enabled for the vcpus of the VM.
The patch merged offline pcpus part code into a common
offline_lapic_pt_enabled_pcpus() api for shutdown_vm() and reset_vm() use and
called only when lapic_pt is enabled.
Tracked-On: #4325
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. This patch passes-through CR4.PCIDE to guest VM.
2. This patch handles the invlidation of TLB and the paging-structure caches.
According to SDM Vol.3 4.10.4.1, the following instructions invalidate
entries in the TLBs and the paging-structure caches:
- INVLPG: this instruction is passed-through to guest, no extra handling needed.
- INVPCID: this instruction is passed-trhough to guest, no extra handling needed.
- CR0.PG from 1 to 0: already handled by current code, change of CR0.PG will do
EPT flush.
- MOV to CR3: hypervisor doesn't trap this instrcution, no extra handling needed.
- CR4.PGE changed: already handled by current code, change of CR4.PGE will no EPT
flush.
- CR4.PCIDE from 1 to 0: this patch handles this case, will do EPT flush.
- CR4.PAE changed: already handled by current code, change of CR4.PAE will do EPT
flush.
- CR4.SEMP from 1 to 0, already handled by current code, change of CR4.SEMP will
do EPT flush.
- Task switch: Task switch is not supported in VMX non-root mode.
- VMX transitions: already handled by current code with the support of VPID.
3. This patch checks the validatiy of CR0, CR4 related to PCID feature.
According to SDM Vol.3 4.10.1, CR.PCIDE can be 1 only in IA-32e mode.
- MOV to CR4 causes a general-protection exception (#GP) if it would change CR4.PCIDE
from 0 to 1 and either IA32_EFER.LMA = 0 or CR3[11:0] ≠ 000H
- MOV to CR0 causes a general-protection exception if it would clear CR0.PG to 0
while CR4.PCIDE = 1
Tracked-On: #4296
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
According to SDM Vol.3 Section 25.3, behavior of the INVPCID
instruction is determined first by the setting of the “enable
INVPCID” VM-execution control:
- If the “enable INVPCID” VM-execution control is 0, INVPCID
causes an invalid-opcode exception (#UD).
- If the “enable INVPCID” VM-execution control is 1, treatment
is based on the setting of the “INVLPG exiting” VM-execution
control:
* If the “INVLPG exiting” VM-execution control is 0, INVPCID
operates normally.
* If the “INVLPG exiting” VM-execution control is 1, INVPCID
causes a VM exit.
In current implementation, hypervisor doesn't set “INVLPG exiting”
VM-execution control, this patch sets “enable INVPCID” VM-execution
control to 1 when the instruction is supported by physical cpu.
If INVPCID is supported by physical cpu, INVPCID will not cause VM
exit in VM.
If INVPCID is not supported by physical cpu, INVPCID causes an #UD
in VM.
When INVPCID is passed-through to VM, According to SDM Vol.3 28.3.3.1,
INVPCID instruction invalidates linear mappings and combined mappings.
They are required to do so only for the current VPID.
HV assigned a unique vpid for each vCPU, if guest uses wrong PCID,
it would not affect other vCPUs.
Tracked-On: #4296
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>