change the input parameter from vcpu to eptp in order to let this api
more generic, no need to care normal world or secure world.
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Currently, the clos id of the cpu cores in vmx root mode is the same as non-root mode.
For RTVM, if hypervisor share the same clos id with non-root mode, the cacheline may
be polluted due to the hypervisor code execution when vmexit.
The patch adds hv_clos in vm_configurations.c
Hypervisor initializes clos setting according to hv_clos during physical cpu cores initialization.
For RTVM, MSR auto load/store areas are used to switch different settings for VMX root/non-root
mode for RTVM.
Tracked-On: #2462
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
-- remove some unnecessary includes
-- fix a typo
-- remove unnecessary void before launch_vms
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
ACRN HV is designed/implemented with "invariant TSC" capability, which wasn't checked at boot time.
This commit adds the "invairant TSC" detection, ACRN fails to boot if there wasn't "invariant TSC" capability.
Tracked-On: #3636
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now that ACPI is enabled for pre-launched VMs, we can remove all mptable code.
Tracked-On: #3601
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Statically define the per vm RSDP/XSDT/MADT ACPI template tables in vacpi.c,
RSDP/XSDT tables are copied to guest physical memory after checksum is
calculated. For MADT table, first fix up process id/lapic id in its lapic
subtable, then the MADT table's checksum is calculated before it is copies to
guest physical memory.
Add 8-bit checksum function in util.h
Tracked-On: #3601
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
EPT tables are shared by MMU and IOMMU.
Some IOMMUs don't support page-walk coherency, the cpu cache of EPT entires
should be flushed to memory after modifications, so that the modifications
are visible to the IOMMUs.
This patch adds a new interface to flush the cache of modified EPT entires.
There are different implementations for EPT/PPT entries:
- For PPT, there is no need to flush the cpu cache after update.
- For EPT, need to call iommu_flush_cache to make the modifications visible
to IOMMUs.
Tracked-On: #3607
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
VT-d shares the EPT tables as the second level translation tables.
For the IOMMUs that don't support page-walk coherecy, cpu cache should
be flushed for the IOMMU EPT entries that are modified.
For the current implementation, EPT tables for translating from GPA to HPA
for EPT/IOMMU are not modified after VM is created, so cpu cache invlidation is
done once per VM before starting execution of VM.
However, this may be changed, runtime EPT modification is possible.
When cpu cache of EPT entries is invalidated when modification, there is no need
invalidate cpu cache globally per VM.
This patch exports iommu_flush_cache for EPT entry cache invlidation operations.
- IOMMUs share the same copy of EPT table, cpu cache should be flushed if any of
the IOMMU active doesn't support page-walk coherency.
- In the context of ACRN, GPA to HPA mapping relationship is not changed after
VM created, skip flushing iotlb to avoid potential performance penalty.
Tracked-On: #3607
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
-- move 'RFLAGS_AC' to cpu.h
-- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h
and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST'
-- move 'get_vcpu_mode' to vcpu.h
-- remove deadcode 'vmx_eoi_exit()'
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
move some data structures and APIs related host reset
from vm_reset.c to pm.c, these are not related with guest.
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Now, we use native gdt saved in boot context for guest and assume
it could be put to same address of guest. But it may not be true
after the pre-launched VM is introduced. The gdt for guest could
be overwritten by guest images.
This patch make 32bit protect mode boot not use saved boot context.
Insteadly, we use predefined vcpu_regs value for protect guest to
initialize the guest bsp registers and copy pre-defined gdt table
to a safe place of guest memory to avoid gdt table overwritten by
guest images.
Tracked-On: #3532
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
fix violations touched below:
1.Cast operation on a constant value
2.signed/unsigned implicity conversion
3.return value unused.
V1->V2:
1.bitmap api will return boolean type, not need to check "!= 0", deleted.
2.The behaves ~(uint32_t)X and (uint32_t)~X are not defined in ACRN hypervisor Coding Guidelines,
removed the change of it.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
This patch moves vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c,
so that these two functions would become internal functions inside
vmsr.c.
This approach improves the modularity.
v1 -> v2:
* remove 'vmx_rdmsr_pat'
* rename 'vmx_wrmsr_pat' with 'write_pat_msr'
Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Add a field (vdev_ops) in struct acrn_vm_pci_dev_config to configure a PCI CFG
operation for an emulated PCI device. Use pci_pt_dev_ops for PCI_DEV_TYPE_PTDEV
by default if there's no such configure.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add emulated PCI device configure for SOS to prepare for add support for customizing
special pci operations for each emulated PCI device.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Align SOS pci device configure with pre-launched VM and filter pre-launched VM's
PCI PT device from SOS pci device configure.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
pci_dev_config in VM configure stores all the PCI devices for a VM. Besides PT
devices, there're other type devices, like virtual host bridge. So rename ptdev
to pci_dev for these configure.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In some case, guest need to get more information under virtual environment,
like guest capabilities. Basically this could be done by hypercalls, but
hypercalls are designed for trusted VM/SOS VM, We need a machenism to report
these information for normal VMs. In this patch, vCPUID leaf 0x40000001 will
be used to satisfy this needs that report some extended information for guest
by CPUID.
Tracked-On: #3498
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently the Px Cx supported SoCs which listed in cpu_state_tbl.c is limited,
and it is not a wise option to build a huge state table data base to support
Px/Cx for other SoCs. This patch give a alternative solution that build a board
specific cpu state table in board.c which could be auto-generated by offline
tool, then the CPU Px/Cx of customer board could be enabled;
Hypervisor will search the cpu state table in cpu_state_tbl[] first, if not
found then go check board_cpu_state_tbl. If no matched cpu state table is found
then Px/Cx will not be supported;
Tracked-On: #3477
Signed-off-by: Victor Sun <victor.sun@intel.com>
After "commit f0e1c5e init vcpu host stack when reset vcpu", SOS resume form S3
wants to schedule to vcpu_thread not the point where SOS enter S3. So we should
schedule to idel first then reschedule to execute vcpu_thread.
Tracked-On: #3387
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Based on SDM Vol2 the monitor uses the RAX register to setup the address
monitored by HW. The mwait uses the rax/rcx as the hints that the process
will enter. It is incorrect that the same value is used for monitor/mwait.
The ecx in mwait specifies the optional externsions.
At the same time it needs to check whether the the value of monitored addr
is already expected before entering mwait. Otherwise it will have possible
lockup.
V1->V2: Add the asm wrappper of monitor/mwait to avoid the mixed usage of
inline assembly in wait_sync_change
v2-v3: Remove the unnecessary line break in asm_monitor/asm_mwait.
Follow Fei's comment to remove the mwait ecx hint setting that
treats the interrupt as break event. It only needs to check whether the
value of psync_change is already expected.
Tracked-On: #3442
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
When need hpa and hva translation before init_paging, we need hpa2hva_early and
hva2hpa_early since init_paging may modify hva2hpa to not be identical mapping.
Tracked-On: #2987
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
softirq shouldn't be bounded to vcpu thread. One issue for this
is shell (based on timer) can't work if we don't start any guest.
This change also is trying best to make softirq handler running
with irq enabled.
Also update the irq disable/enabel in vmexit handler to align
with the usage in vcpu_thread.
Tracked-On: #3387
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
This commit adds ops to vlapic structure, and add an *ops parameter to vlapic_reset().
At vlapic reset, the ops is set to the global apicv_ops, and may be assigned
to other ops later.
Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
When initialize secondary pcpu, pass INVALID_CPU_ID as param of init_pcpu_pre()
looks weird, so change the param type to bool to represent whether the pcpu is
a BSP or AP.
Tracked-On: #3420
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In x86 architecture, word/doubleword/quadword aligned read/write on its boundary
is atomic, so we may remove atomic load/store.
As for atomic set/clear, use bitmap_set/claer seems more reasonable. After replace
them all, we could remove them too.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
vCPU schedule state change is under schedule lock protection. So there's no need
to be atomic.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ACRN Coding guidelines requires two different types pointer can't
convert to each other, except void *.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ACRN Coding guidelines requires two different types pointer can't
convert to each other, except void *.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
ACRN Coding guidelines requires type conversion shall be explicity.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In almost case, vLAPIC will only be accessed by the related vCPU. There's no
synchronization issue in this case. However, other vCPUs could deliver interrupts
to the current vCPU, in this case, the IRR (for APICv base situation) or PIR
(for APICv advanced situation) and TMR for both cases could be accessed by more
than one vCPUS simultaneously. So operations on IRR or PIR should be atomical
and visible to other vCPUs immediately. In another case, vLAPIC could be accessed
by another vCPU when create vCPU or reset vCPU which could be supposed to be
consequently.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
In spite of vhm_req status could be updated in HV and DM on different CPUs, they
only change vhm_req status when they detect vhm_req status has been updated by
each other. So vhm_req status will not been misconfigured. However, before HV
sets vhm_req status to REQ_STATE_PENDING, vhm_req buffer filling should be visible
to DM. Add a write memory barrier to guarantee this.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Hypervisor exposes mitigation technique for Speculative
Store Bypass(SSB) to guests and allows a guest to determine
whether to enable SSBD mitigation by providing direct guest
access to IA32_SPEC_CTRL.
Before that, hypervisor should check the SSB mitigation support
on underlying processor, this patch is to add this capability check.
Tracked-On: #3385
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
cancel_event_injection is not need any more if we do 'scheudle' prior to
acrn_handle_pending_request. Commit "921288a6672: hv: fix interrupt
lost when do acrn_handle_pending_request twice" bring 'schedule'
forward, so remove cancel_event_injection related stuff.
Tracked-On: #3374
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
ACRN Coding guidelines requires no dead code.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Microarchitectural Data Sampling (MDS) is a hardware vulnerability
which allows unprivileged speculative access to data which is available
in various CPU internal buffers.
1. Mitigation on ACRN:
1) Microcode update is required.
2) Clear CPU internal buffers (store buffer, load buffer and
load port) if current CPU is affected by MDS, when VM entry
to avoid any information leakage to guest thru above buffers.
3) Mitigation is not needed if ARCH_CAP_MDS_NO bit (bit5)
is set in IA32_ARCH_CAPABILITIES MSR (10AH), in this case,
current processor is no affected by MDS vulnerability, in other
cases mitigation for MDS is required.
2. Methods to clear CPU buffers (microcode update is required):
1) L1D cache flush
2) VERW instruction
Either of above operations will trigger clearing all
CPU internal buffers if this CPU is affected by MDS.
Above mechnism is enumerated by:
CPUID.(EAX=7H, ECX=0):EDX[MD_CLEAR=10].
3. Mitigation details on ACRN:
if (processor is affected by MDS)
if (processor is not affected by L1TF OR
L1D flush is not launched on VM Entry)
execute VERW instruction when VM entry.
endif
endif
4. Referrence:
Deep Dive: Intel Analysis of Microarchitectural Data Sampling
https://software.intel.com/security-software-guidance/insights/
deep-dive-intel-analysis-microarchitectural-data-sampling
Deep Dive: CPUID Enumeration and Architectural MSRs
https://software.intel.com/security-software-guidance/insights/
deep-dive-cpuid-enumeration-and-architectural-msrs
Tracked-On: #3317
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>
ACRN hypervisor always print CPU microcode update
warning message on KBL NUC platform, even after
BIOS was updated to the latest.
'check_cpu_security_cap()' returns false if
no ARCH_CAPABILITIES MSR support on current platform,
but this MSR may not be available on some platforms.
This patch is to remove this pre-condition.
Tracked-On: #3317
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>
The current implement will trigger shutdown vm request on the BSP VCPU on the VM,
not the VCPU will trap out because triple fault. However, if the BSP VCPU on the VM
is handling another IO emulation, it may overwrite the triple fault IO request on
the vhm_request_buffer in function acrn_insert_request. The atomic operation of
get_vhm_req_state can't guarantee the vhm_request_buffer will not access by another
IO request if it is not running on the corresponding VCPU. So it should trigger
triple fault shutdown VM IO request on the VCPU which trap out because of triple
fault exception.
Besides, rt_vm_pm1a_io_write will do the right thing which we shouldn't do it in
triple_fault_shutdown_vm.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
According to SDM, bit N (physical address width) to bit 63 should be masked when calculate
host page frame number.
Currently, hypervisor doesn't set any of these bits, so gpa2hpa can work as expectd.
However, any of these bit set, gpa2hpa return wrong value.
Hypervisor never sets bit N to bit 51 (reserved bits), for simplicity, just mask bit 52 to bit 63.
Tracked-On: #3352
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Define/Use variable in place of code to improve readability:
Define new local variable struct pci_bar *vbar, and use vbar-> in place of vdev->bar[idx].
Define new local variable uint64_t vbar_base in init_vdev_pt
Rename uint64_t vbar[PCI_BAR_COUNT] of struct acrn_vm_pci_ptdev_config to uint64_t vbar_base[PCI_BAR_COUNT]
Tracked-On: #3241
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vcpu is never scan because of scan tool will be crashed!
After modulization, the vcpu can be scaned by the scan tool.
Clean up the violations in vcpu.c.
Fix the violations:
1.No brackets to then/else.
2.Function return value not checked.
3.Signed/unsigned coversion without cast.
V1->V2:
change the type of "vcpu->arch.irq_window_enabled" to bool.
V2->V3:
add "void *" prefix on the 1st parameter of memset.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The current implement will cache each ISR vector in ISR vector stack and do
ISR vector stack check when updating PPR. However, there is no need to do this
because:
1) We will not touch vlapic->isrvec_stk[0] except doing vlapic_reset:
So we don't need to do vlapic->isrvec_stk[0] check.
2) We only deliver higher priority interrupt from IRR to ISR:
So we don't need to check whether vlapic->isrvec_stk interrupts is always increasing.
3) There're only 15 different priority interrupt, It will not happened that more that
15 interrupts could been delivered to ISR:
So we don't need to check whether vlapic->isrvec_stk_top will larger than
ISRVEC_STK_SIZE which is 16.
This patch try to remove ISR vector stack and use isrv to cache the vector number for
the highest priority bit that is set in the ISR.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Remove couple of run-time ASSERTs in ioapic module by checking for the
number of interrupt pins per IO-APICs against the configured MAX_IOAPIC_LINES
in the initialization flow.
Also remove the need for two MACROs specifying the max. number of
interrupt lines per IO-APIC and add a config item MAX_IOAPIC_LINES for the
same.
Tracked-On: #3299
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The has_rt_vm walk through all VMs to check RT VM flag and if
there is no any RT VM, then return false otherwise return true.
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
The ept_flush_leaf_page API is used to flush address space
from a ept page entry, user can use it to match walk_ept_mr to
flush VM address space.
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The walk_ept_table API is used to walk through EPT table for getting
all of present pages, user can get each page entry and its size
from the walk_ept_table callback.
The get_ept_entry is used to getting EPT pointer of the vm, if current
context of mv is secure world, return secure world EPT pointer, otherwise
return normal world EPT pointer.
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
flush_address_space is used to flush address space by clflushopt instruction.
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
CLFLUSHOPT is used to invalidate from every level of the cache hierarchy
in the cache coherence domain the cache line that contains the linear
address specified with memory operand. If that cache line contains
modified date at any level of the cache hierarchy, that data is written
back to memory.
If the platform does not support CLFLUSHOPT instruction, boot will fail.
Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The current implement doesn't clear which access type we support for
APIC-Access VM Exit:
1) linear access for an instruction fetch
-- APIC-access page is mapped as UC which doesn't support fetch
2) linear access (read or write) during event delivery
-- Which is not happened in normal case except the guest went wrong, such as,
set the IDT table in APIC-access page. In this case, we don't need to support.
3) guest-physical access during event delivery;
guest-physical access for an instruction fetch or during instruction execution
-- Do we plan to support enable APIC in real mode ? I don't think so.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>