The physical address of the SW SRAM region may be different
on different platforms, so this hardcoded address may
result in an address mismatch for SW SRAM operations.
This patch removes the hardcoded address and uses
the physical address parsed from the native RTCT.
Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
'ptcm' and 'ptct' are legacy names according
to the latest TCC spec, hence rename the files below
to avoid confusion:
ptcm.c -> rtcm.c
ptcm.h -> rtcm.h
ptct.h -> rtct.h
Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
When "signal_event" is called, "wait_event" will actually not block.
So it is ok to remove this line.
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Now we use a hash table to maintain the INTx IRQ mapping, keyed on a
value generated from the sid. So once an entry has been added, we can
not update its source id any more; otherwise we could not locate the
entry with the key generated from the new source id.
For a source id change, remove_remapping/add_remapping is used
instead of updating the source id directly if the entry was already added.
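A minimal sketch of the resulting pattern (the function and field names
below are illustrative, not the exact ACRN API):

	/* Sketch: the hash key is derived from the source id, so an in-place
	 * update would leave the entry unreachable under the new key.
	 * Assumed helpers: remove_remapping()/add_remapping() as named above. */
	static void change_intx_source_id(struct acrn_vm *vm, uint32_t old_sid, uint32_t new_sid)
	{
		remove_remapping(vm, old_sid);	/* drop the entry keyed on the old sid */
		add_remapping(vm, new_sid);	/* re-insert it keyed on the new sid */
	}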
Tracked-On: #5640
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch moves the split-lock logic into a dedicated file
to reduce LOC and make the logic clearer.
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
This patch adds a cache register for VMX_PROC_VM_EXEC_CONTROLS
to avoid frequent VMCS accesses.
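A minimal sketch of how such a cached copy can reduce VMCS accesses
(structure and helper names are illustrative; exec_vmwrite32() is assumed
to be the usual VMCS write helper):

	/* Sketch: keep a shadow copy of VMX_PROC_VM_EXEC_CONTROLS and only
	 * touch the VMCS when the value actually changes. */
	struct ctl_cache {
		uint32_t proc_vm_exec_ctrls;	/* cached VMX_PROC_VM_EXEC_CONTROLS value */
	};

	static inline uint32_t read_proc_ctrls(const struct ctl_cache *c)
	{
		return c->proc_vm_exec_ctrls;	/* served from the cache, no vmread */
	}

	static inline void write_proc_ctrls(struct ctl_cache *c, uint32_t val)
	{
		if (c->proc_vm_exec_ctrls != val) {
			c->proc_vm_exec_ctrls = val;
			exec_vmwrite32(VMX_PROC_VM_EXEC_CONTROLS, val);	/* write through on change only */
		}
	}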
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
The TF is visible to the guest and may be modified by
the guest, so it is not a safe way to emulate the
split-lock. MTF, on the other hand, is specifically designed for
single-stepping in Intel VT-x hardware virtualization
and is invisible to the guest. Use MTF
to single-step the vCPU during the emulation of a split lock.
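A minimal sketch of toggling MTF, which is bit 27 of the primary
processor-based VM-execution controls (SDM Vol. 3); the accessor names are
the assumed VMCS helpers:

	#define VMX_PROCBASED_CTLS_MTF	(1U << 27U)	/* Monitor Trap Flag */

	/* Sketch: arm MTF so the guest executes exactly one instruction and
	 * then traps back to the hypervisor, without touching the
	 * guest-visible RFLAGS.TF. */
	static void vcpu_arm_mtf(void)
	{
		uint32_t ctrls = exec_vmread32(VMX_PROC_VM_EXEC_CONTROLS);

		exec_vmwrite32(VMX_PROC_VM_EXEC_CONTROLS, ctrls | VMX_PROCBASED_CTLS_MTF);
	}

	static void vcpu_disarm_mtf(void)
	{
		uint32_t ctrls = exec_vmread32(VMX_PROC_VM_EXEC_CONTROLS);

		exec_vmwrite32(VMX_PROC_VM_EXEC_CONTROLS, ctrls & ~VMX_PROCBASED_CTLS_MTF);
	}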
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
For an SMP guest, split-lock checks may happen on
multiple vCPUs simultaneously. In this case, at most
one vCPU can be allowed to run in the
split-lock emulation window. And while a vCPU is
doing the emulation, it should never be blocked
in the hypervisor; it should go back to the guest
to execute the lock instruction immediately and
trap back to the hypervisor with #DB to complete the
split-lock emulation.
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
We have trapped the #DB for split-lock emulation.
Only fault exceptions need the RIP to be retained.
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
xchg may also cause #AC for the split-lock check, since xchg with a
memory operand asserts LOCK semantics implicitly even without a LOCK
prefix. This patch adds this emulation:
1. Kick the other vCPUs of the guest to stop execution
if the guest has more than one vCPU.
2. Emulate the xchg instruction.
3. Notify the other vCPUs (if any) to restart execution.
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch adds the split-lock emulation.
If a #AC is caused by an instruction with a LOCK prefix,
emulate it; otherwise, inject it back as before.
1. Kick the other vCPUs of the guest to stop execution,
and set the TF flag to get a #DB, if the guest has more
than one vCPU.
2. Skip over the LOCK prefix and resume the current
vCPU back to the guest for execution.
3. Notify the other vCPUs to restart execution at the end
of handling the #DB, since the emulation of the
LOCK-prefixed instruction has completed.
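A rough sketch of that flow, assuming illustrative helper names rather
than the exact ACRN ones:

	/* Sketch of the #AC handling path for a LOCK-prefixed instruction. */
	static void emulate_splitlock(struct acrn_vcpu *vcpu, bool *inject_back)
	{
		if (!insn_has_lock_prefix(vcpu)) {
			*inject_back = true;		/* not a split lock: deliver #AC as before */
			return;
		}

		if (vcpu_count(vcpu->vm) > 1U) {
			pause_other_vcpus(vcpu->vm);	/* step 1: stop the sibling vCPUs */
		}
		set_guest_rflags_tf(vcpu);		/* step 1: #DB after the instruction retires */
		skip_lock_prefix(vcpu);			/* step 2: resume at the instruction body */
		*inject_back = false;
		/* Step 3 runs in the #DB handler: clear TF and resume the other vCPUs. */
	}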
Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Check hardware support for all features in CR4,
and hide bits from the guest via vcpuid if they are not supported
for guest OSes.
Tracked-On: #5586
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
- The current code to virtualize CR0/CR4 is not
well designed and is hard to read.
This patch reshuffles the logic to make it clear
and classifies the bits into PASSTHRU,
TRAP_AND_PASSTHRU, TRAP_AND_EMULATE and reserved bits.
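A minimal sketch of the classification idea for CR4 (the bit groupings
below are illustrative examples, not the exact ACRN policy):

	#include <stdint.h>

	/* Architectural CR4 bit positions (SDM Vol. 3). */
	#define CR4_PSE		(1UL << 4U)
	#define CR4_PAE		(1UL << 5U)
	#define CR4_MCE		(1UL << 6U)
	#define CR4_OSFXSR	(1UL << 9U)
	#define CR4_VMXE	(1UL << 13U)

	/* Sketch: each guest-written bit falls into exactly one class. */
	#define CR4_PASSTHRU_BITS	(CR4_OSFXSR)		/* written straight to hardware */
	#define CR4_TRAP_PASSTHRU_BITS	(CR4_PSE | CR4_PAE)	/* trapped, validated, then written through */
	#define CR4_TRAP_EMULATE_BITS	(CR4_VMXE | CR4_MCE)	/* trapped and fully emulated */
	#define CR4_RESERVED_BITS	(~(CR4_PASSTHRU_BITS | \
					   CR4_TRAP_PASSTHRU_BITS | \
					   CR4_TRAP_EMULATE_BITS))	/* guest writes inject #GP */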
Tracked-On: #5586
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
While the following two styles are both correct, the 2nd one is simpler.
bool is_level_triggered;
1. if (is_level_triggered == true) {...}
2. if (is_level_triggered) {...}
This patch cleans up the style in hypervisor.
Tracked-On: #861
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
From the SDM Vol.2C XSETBV instruction description:
if CR4.OSXSAVE[bit 18] = 0,
executing the XSETBV instruction generates a #UD exception.
From SDM Vol.3C 25.1.1, the #UD exception has priority over VM exits,
so if a vCPU executes XSETBV while CR4.OSXSAVE[bit 18] = 0,
no VM exit happens.
However, the hypervisor injects #GP when a vCPU executes XSETBV
while CR4.OSXSAVE[bit 18] = 0.
That is wrong behavior; this patch fixes the bug.
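One possible shape of the fix, as a sketch with assumed helper names
(vcpu_get_cr4()/vcpu_inject_ud() stand in for the real accessors):

	#define CR4_OSXSAVE	(1UL << 18U)

	/* Sketch: align the XSETBV exit handler with the SDM exception priority. */
	static int32_t xsetbv_vmexit_handler(struct acrn_vcpu *vcpu)
	{
		if ((vcpu_get_cr4(vcpu) & CR4_OSXSAVE) == 0UL) {
			/* With CR4.OSXSAVE = 0, XSETBV raises #UD, which has priority
			 * over the VM exit, so #UD (not #GP) is the correct injection. */
			vcpu_inject_ud(vcpu);
			return 0;
		}
		/* ... normal XSETBV emulation of XCR0 ... */
		return 0;
	}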
Tracked-On: #4020
Signed-off-by: Junming Liu <junming.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
It is possible for more than one vCPU to trigger shutdown on an RTVM.
We need to avoid entering VM_READY_TO_POWEROFF state again after the
RTVM has been paused or shut down.
Also, make sure an RTVM enters VM_READY_TO_POWEROFF state before it can
be paused.
v1 -> v2:
- rename to poweroff_if_rt_vm for better clarity
Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
Currently, ACRN only supports shutdown when a triple fault happens, because
ACRN doesn't present/emulate virtual hardware, i.e. port I/O, to support
shutdown. This patch emulates a virtual shutdown component, and the vACPI
method for the guest OS to use.
Pre-launched VMs use ACPI reduced-HW mode; intercept the virtual sleep
control/status registers for pre-launched VM shutdown.
Tracked-On: #5411
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Like post-launched VMs, for pre-launched VMs, the ACPI reset register
is also fixed at 0xcf9 and the reset value is 0xE, so pre-launched VMs
now also use ACPI reset register for rebooting.
Tracked-On: #5411
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
More than one VM may request shutdown on the same pCPU before
shutdown_vm_from_idle() is called in the idle thread when pCPUs are
shared among VMs.
Use a per-pCPU bitmap to store all the VMIDs requesting shutdown.
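A minimal sketch of the bookkeeping, assuming the hypervisor's locked
bitmap helpers and otherwise illustrative names:

	/* Sketch: one bitmap per pCPU, one bit per VM id. */
	static uint64_t per_cpu_shutdown_vms[MAX_PCPU_NUM];

	static void make_shutdown_vm_request(uint16_t vm_id, uint16_t pcpu_id)
	{
		bitmap_set_lock(vm_id, &per_cpu_shutdown_vms[pcpu_id]);	/* atomic set */
	}

	static void handle_shutdown_requests(uint16_t pcpu_id)
	{
		uint16_t vm_id;

		for (vm_id = 0U; vm_id < CONFIG_MAX_VM_NUM; vm_id++) {
			if (bitmap_test_and_clear_lock(vm_id, &per_cpu_shutdown_vms[pcpu_id])) {
				/* take the VM's vm_lock, then pause/poweroff that VM */
			}
		}
	}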
v1 -> v2:
- use vm_lock to avoid a race on shutdown
Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add two pSRAM Kconfig options:
one controls whether the pSRAM is enabled on the platform;
the other controls whether, if the pSRAM is enabled on the platform, the
pSRAM is also enabled in the pre-launched RTVM.
If we enable the pSRAM on the platform, we should remove the pSRAM EPT
mapping from the SOS to prevent it from flushing the pSRAM cache.
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
1. Modified the virtual e820 table for pre-launched VMs: added a
segment for the pSRAM, so lowmem RAM is split into two parts.
Logic was added to deal with the split.
2. Added the EPT mapping of the pSRAM segment for the pre-launched
RTVM if it uses the pSRAM.
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
pSRAM memory should be cacheable. However, it is neither RAM nor normal
MMIO, so we can't use an existing API to do the EPT mapping and set the EPT
cache attribute to WB for it. For now we assume that the SOS must assign the
pSRAM area as a whole, as a separate memory region whose base address is
PSRAM_BASE_HPA. If the HPA of an EPT mapping region equals PSRAM_BASE_HPA,
we treat the mapping as pSRAM and change its EPT cache attribute to WB.
Also fix a minor bug in the SOS wbinvd emulation trap when pSRAM is enabled.
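A rough sketch of the WB override inside the EPT mapping path (macro and
function names here are illustrative; only PSRAM_BASE_HPA comes from the
description above):

	/* Sketch: when the HPA being mapped is the pSRAM base, force the EPT
	 * memory type to write-back instead of the default chosen for MMIO. */
	static uint64_t fixup_psram_prot(uint64_t hpa, uint64_t prot)
	{
		if (hpa == PSRAM_BASE_HPA) {
			prot &= ~EPT_MT_MASK;	/* clear the uncacheable memory type */
			prot |= EPT_WB;		/* pSRAM must be cacheable (WB) */
		}
		return prot;
	}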
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Use ept_flush_leaf_page to emulate guest WBINVD when PTCM is enabled, and
skip the pSRAM in ept_flush_leaf_page.
TODO: do we need to emulate WBINVD on the HV side?
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
According to Section 11.5.1 "Cache Control Registers and Bits", Intel SDM
Vol. 3, changing CR0.CD does not flush the cache to ensure memory coherency,
so there is no need to call wbinvd to flush the cache in the ACRN hypervisor.
That is what the guest should do.
Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Support the pci-vuart type, and refine:
1. Rename init_vuart() to init_legacy_vuarts(); it only inits the PIO type.
2. Rename deinit_vuart() to deinit_legacy_vuarts(); it only deinits the PIO type.
3. Move the I/O handler code out of setup_vuart() into init_legacy_vuarts().
4. Add init_pci_vuart()/deinit_pci_vuart() for one PCI vuart vdev.
And one change from the requirements:
1. Increase MAX_VUART_NUM_PER_VM to 8.
Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
This function can be used by other modules, not only for hypercall
handling, hence move it to vlapic.c
Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now ACRN supports direct boot mode, which could be SBL/ABL, or GRUB boot.
Thus the vboot wrapper layer can be removed and the direct boot functions
don't need to be wrapped in direct_boot.c:
- remove call to init_vboot(), and call e820_alloc_memory() directly at the
time when the trampoline buffer is actually needed.
- Similarly, call CPU_IRQ_ENABLE() instead of the wrapper init_vboot_irq().
- remove get_ap_trampoline_buf(), since the existing function
get_trampoline_start16_paddr() returns the exact same value.
- merge init_general_vm_boot_info() into init_vm_boot_info().
- remove vm_sw_loader pointer, and call direct_boot_sw_loader() directly.
- move get_rsdp_ptr() from vboot_wrapper.c to multiboot.c, and remove the
wrapper over two boot modes.
Tracked-On: #5197
Signed-off-by: Zide Chen <zide.chen@intel.com>
The commit da81a0041d ("HV: add e820 ACPI entry for pre-launched VM")
introduced an issue: base_hpa and remaining_hpa_size are also calculated on
the entry of the 32-bit PCI hole, which spans 0x80000000 to 0xffffffff; this
is incorrect.
Tracked-On: #5266
Signed-off-by: Victor Sun <victor.sun@intel.com>
Per the PCI Firmware Specification Revision 3.0, 4.1.2 "MCFG Table Description":
the Memory Mapped Enhanced Configuration Space Base Address Allocation Structure
assigns the Start Bus Number and the End Bus Number that can be decoded by the
Host Bridge. We should not access a PCI device whose bus number is outside
the range [Start Bus Number, End Bus Number).
For ACRN, we should:
1. Not detect PCI devices whose bus number is outside the range
[Start Bus Number, End Bus Number) of the MCFG ACPI table.
2. Only trap the ECAM MMIO range [MMCFG_BASE_ADDRESS, MMCFG_BASE_ADDRESS +
(End Bus Number - Start Bus Number + 1) * 0x100000) for the SOS.
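Each bus decodes 1 MB of ECAM space (32 devices x 8 functions x 4 KB per
function), so the trapped window size works out as in this small sketch:

	#include <stdint.h>

	/* Sketch: size of the ECAM window decoded by [start_bus, end_bus]. */
	static inline uint64_t mmcfg_window_size(uint8_t start_bus, uint8_t end_bus)
	{
		return ((uint64_t)end_bus - (uint64_t)start_bus + 1UL) * 0x100000UL;
	}
	/* e.g. Start Bus 0x00, End Bus 0xFF -> 0x10000000 (256 MB) at MMCFG_BASE_ADDRESS */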
Tracked-On: #5233
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Previously we used a pre-defined structure, initialized by HV code, as the
vACPI table for pre-launched VMs. Now change the method to use a pre-loaded
multiboot module instead. The module file will be generated by the
acrn-config tool and loaded to GPA 0x7ff00000; a hardcoded RSDP table at
GPA 0x000f2400 will point to the XSDT table, which is at GPA 0x7ff00080.
Tracked-On: #5266
Signed-off-by: Victor Sun <victor.sun@intel.com>
Signed-off-by: Shuang Zheng <shuang.zheng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Previously the ACPI table was stored in the F segment, which might not be
big enough for a customized ACPI table, hence reserve 1MB of space in the
pre-launched VM e820 table to store the ACPI-related data:
0x7ff00000 ~ 0x7ffeffff : ACPI Reclaim memory
0x7fff0000 ~ 0x7fffffff : ACPI NVS memory
Tracked-On: #5266
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
When the HV passes through the P2SB MMIO device to a pre-launched VM, the
vgpio device model traps MMIO accesses to the GPIO registers within P2SB so
that it can expose virtual IOAPIC pins to the VM in accordance with
the programmed mappings between gsi and vgsi.
Tracked-On: #5246
Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add the capability of forwarding specified physical IOAPIC interrupt
lines to pre-launched VMs as virtual IOAPIC interrupts. This is for the
sake of certain MMIO pass-thru devices on the EHL CRB that support
only INTx interrupts.
Tracked-On: #5245
Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add two hypercalls, HC_CREATE_VDEV and HC_DESTROY_VDEV, that are used to
create and destroy an emulated device (PCI device or legacy device) in the
hypervisor.
v3: 1) change HC_CREATE_DEVICE and HC_DESTROY_DEVICE to HC_CREATE_VDEV
and HC_DESTROY_VDEV
2) refine code style
v4: 1) remove an unnecessary parameter
2) add a VM state check for the HC_CREATE_VDEV and HC_DESTROY_VDEV hypercalls
Tracked-On: #4853
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
-- use an array to quickly locate the hypercall handler,
replacing the switch-case.
-- make every hypercall handler uniform, as below:
int32_t (*handler)(sos_vm, target_vm, param1, param2)
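A sketch of the table-driven dispatch with the uniform handler signature
(entry names and the lookup loop are illustrative; the real table may be
indexed directly by hypercall id):

	/* Sketch: one table entry per hypercall, all handlers share one signature. */
	struct hc_dispatch {
		uint64_t hcall_id;
		int32_t (*handler)(struct acrn_vm *sos_vm, struct acrn_vm *target_vm,
				   uint64_t param1, uint64_t param2);
	};

	static const struct hc_dispatch hc_dispatch_table[] = {
		{ HC_CREATE_VM,  hcall_create_vm  },
		{ HC_DESTROY_VM, hcall_destroy_vm },
		/* ... one entry per supported hypercall ... */
	};

	static int32_t dispatch_hypercall(uint64_t hcall_id, struct acrn_vm *sos_vm,
					  struct acrn_vm *target_vm, uint64_t p1, uint64_t p2)
	{
		uint32_t i;

		for (i = 0U; i < ARRAY_SIZE(hc_dispatch_table); i++) {
			if (hc_dispatch_table[i].hcall_id == hcall_id) {
				return hc_dispatch_table[i].handler(sos_vm, target_vm, p1, p2);
			}
		}
		return -EINVAL;	/* unknown hypercall */
	}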
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Fix the bug in the "is_apl_platform" function.
"monitor_cap_buggy" is identical to "is_apl_platform", so remove it.
On the APL platform:
1) ACRN doesn't use monitor/mwait instructions
2) ACRN disables the GPU IOMMU
Tracked-On: #3675
Signed-off-by: Junming Liu <junming.liu@intel.com>
-- move vm_state_lock to another place in the vm structure
to avoid wasting memory due to page alignment.
-- remove the memset from create_vm.
-- explicitly set max_emul_mmio_regions and vcpuid_entry_nr to 0
inside create_vm to avoid use without initialization.
-- rename max_emul_mmio_regions to nr_emul_mmio_regions
v1->v2:
add deinit_emul_io in shutdown_vm
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Previously the CPU affinity of the SOS VM was initialized at runtime during
the sanitize_vm_config() stage, following the policy that all physical CPUs
not occupied by pre-launched VMs belong to the SOS_VM. Now change the
process so that the SOS CPU affinity is initialized at build time, with the
assumption that its validity is guaranteed before runtime.
Tracked-On: #5077
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
To hide the CET feature from guest VMs completely, the MSR IA32_MSR_XSS also
needs to be intercepted because it comprises the CET_U and CET_S feature bits
of xsaves/xrstors operations. Mask these two bits on IA32_MSR_XSS writes.
With IA32_MSR_XSS interception, the member 'xss' of 'struct ext_context' can
be removed because it duplicates the MSR store array
'vcpu->arch.guest_msrs[]'.
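A minimal sketch of the write path (CET_U and CET_S are XSAVE state
components, bits 11 and 12 of IA32_XSS; the helper name is illustrative):

	#define XSS_CET_U	(1UL << 11U)	/* user-mode CET state */
	#define XSS_CET_S	(1UL << 12U)	/* supervisor-mode CET state */

	/* Sketch: strip the CET state bits from guest IA32_XSS writes since
	 * CET is hidden from the guest. */
	static void write_guest_xss(struct acrn_vcpu *vcpu, uint64_t value)
	{
		value &= ~(XSS_CET_U | XSS_CET_S);
		vcpu_set_guest_msr(vcpu, MSR_IA32_XSS, value);	/* lands in vcpu->arch.guest_msrs[] */
	}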
Tracked-On: #5074
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Return-oriented programming (ROP), and similarly CALL/JMP-oriented
programming (COP/JOP), have been the prevalent attack methodologies for
stealth exploit writers targeting vulnerabilities in programs.
CET (Control-flow Enforcement Technology) provides the following
capabilities to defend against ROP/COP/JOP style control-flow subversion
attacks:
* Shadow stack: Return address protection to defend against ROP.
* Indirect branch tracking: Free branch protection to defend against
COP/JOP
The full support of CET for the Linux kernel has not been merged yet. As the
first stage, hide CET from guest VMs.
Tracked-On: #5074
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Add MMIO device pass-through support for pre-launched VMs.
When we pass through an MMIO device to a pre-launched VM, we remove its
resources from the SOS. For now these resources only include the MMIO regions.
Tracked-On: #5053
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Add two hypercalls to support MMIO device pass-through for post-launched VMs.
When we later support MMIO pass-through for pre-launched VMs, the code in
mmio_dev.c can be re-used.
Tracked-On: #5053
Signed-off-by: Li Fei1 <fei1.li@intel.com>
During a context switch in the hypervisor, xsave/xrstor are used to
save/restore the XSAVE area according to XCR0 and XSS. The legacy
region in the XSAVE area includes the FPU and SSE state, and we should
make sure the legacy region is saved during the context switch. FPU in
XCR0 is always enabled according to the SDM.
For SSE, we enable it in XCR0 during the context switch.
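A rough sketch of the save side under these assumptions (write_xcr(),
read_xcr() and xsaves() stand in for the real low-level helpers, and
ectx->xs_area for the saved XSAVE area):

	#define XCR0_FPU	(1UL << 0U)	/* x87 state, architecturally always set */
	#define XCR0_SSE	(1UL << 1U)	/* SSE state */

	/* Sketch: make sure the legacy region (x87 + SSE) is included when the
	 * XSAVE area is saved on a hypervisor context switch. */
	static void save_xsave_area(struct ext_context *ectx)
	{
		write_xcr(0, read_xcr(0) | XCR0_FPU | XCR0_SSE);
		xsaves(&ectx->xs_area, UINT64_MAX);	/* saves components enabled in XCR0 | IA32_XSS */
	}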
Tracked-On: #5062
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>