add pgtable callbacks set_pgentry to implement arch specific
set generic page table entry for any paging level, and remove
x86 specific tweak_exe_right/recover_exe_right callbacks, move
the logic in set_pgentry callback.
remove common set_pgentry function to avoid confusing.
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
move early_pgtable_map_uart and pgtable_create_trusty_root
to x86 code, and provide calling with x86 private header
pagemisc.h
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
this patch moves function xx_offset and xx_index to common code,
Add arch interface arch_quirk/arch_pgtle_page_vaddr and
arch_pgtle_large.
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
move _page_table_level to common, and rename functions and
variables to comform with pgtln style
when we refer to pgtl0e, it means the lowest translation
table entry, while the "pte" refers generic page table entry.
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
add x86 mm_common.h to map common macro name to x86 name
and chang them in common/mmu.c, replace XX_PFN_MASK with
PFN_MASK, since they are the same.
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
move struct pgtable and page_pool to common code and
move alloc_page/free_page/init_page_pool to common
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
move some funcitons like hpa2hva to common file.
change some files to include file from asm/pgtable.h to common/pgtable.h
Tracked-On: #8831
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Only x86 has local APIC pass-thru. For release mode, the console_vmexit_callback
is empty, complier should optimize this unuseful check.
Tracked-On: #8805
Signed-off-by: Fei Li <fei1.li@intel.com>
get_vm_from_vmid,is_paused_vm and is_poweroff_vm should be common APIs.
But now doesn't implement them as common for not introduce more VM related
data structure and APIs.
Tracked-On: #8805
Signed-off-by: Fei Li <fei1.li@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
MMIO read/write without memory order should be common ARCH, without
PIO support shouldn't use PIO APIs, so implement them as empty.
Tracked-On: #8807
Signed-off-by: Fei Li <fei1.li@intel.com>
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Reviewed-by: Yifan Liu <yifan1.liu@intel.com>
stack_frame is not only for vcpu thread, host thread needs
it, so move stack_frame out of vcpu file.
Tracked-On: #8812
Signed-off-by: Xue Bosheng <bosheng.xue@intel.com>
Reviewed-by: Yifan Liu <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
delivery mode and idle mode are x86 specific percpu, so move it from common to
x86 arch, also change the name of mode_to_idle to be idle_mode, change the name
of mode_to_kick_pcpu to be kick_pcpu_mode.
Tracked-On: #8812
Signed-off-by: Xue Bosheng <bosheng.xue@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Convert IRQ-related macros to static inline functions and introduce
wrappers for arch-specific implementations. This follows the style we
defined for multi-arch development.
This is a follow-up update for commit
a7239d126 ("[FIXME] hv: risc-v add denpended implementation in cpu.h").
CPU_IRQ_ENABLE_ON_CONFIG -> local_irq_enable
CPU_IRQ_DISABLE_ON_CONFIG -> local_irq_disable
CPU_INT_ALL_DISABLE -> local_irq_save
CPU_INT_ALL_RESTORE -> local_irq_restore
Tracked-On: #8813
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Yifan Liu <yifan1.liu@intel.com>
Extract common interface to include/lib/bits.h, and invoke the variant
implementation of arch.
Re-implement unlocked functions as C in common library.
Rename bitmap*lock() to bitmap*(), bitmap*nolock() to bitmap*non_atomic().
Tracked-On: #8803
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Reviewed-by: Yifan Liu <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Extract common interface to include/lib/spinlock.h, and invoke the
variant implementation of arch.
Refine assemble macro code in case that ASSEMBLER defined.
Tracked-On: #8803
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Reviewed-by: Yifan Liu <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Move x86 architecture dependent per cpu data into a
seperate structure and embeded it in per_cpu_region.
caller could access architecture dependent member by
using prefix 'arch.'.
v2->v3:
move whose_iwkey, profiling_info and tsc_suspend to x86
v1->v2:
rebased on latest repo
Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Reviewed-by: Chen, Jian Jun<jian.jun.chen@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Since there is no common IPI abstraction, the arch_ prefix is redundant.
This patch renames the functions as follows:
- arch_send_dest_ipi_mask -> send_dest_ipi_mask
- arch_send_single_ipi -> send_single_ipi
Tracked-On: #8799
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Rename send_single_ipi() and send_dest_ipi_mask() to
arch_send_single_ipi() and arch_send_dest_ipi_mask() in x86, to make the
naming consistent with the RISC-V implementation and reflect that these
functions are arch-specific.
Tracked-On: #8786
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
Mark hypervisor memory region as unusable in its e820 table to avoid
being overlapped by e820_alloc_memory(). As it is already filtered out
in hypervisor e820 table, there is no longer need to filter it out in
service VM e820.
Tracked-On: #8738
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
The hypervisor image size is determined at link time, but now it is
calculated and stored in a global variable during mmu initialization,
and the helper function reads from that variable. Change to calculate
it inside helper function to avoid inconsistency.
Tracked-On: #8738
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
The VM-exit instruction length(VMX_EXIT_INSTR_LEN) in VMCS is undefined
on EPT violation, except during delivery of a software interrupt,
privileged software exception, or software exception[1]. Although CPU
is likely to set the field, it can be incorrect in certain cases, such
as cmp+jcc and test+jcc.
Since hypervisor does not know exactly how much bytes needed, and GVA
translation is costly, it first copies at most 15 (VIE_INST_SIZE) bytes
within the page, then decodes the instruction. If more bytes are needed
during decoding and copied length is less than 15, it copies remaining
bytes.
[1] 29.2.5, https://cdrdv2-public.intel.com/671200/325462-sdm-vol-1-2abcd-3abcd.pdf
Tracked-On: #8756
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
The Access Size field in ACPI GAS was not introduced before ACPI 2.0,
Errata C. It is not guaranteed to be a non zero value, like QEMU
programs it to 0. As it only indicates how many bytes it can be
accessed at once, the register size should be determined by Bit Width
and Bit Offset. In IO space, Bit Offset is always 0, the size is
(Bit Width / 8).
Tracked-On: #8771
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>
Reviewed-by: Li Fei <fei1.li@intel.com>
In current code process, hyperv data in struct vm_arch is never cleared
during VM shutdown and is retained to next VM launch. As the enabled
bit of hypercall_page msr is not clear, hypercall page might cause fatal
error such as Windows VM BSOD during VM restart and memory
remapping. Hyperv page destory function can ensure hyperv page is
destory during each VM shutdown so hyperv related config such as
hypercall page is established correctly during each VM launch.
Tracked-On: #8755
Signed-off-by: Yichong Tang <yichong.tang@intel.com>
Add reset_control in acrn_vm. Use this reset_control to simulate
RESET_CONTROL(0xCF9) register in hypervisor.
Tracked-On: #8724
Signed-off-by: Yuan Lu <yuan.y.lu@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Service VM may write 0x6 to port 0xcf9 to trigger a warm reset, but
current hypervisor always performs a cold reset by writing 0xE to CF9.
Hypervisor should reboot the system in the same mode as Service VM
specified. Specific OS features (like linux pstore) requires warm
reset to keep data across reboot.
The behavior of hv console's reboot command (cold reset) remains
unchanged.
Tracked-On: #8539
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Per SDM, VPDPBUSD/VPDPBUSDS/VPDPWSSD/VPDPWSSDS instructions depend on
CPUID Feature Flag 'AVX-VNNI, AVX512_VNNI, AVX512VL'. 'AVX512_VNNI' and
'AVX512VL' are already exposed to any VM.
'AVX-VNNI' is in CPUID.(EAX=07H,ECX=1):EAX.AVX-VNNI[bit 4]. This patch
is to expose all the CPUID.EAX=07H subleaf features to VMs.
Mask corresponding bits if want to disable some features in the future.
Tracked-On: #8710
Reviewed-by: Fei Li <fei1.li@intel.com>
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
Remove unreachable code branch in line 163:
if CR0 enabled WP, supervisor-mode writing a read-only page have
been checked in line 109.
Merge redundant checking:
if smap is enabled, supervisor-mode can't access user-mode address
when eflags.ac disabled.
Tracked-On: #8708
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Some hypercalls return -ENODEV which should be set into RAX as return
value, e.g. HC_ASSIGN_PCIDEV. So, remove the check in
vmcall_vmexit_handler() and change return value to -EACCESS if the
hypercall is not sent from Service VM or allowed VM.
Tracked-On: #8598
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Guest VM, such as Linux, may read RESET_CONTROL(0xCF9) register
before writing to, in this case, ACRN should not always return
dummy value.
Tracked-On: #8688
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
GUEST_FLAG_STATELESS indicates guest is running a stateless operating
system and need to be shutdown forcefully without data loss. This flag
is only appalicable to pre-launched VM. For TEE_VM, this flag will be
set implicitly.
Tracked-On: #8671
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Now multiboot modules memory is already reserved from e820 in function
`alloc_mods_memory()` and Service VM will not corrupt pre-launched VM
modules.
So remove the code of Service VM delayed loading.
Tracked-On: #8652
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
1. Enable Service VM to power off or restart the whole platform even when RTVM is running.
2. Allow Service VM stop the RTVM using acrnctl tool with option "stop -f".
3. Add 'Service VM supervisor role enabled' option in ACRN configurator
Tracked-On: #8618
Signed-off-by: YuanXin-Intel <xin.yuan@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
Now only BSP is reset. After Service VM OS resumes from s3, APs'
apic_base_msr are incorrect with x2apic bit en.
To avoid incorrect states, do `reset_vm` after resume.
Tracked-On: #8623
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
Some cpuids will return invalid values on hybrid platform because of the
error in the pointer arithmetic. Add `(void *)` before
`cpu_cpuids.leaves`.
Leaf 0x14 is used to report Intel Processor Trace Enumeration and varies
between P-cores and E-cores on hybrid platform. So add it to
`hybrid_leaves`.
Tracked-On: #8608
Fixes: 59a8cc4c2 ("hv: cpuid: make leaf 0x4 per-cpu in hybrid architecture")
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
In HV, cpuid uses the lower 32 bits of rax\rbx\rcx\rdx registers to pass parameters,
But the software does not clear the upper 32-bit registers, if the guest
uses 64-bit variables to pass parameters to cpuid,guest will use rax\rbx\rcx\rdx,
not eax\ebx\ecx\edx, the previous value of the high 32 registers will affect the guest.
Tracked-On: #8605
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: andi6 <andi6@xiaomi.com>
P-cores and E-cores accessing leaf 0x2U/0x14U/0x16U/0x18U/0x1A/0x1C/0x80000006U
will have different information in hybrid architecture.
So add them to per-cpu list in hybrid architecture and directly return
the physical value.
Note: 0x14U is hided and return 0.
Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
Leaf 0x6 returns thermal and power management information. In
hybrid architecture, P-cores and E-cores have different information.
Add leaf 0x6 to per-cpu list in hybrid architecture and handle specific
cpuid access.
Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
Leaf 0x4 returns deterministic cache parameters for each level. In
hybrid architecture, P-cores and E-cores have different cache
information.
Add leaf 0x4 to per-cpu list in hybrid architecture and handle specific
cpuid access.
Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
CPUID returns processor identification and feature information.
Different pcpus may return different infos. That is, the info is
per-cpu.
In hybrid architecture, per-cpu leaf is different from the previous. So
introduce a struct percpu_cpuids to indicate the per-cpu leaf. struct
percpu_cpuids will consist of two parts: generic percpu leaves and
hybrid related percpu leaves.
This patch is just to add generic percpu leaves.
Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
CPUID leaf 1f is preferred superset of leaf 0b, currently ACRN exposes
leaf 0b but leaf 1f is empty so the 2 leaves mismatch, and so
application will follow the SDM to check 1f first.
Tracked-On: #8608
Signed-off-by: Xin Zhang <xin.x.zhang@intel.com>
This patch can fetch the thermal lvt irq and propagate
it to VM.
At this stage we support the case that there is only one VM
governing thermal. And we pass the hardware thermal irq to this VM.
First, we register the handler for thermal lvt interrupt, its irq
vector is THERMAL_VECTOR and the handler is thermal_irq_handler().
Then, when a thermal irq occurs, it flags the SOFTIRQ_THERMAL bit
of softirq_pending, This bit triggers the thermal_softirq() function.
And this function will inject the virtual thermal irq to VM.
Tracked-On: #8595
Signed-off-by: Zhangwei6 <wei6.zhang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
In this phase, we only use one VM to control thermal.
So we make thermal MSRs readable and writable by this VM.
This VM is flagged with GUEST_FLAG_VTM, and can
read/write thermal MSRs.
For the VMs not flagged with GUEST_FLAG_VTM,
can only read these thermal MSRs to get current status.
Tracked-On: #8595
Signed-off-by: Zhangwei6 <wei6.zhang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
In the triple fault handler, post-launched VMs are instantly turned
off. Now a vm event is generated simultaneously. So that
developers can capture the event and decide what to do with it. (e.g.,
logging and populating diagnostics, or poweroff VM)
Tracked-On: #8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
This patch creates vm_event support in HV, including:
1. Create vm_event data type.
2. Add vm_event sbuf and its initializer. The sbuf will be allocated by
DM in Service VM. Its page address will then be share to HV through
hypercall.
3. Add an API to send the HV generated event.
Tracked-On: #8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Abstract out schedulers config data for vCPU threads and other hypervisor
threads to sched_params structure. And it's used to initialize per
thread scheduler private data. The sched_params for vCPU threads come
from vm_config generated by config tools while other hypervisor threads
need give them explicitly.
Tracked-On: #8500
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>