Commit Graph

1413 Commits

Author SHA1 Message Date
hangliu1
a72bd5e076 hv: multiarch: abstract pcpu related data from x86
Move phys_cpu_num and pcpu_active_bitmap to common, which could be
only accessed by interfaces provided by smp.h.

v2->v3:
1. move ALL_CPUS_MASK/AP_MASK to common cpu.h

v1->v2:
1. preserve phys_cpu_num in x86 but implement arch_get_num_available_cpus()
   to provide interface for common code to access.
2. change function name test_xx to check_xx

Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-19 15:04:55 +08:00
hangliu1
0cd5567140 hv:multiarch:riscv: add dummy function
add dummy function bitmap_set_lock, need to be replaced
with real implementation.

Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-19 15:04:55 +08:00
hangliu1
41f04b4e7a HV: abstract percpu structure from x86 arch
Move x86 architecture dependent per cpu data into a
seperate structure and embeded it in per_cpu_region.
caller could access architecture dependent member by
using prefix 'arch.'.

v2->v3:
move whose_iwkey, profiling_info and tsc_suspend to x86
v1->v2:
rebased on latest repo

Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Reviewed-by: Chen, Jian Jun<jian.jun.chen@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-19 15:04:55 +08:00
hangliu1
286c383b53 hv: multiarch: riscv: add dummy header for riscv
add dummy headrer file for early compile, should be replaced totally later.

Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-19 15:04:55 +08:00
hangliu1
b5306f2a74 enable common code compile
enable riscv common file compile and add PAGE_SIZE
definition to make riscv compile pass.

Tracked-On: #8801
Signed-off-by: hangliu1 <hang1.liu@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Reviewed-by: Liu, Yifan1 <yifan1.liu@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-19 15:04:55 +08:00
Shiqing Gao
037c479221 hv: ipi: drop arch_ prefix from IPI functions
Since there is no common IPI abstraction, the arch_ prefix is redundant.
This patch renames the functions as follows:
    - arch_send_dest_ipi_mask -> send_dest_ipi_mask
    - arch_send_single_ipi   -> send_single_ipi

Tracked-On: #8799
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
2025-09-19 13:56:21 +08:00
Haicheng Li
f6bf0b809d risc-v: initial timer codes
This patch implements risc-v specific timer codes. Basically,
risc-v adapts to acrn timer framework with some specific
behaviors. So far, it enables sstc support in h-mode.

Tracked-On: #8792
Signed-off-by: Haicheng Li <haicheng.li@outlook.com>
Co-developed-by: Yong Li <yong.li@intel.com>
Signed-off-by: Yong Li <yong.li@intel.com>
Co-developed-by: Yi Y Sun <yi.y.sun@intel.com>
Signed-off-by: Yi Y Sun <yi.y.sun@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-17 08:55:12 +08:00
Haicheng Li
e09fc9a218 risc-v: add SBI timer interface
This patch adds sbi_set_timer() function to set timer through
SBI interface.

Tracked-On: #8792
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Co-developed-by: Yi Y Sun <yi.y.sun@intel.com>
Signed-off-by: Yi Y Sun <yi.y.sun@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-17 08:55:12 +08:00
Haicheng Li
d074ce85eb hv: multi-arch add RISC-V memory library
RISC-V use common memory library.

Tracked-On: #8794
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Co-developed-by: Haoyu Tang <haoyu.tang@intel.com>
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-15 14:07:58 +08:00
Haicheng Li
7573e436a5 hv: multi-arch reconstruct memory library
Split common and X86 arch specific codes.
For arch without specific memory library,add common library
to use.

Tracked-On: #8794
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Co-developed-by: Haoyu Tang <haoyu.tang@intel.com>
Signed-off-by: Haoyu Tang <haoyu.tang@intel.com>
Signed-off-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-15 14:07:58 +08:00
Jian Jun Chen
b854a24109 hv: risc-v: add C entry function of BSP and APs
Tracked-On: #8788
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-15 13:12:21 +08:00
Haicheng Li
f134e2b741 hv: smpcall: riscv: implement SMP call handlers
- Introduce arch/riscv/notify.c to implement RISC-V specific SMP call handlers.

Tracked-On: #8786
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Co-developed-by: Shiqing Gao <shiqing.gao@intel.com>
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Shiqing Gao
b0a4c2d024 [FIXME] hv: smpcall: riscv: add placeholder implementations for dependent code
This patch provides dummy implementations of functions and data
structures required for the IPI and SMP call on RISC-V.
It serves as a placeholder to ensure RISC-V builds pass and
is not needed for the final merge.

Official implementations are still WIP by other engineers:

 - To be provided in the library patchset (by Haoyu):
    uint16_t ffs64(uint64_t value);
    bool bitmap_test(uint16_t nr, const volatile uint64_t *addr);
    void bitmap_clear_lock(uint16_t nr_arg, volatile uint64_t *addr);
    void bitmap_clear_nolock(uint16_t nr_arg, volatile uint64_t *addr);
    uint64_t atomic_cmpxchg64(volatile uint64_t *ptr, uint64_t old, uint64_t new);

 - To be provided in the platform initialization patchset (by Hang):
    void wait_sync_change(volatile const uint64_t *sync, uint64_t wake_sync);
    bool is_pcpu_active(uint16_t pcpu_id);
    uint16_t get_pcpu_id(void);

----------
Changelog:
 * Split per_cpu.h implementation into a separate commit.

Tracked-On: #8786
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Haicheng Li
b8542f7def hv: riscv: per_cpu: add initial version
Since smpcall depends on the per_cpu_region data structure to access
smp_call_info_data, this patch adds the initial version of per_cpu
support on RISC-V. For now it only includes SMP call related info.

Further refinement will be done in the platform initialization patchset
(by Hang).

Tracked-On: #8786
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Co-developed-by: Shiqing Gao <shiqing.gao@intel.com>
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Shiqing Gao
a5e46a70ff hv: smpcall: abstract a common SMP call framework
This patch:
 - abstracts the common logic from existing x86 implementation
 - moves x86-specific logic to arch/x86/notify.c

A new common/notify.{c,h} is introduced to provide a common SMP call framework for
multi-arch support in ACRN.
arch-specific files such as arch/{x86,riscv}/notify.c is aim to provide the
corresponding implementations respectively.

The framework provides the following common APIs:
 - init_smp_call(): initialize the SMP call support during pCPU initialization
 - handle_smp_call(): execute the SMP call notification handler
 - smp_call_function(): trigger the SMP call request to target pCPUs

Other SW modules should invoke these common APIs to perform arch-independent
SMP operations.

Two arch-specific hooks are abstracted:
 - arch_smp_call_kick_pcpu():
    - On x86, special handling is required when LAPIC is passthrough.
    - On RISC-V, a plain IPI is sufficient to kick the target pCPU.
 - arch_init_smp_call():
   - On x86, CPU initialization reserves dedicated vectors and
     registers callback handlers for purposes such as notifications
     or posted interrupts.
   - On RISC-V, no special handling is required at present; this
     can be extended in the future if needed.

----------
Changelog:
 * Merged the following two patches into one:
     [RFC PATCH v2 4/7] hv: introduce common/smp.{c,h}
     [RFC PATCH v2 5/7] hv: smpcall: x86: adapt to common SMP call
Tracked-On: #8786
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Shiqing Gao
066d55c8c3 hv: ipi: x86: rename send_single_ipi() and send_dest_ipi_mask()
Rename send_single_ipi() and send_dest_ipi_mask() to
arch_send_single_ipi() and arch_send_dest_ipi_mask() in x86, to make the
naming consistent with the RISC-V implementation and reflect that these
functions are arch-specific.

Tracked-On: #8786
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Shiqing Gao
89f0dc544c hv: ipi: x86: update send_dest_ipi_mask() prototype
Align the prototype of send_dest_ipi_mask() on x86 with the RISC-V
definition. dest_mask is updated from uint32_t to uint64_t:

    From: void send_dest_ipi_mask(uint32_t dest_mask, uint32_t vector)
    To:   void send_dest_ipi_mask(uint64_t dest_mask, uint32_t vector)

On RISC-V, send_dest_ipi_mask() is implemented using SBI interfaces,
where the dest_mask is defined as "unsigned long" in the SBI spec.

Tracked-On: #8786
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Haicheng Li
0d630e8f37 hv: ipi: riscv: implement IPI using SBI interface
This patch implements the IPI for RISC-V using SBI interface.

There is no common IPI concept abstracted, due to the following reasons:
 - RISC-V:
   Software delivers an IPI to target CPUs via software interrupts.
   The interrupt number is fixed for each privilege mode (e.g.,
   Supervisor Software Interrupt = IRQ 1, Machine Software Interrupt = IRQ 3).
   The actual purpose of the IPI is indicated by an IPI message type,
   which is a software-level concept. When the IPI is received,
   the target CPU must check the message type to determine the required action.

 - x86:
   Software delivers an IPI to target CPUs using a specific vector number.
   During CPU initialization, software can assign dedicated vectors for
   particular purposes. When the IPI is received, the target CPU could
   directly invoke the handler bound to that vector.

Each architecture provides its own IPI implementation, and other SW modules
directly call these arch-specific functions.

------
Notes:
 * To ensure RISC-V builds pass, an empty `include/arch/riscv/asm/cpu.h`
   is added since `debug/logmsg.h` includes `asm/cpu.h`.
 * Implemented IPI functionality using the SBI IPI Extension (EID #0x735049).
   Legacy SBI extensions are not supported in ACRN.

----------
Changelog:
 * Updated commit message and code comments to state explicitly that
   legacy SBI extensions are not supported in ACRN.
 * Refined the prototype of sbi_send_ipi() to align with the SBI spec:
     From: int64_t sbi_send_ipi(uint64_t mask)
     To:   int64_t sbi_send_ipi(uint64_t mask, uint64_t mask_base)
   In ACRN it is invoked as sbi_send_ipi(dest_mask, 0UL), with mask_base
   set to 0UL.
 * Renamed send_single_ipi() and send_dest_ipi_mask() to
   arch_send_single_ipi() and arch_send_dest_ipi_mask() respectively.

Tracked-On: #8786
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Co-developed-by: Shiqing Gao <shiqing.gao@intel.com>
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Haicheng Li
c4531e903c hv: riscv: implement SBI interfaces
This patch implements the common SBI interfaces, including
sbi_ecall(), common macros and data structures.

Changelog:
 * Renamed SBI_EXPERIMENTAL_x and SBI_VENDOR_x.
 * Added description for sbi_ecall().
 * Renamed SBI-related enums, macros, and data structures to align with
   the SBI specification.

Tracked-On: #8786
Signed-off-by: Haicheng Li <haicheng.li@intel.com>
Co-developed-by: Shiqing Gao <shiqing.gao@intel.com>
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2025-09-09 16:37:04 +08:00
Jiaqing Zhao
1c7e1a192b hv: refactor hypervisor image size helper function
The hypervisor image size is determined at link time, but now it is
calculated and stored in a global variable during mmu initialization,
and the helper function reads from that variable. Change to calculate
it inside helper function to avoid inconsistency.

Tracked-On: #8738
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2025-08-19 07:49:23 +00:00
Jiaqing Zhao
3c6aa23ab2 hv: instr_emul: Correct handling of instruction length
The VM-exit instruction length(VMX_EXIT_INSTR_LEN) in VMCS is undefined
on EPT violation, except during delivery of a software interrupt,
privileged software exception, or software exception[1]. Although CPU
is likely to set the field, it can be incorrect in certain cases, such
as cmp+jcc and test+jcc.

Since hypervisor does not know exactly how much bytes needed, and GVA
translation is costly, it first copies at most 15 (VIE_INST_SIZE) bytes
within the page, then decodes the instruction. If more bytes are needed
during decoding and copied length is less than 15, it copies remaining
bytes.

[1] 29.2.5, https://cdrdv2-public.intel.com/671200/325462-sdm-vol-1-2abcd-3abcd.pdf

Tracked-On: #8756
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
2025-08-19 07:49:23 +00:00
Yichong Tang
27aee66f88 hv: hyperv: Add hyperv page destory function
In current code process, hyperv data in struct vm_arch is never cleared
during VM shutdown and is retained to next VM launch. As the enabled
bit of hypercall_page msr is not clear, hypercall page might cause fatal
error such as Windows VM BSOD during VM restart and memory
remapping. Hyperv page destory function can ensure hyperv page is
destory during each VM shutdown so hyperv related config such as
hypercall page is established correctly during each VM launch.

Tracked-On: #8755
Signed-off-by: Yichong Tang <yichong.tang@intel.com>
2025-03-10 15:36:03 +08:00
Yuan Lu
dbc3ff39aa hv: vm_reset: simulate RESET_CONTROL(0xCF9) register
Add reset_control in acrn_vm. Use this reset_control to simulate
RESET_CONTROL(0xCF9) register in hypervisor.

Tracked-On: #8724
Signed-off-by: Yuan Lu <yuan.y.lu@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2024-09-12 14:09:17 +08:00
Jiaqing Zhao
eae668268e hv: handle reboot from Service VM properly
Service VM may write 0x6 to port 0xcf9 to trigger a warm reset, but
current hypervisor always performs a cold reset by writing 0xE to CF9.
Hypervisor should reboot the system in the same mode as Service VM
specified. Specific OS features (like linux pstore) requires warm
reset to keep data across reboot.

The behavior of hv console's reboot command (cold reset) remains
unchanged.

Tracked-On: #8539
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2024-09-09 14:37:16 +08:00
Haiwei Li
17c4ce75a1 hv: cpuid: expose CPUID.EAX=07H subleaf to VMs
Per SDM, VPDPBUSD/VPDPBUSDS/VPDPWSSD/VPDPWSSDS instructions depend on
CPUID Feature Flag 'AVX-VNNI, AVX512_VNNI, AVX512VL'. 'AVX512_VNNI' and
'AVX512VL' are already exposed to any VM.

'AVX-VNNI' is in CPUID.(EAX=07H,ECX=1):EAX.AVX-VNNI[bit 4]. This patch
is to expose all the CPUID.EAX=07H subleaf features to VMs.

Mask corresponding bits if want to disable some features in the future.

Tracked-On: #8710
Reviewed-by: Fei Li <fei1.li@intel.com>
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-09-09 14:03:51 +08:00
Jiaqing Zhao
5c351bee0f hv: vtd: allocate drhd_dev_scope based on board file
Determine the size of drhd_dev_scope based on DRHD_MAX_DEVSCOPE_COUNT
in board file instead of hardcoding. The current default value 16 will
be used if it is not defined in board file to keep compatibility, a
warning will be raised in this case.

Tracked-On: #8494
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2024-08-05 15:51:17 +08:00
Haiwei Li
fa2b8fcfbe doc: add module design for some defines in hwmgmt_page
GAI Tooling Notice: These contents may have been developed with support from one
or more generative artificial intelligence solutions.

ACRN hypervisor is decomposed into a series of components and modules. The
module design in hypervisor is to add inline doxygen style comments above
functions, macros, structures, etc.

This patch is to add comments for some elements in hwmgmt_page module.

Tracked-On: #8665

Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-08-01 14:50:27 +08:00
YuanXin-Intel
e4b1584577 Change Service VM to supervisor role
1. Enable Service VM to power off or restart the whole platform even when RTVM is running.
2. Allow Service VM stop the RTVM using acrnctl tool with option "stop -f".
3. Add 'Service VM supervisor role enabled' option in ACRN configurator

Tracked-On: #8618

Signed-off-by: YuanXin-Intel <xin.yuan@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Reviewed-by: Jian Jun Chen <jian.jun.chen@intel.com>
2024-06-28 13:35:07 +08:00
Haiwei Li
3d6ca845e2 hv: s3: add timer support
When resume from s3, Service VM OS will hang because timer interrupt on
BSP is not triggered. Hypervisor won't update physical timer because
there are expired timers on pcpu timer list.

Add suspend and resume ops for modules that use timers.

This patch is just for Service VM OS. Support for User VM will be added
in the future.

Tracked-On: #8623
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-06-27 11:26:09 +08:00
Haiwei Li
81935737ff hv: s3: reset vm after resume
Now only BSP is reset. After Service VM OS resumes from s3, APs'
apic_base_msr are incorrect with x2apic bit en.

To avoid incorrect states, do `reset_vm` after resume.

Tracked-On: #8623
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-06-27 11:26:09 +08:00
Haiwei Li
b885d02396 hv: cpuid: add several leaf to per-cpu list in hybrid architecture
P-cores and E-cores accessing leaf 0x2U/0x14U/0x16U/0x18U/0x1A/0x1C/0x80000006U
will have different information in hybrid architecture.

So add them to per-cpu list in hybrid architecture and directly return
the physical value.

Note: 0x14U is hided and return 0.

Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-05-28 11:02:56 +08:00
Haiwei Li
d6fe8b0892 hv: cpuid: make leaf 0x6 per-cpu in hybrid architecture
Leaf 0x6 returns thermal and power management information. In
hybrid architecture, P-cores and E-cores have different information.

Add leaf 0x6 to per-cpu list in hybrid architecture and handle specific
cpuid access.

Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-05-28 11:02:56 +08:00
Haiwei Li
59a8cc4c28 hv: cpuid: make leaf 0x4 per-cpu in hybrid architecture
Leaf 0x4 returns deterministic cache parameters for each level. In
hybrid architecture, P-cores and E-cores have different cache
information.

Add leaf 0x4 to per-cpu list in hybrid architecture and handle specific
cpuid access.

Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-05-28 11:02:56 +08:00
Haiwei Li
f7506424e4 hv: cpuid: refactor per-cpu leaves definition
CPUID returns processor identification and feature information.
Different pcpus may return different infos. That is, the info is
per-cpu.

In hybrid architecture, per-cpu leaf is different from the previous. So
introduce a struct percpu_cpuids to indicate the per-cpu leaf. struct
percpu_cpuids will consist of two parts: generic percpu leaves and
hybrid related percpu leaves.

This patch is just to add generic percpu leaves.

Tracked-On: #8608
Signed-off-by: Haiwei Li <haiwei.li@intel.com>
2024-05-28 11:02:56 +08:00
Zhangwei6
ddfcb8c3fc hv: enable thermal lvt interrupt
This patch can fetch the thermal lvt irq and propagate
it to VM.

At this stage we support the case that there is only one VM
governing thermal. And we pass the hardware thermal irq to this VM.

First, we register the handler for thermal lvt interrupt, its irq
vector is THERMAL_VECTOR and the handler is thermal_irq_handler().

Then, when a thermal irq occurs, it flags the SOFTIRQ_THERMAL bit
of softirq_pending, This bit triggers the thermal_softirq() function.
And this function will inject the virtual thermal irq to VM.

Tracked-On: #8595

Signed-off-by: Zhangwei6 <wei6.zhang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2024-05-16 09:40:32 +08:00
Zhangwei6
78243c3f49 hv: expose thermal MSRs to VM.
In this phase, we only use one VM to control thermal.
So we make thermal MSRs readable and writable by this VM.

This VM is flagged with GUEST_FLAG_VTM, and can
read/write thermal MSRs.
For the VMs not flagged with GUEST_FLAG_VTM,
can only read these thermal MSRs to get current status.

Tracked-On: #8595
Signed-off-by: Zhangwei6 <wei6.zhang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2024-05-16 09:40:32 +08:00
Wu Zhou
581ec58fbb hv: vm_event: create vm_event support
This patch creates vm_event support in HV, including:
1. Create vm_event data type.
2. Add vm_event sbuf and its initializer. The sbuf will be allocated by
   DM in Service VM. Its page address will then be share to HV through
   hypercall.
3. Add an API to send the HV generated event.

Tracked-On: #8547
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2024-02-01 17:01:31 +08:00
Qiang Zhang
6a1d91c740 hv: sched: Add sched_params struct for thread parameters
Abstract out schedulers config data for vCPU threads and other hypervisor
threads to sched_params structure. And it's used to initialize per
thread scheduler private data. The sched_params for vCPU threads come
from vm_config generated by config tools while other hypervisor threads
need give them explicitly.

Tracked-On: #8500
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
2023-09-18 16:26:05 +08:00
Wu Zhou
064be1e3e6 hv: support halt in hv idle
When all vCPU threads on one pCPU are put to sleep (e.g., when all
guests execute HLT), hv would schedule to idle thread. Currently the
idle thread executes PAUSE which does not enter any c-state and consumes
a lot of power. This patch is to support HLT in the idle thread.

When we switch to HLT, we have to make sure events that would wake a
vCPU must also be able to wake the pCPU. Those events are either
generated by local interrupt or issued by other pCPUs followed by an
ipi kick.

Each of them have an interrupt involved, so they are also able to wake
the halted pCPU. Except when the pCPU has just scheduled to idle thread
but not yet halted, interrupts could be missed.

sleep-------schedule to idle------IRQ ON---HLT--(kick missed)
                                         ^
                              wake---kick|

This areas should be protected. This is done by a safe halt
mechanism leveraging STI instruction’s delay effect (same as Linux).

vCPUs with lapic_pt or hv with CONFIG_KEEP_IRQ_DISABLED=y does not allow
interrupts in root mode, so they could never wake from HLT (INIT kick
does not wake HLT in root mode either). They should continue using PAUSE
in idle.

Tracked-On: #8507
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-09-15 11:52:40 +08:00
Yonghua Huang
791019edc5 hv: define a MACRO to indicate maximum memory size
~0UL is widely used to specify the maximum memory size
 when calling e820_alloc_memory(), this patch to define
 a MACRO for it to avoid using this magic number.

Tracked-On: #8502
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-09-12 13:52:48 +08:00
Qiang Zhang
a4a73b5aac HV: emulate dummy multi-function dev in Service VM
For a pdev which allocated to prelaunched VM or owned by HV, we need to check
whether it is a multifuction dev at function 0. If yes we have to emulate a
dummy function dev in Service VM, otherwise the sub-function devices will be
lost in guest OS pci probe process.

Tracked-On: #8492
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
Signed-off-by: Victor Sun <victor.sun@intel.com>
2023-09-11 16:13:16 +08:00
Qiang Zhang
bf653d277b HV: init one dev config with service vm config param
When we do init_all_dev_config() in pci.c, the pdevs added to pci dev_config
will be exposed to Service VM or passthru to prelauched VM. The original code
would find service VM config in every pci pdev init loop, this is unnecessary
and definitely impact performance. Here we generate Service VM config pointer
with config tool so that init_one_dev_config() could refer service VM config
directly.

Tracked-On: #8491
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Qiang Zhang <qiang4.zhang@intel.com>
Signed-off-by: Victor Sun <victor.sun@intel.com>
2023-09-11 16:13:16 +08:00
Muhammad Qasim Abdul Majeed
a457e65619 doc: Fix spelling and typo mistakes.
Tracked-On: #8488

Signed-off-by: Muhammad Qasim Abdul Majeed <qasim.majeed20@gmail.com>
2023-09-05 09:34:21 +08:00
Jiaqing Zhao
7bfbdf04b8 doc: remove '@return None' for void functions
doxygen will warn that documented return type is found for functions
that does not return anything in 1.9.4 or later versions. 'None' is
not a special keyword in doxyge, it will recognize it as description
to the return value that does not exist in void functions.

Tracked-On: #8425
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-08-03 14:56:29 -07:00
Wu Zhou
8af2c263db hv: disable HFI and ITD for guests
The Hardware Feedback Interface (HFI) and Intel® Thread Director (ITD)
features require OS to provide a physical page address to
IA32_HW_FEEDBACK_PTR. Then the hardware will update the processor
information to the page address. The issue is that guest VM will program
its GPA to that MSR, causing great risk of tempering memory.

So HFI and ITD should be made invisible to guests, until we provide
proper virtulization of those features.

Tracked-On: #8463
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-08-01 14:57:23 +08:00
Jiaqing Zhao
3cc1127ae9 hv: fix undefined reference to nested_vmexit_handler() in vmexit.c
arch/x86/guest/nested.c, where nested_vmexit_handler() is defined, is
only compiled when CONFIG_NVMX_ENABLED is enabled. Define a dummy
function in include/arch/x86/asm/guest/nested.h to fix the undefined
reference linker error.

Tracked-On: #8465
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-08-01 14:22:14 +08:00
Wu Zhou
89d11d91e2 hv: bugfix: fix the ptdev irq destination issue
According to SDM Vol3 11.12.10, in x2APIC mode, Logical Destination has
two parts:
  - Cluster ID (LDR[31:16])
  - Logical ID (LDR[15:0])
Cluster ID is a numerical address, while Logical ID is a 16bit mask. We
can only use Logical ID to address multi destinations within a Cluster.

So we can't just 'or' all the Logical Destination in LDR registers to
get one mask for all target pCPUs. This would get a wrong destination
mask if the target Destinations are from different Clusters.

For example in ADL/RPL x2APIC LDRs for core 2-5 are 0x10001 0x10100
0x20001 0x20100. If we 'or' them together, we would get a Logical
Destination of 0x30101, which points to core 6 and another core.
If core 6 is running a RTVM, then the irq is unable to get to
core 2-5, causing the guest on core 2-5 driver fail.

Guests working in xAPIC mode may use 'Flat Model' to select an
arbitrary list of CPUs as its irq destination. HV may not be able to
include them all when transfering to physical destinations, because
the HW is working in x2APIC mode and can only use 'Cluster Model'.

There would be no perfect fix for this issue. This patch is a simple
fix, by just keep the first Cluster of all target Logical Destinations.

Tracked-On: #8435
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-07-05 17:41:16 +08:00
Jiaqing Zhao
e5d46dcc7d hv: vpci: ignore PCI I/O BAR with non-zero upper 16 bits
On x86 platform, the upper 16 bit of I/O BAR should be initialized to
zero by BIOS. Howerever, some buggy BIOS still programs the upper 16
bits to non-zero, which causes error in check_pt_dev_pio_bars(). Since
I/O BAR reprogramming by VM is currently unsupported, this patch
ignores such I/O BARs when creating vpci devices to make VM boot.

Tracked-On: #8373
Signed-off-by: Jiaqing Zhao <jiaqing.zhao@linux.intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-06-26 14:40:57 +08:00
Wu Zhou
db83648a8d hv: hide thermal interface from guests
Thermal events are delivered through lapic thermal LVT. Currently
ACRN does not support delivering those interrupts to guests by
virtual lapic. They need to be virtualized to provide guests some
thermal management abilities. Currently we just hide thermal
lvt from guests, including:

1. Thermal LVT:
There is no way to hide thermal LVT from guests. But we need do
something to make sure no interrupt can be actually trigered:
  - skip thermal LVT in vlapic_trigger_lvt()
  - trap-and-emulate thermal LVT in lapic-pt mode

2. As We have plan to introduce virtualization of thermal monitor in the
future, we use a vm flag GUEST_FLAG_VTM which is default 0 to control
the access to it. So that it can help enabling VTM in the future.

Tracked-On: #8414
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-06-15 20:36:44 +08:00
Wu Zhou
c5d019b836 hv: emulate cpuids and MSRs for VHWP
Changes made by this patch includes:
1. Emulate HWP and pstate MSRs/CPUIDs. Those are exposed to guest when
   the GUEST_FLAG_VHWP is set:
    - CPUID[6].EAX[7,9,10]: MSR_IA32_PM_ENABLE(enabled by hv, always read
      1), MSR_IA32_HWP_CAPABILITIES, MSR_IA32_HWP_REQUEST,
      MSR_IA32_HWP_STATUS,
    - CPUID[6].ECX[0]: MSR_IA32_MPERF, MSR_IA32_APERF
    - MSR_IA32_PERF_STATUS(read as base frequency when not owning pCPU)
    - MSR_IA32_PERF_CTL(ignore writes)
2. Always hide HWP interrupt and package control MSRs/CPUIDs:
    - CPUID[6].EAX[8]: MSR_IA32_HWP_INTERRUPT(currently ACRN is not able
      to deliver thermal LVT virtual interrupt to guests)
    - CPUID[6].EAX[11,22]: MSR_IA32_HWP_REQUEST_PKG, MSR_IA32_HWP_CTL

Tracked-On: #8414
Signed-off-by: Wu Zhou <wu.zhou@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2023-06-09 10:06:42 +08:00