-- replace global hypercall lock with per-vm lock
-- add spinlock protection for vm & vcpu state change
v1-->v2:
change get_vm_lock/put_vm_lock parameter from vm_id to vm
move lock obtain before vm state check
move all lock from vmcall.c to hypercall.c
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For a ptirq_remapping_info entry, when build IRTE:
- If the caller provides a valid IRTE, use the IRET
- If the caller doesn't provide a valid IRTE, allocate a IRET when the
entry doesn't have a valid IRTE, in this case, the IRET will be freed
when free the entry.
Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config.
So For UOS, we always emualte Host Bridge and PCI Bridge for it and assign PCI device
to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI
Bridge to it directly, otherwise, we will emulate them same as UOS.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Invalidate cache by scanning and flushing the whole guest memory is
inefficient which might cause long execution time for WBINVD emulation.
A long execution in hypervisor might cause a vCPU stuck phenomenon what
impact Windows Guest booting.
This patch introduce a workaround method that pausing all other vCPUs in
the same VM when do wbinvd emulation.
Tracked-On: #4703
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
RTVM (with lapic PT) boots hang when maxcpus is
assigned a value less than the CPU number configured
in hypervisor.
In this case, vlapic_state(per VM) is left in TRANSITION
state after BSP boot, which blocks interupts to be injected
to this UOS.
Tracked-On: #4803
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
There's no need to look up MSI ptirq entry by virtual SID any more since the MSI
ptirq entry would be removed before the device is assigned to a VM.
Now the logic of MSI interrupt remap could simplify as:
1. Add the MSI interrupt remap first;
2. If step is already done, just do the remap part.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
Now we could know a device status by 'user' filed, like
---------------------------------------------------------------------------
| NULL | == vdev | != NULL && != vdev
vdev->user | device is de-init | used by itself VM | assigned to another VM
---------------------------------------------------------------------------
So we don't need to modify 'vpci' field accordingly.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
For post-launched VMs, the configured CPU affinity could be different
from the actual running CPU affinity. This new field acrn_vm->cpu_affinity
recognizes this difference so that it's possible that CREATE_VM
hypercall won't overwrite the configured CPU afifnity.
Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity.
This is read-only in run time, never overwritten by acrn-dm.
Remove vm_config->vcpu_num, which means the number of vCPUs of the
configured CPU affinity. This is not to be confused with the actual
running vCPU number: vm->hw.created_vcpus.
Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less
confusion.
Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The virtual MSI information could be included in ptirq_remapping_info structrue,
there's no need to pass another input paramater for this puepose. So we could
remove the ptirq_msi_info input.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
If CPU has MSR_TEST_CTL, show an emulaued one to VCPU
Tracked-On: #4496
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
remove unnecessary state check and
add pre-condition for vcpu APIs.
Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
check the vm state in hypercall api,
add pre-condition for vm api.
Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently vlapic_build_id() uses vcpu_id to retrieve the lapic_id
per_cpu variable:
vlapic_id = per_cpu(lapic_id, vcpu->vcpu_id);
SOS vcpu_id may not equal to pcpu_id, and in that case it runs into
problems. For example, if any pre-launched VMs are launched on PCPUs
whose IDs are smaller than any PCPU IDs that are used by SOS.
This patch fixes the issue and simplify the code to create or get
vapic_id by:
- assign vapic_id in create_vlapic(), which now takes pcpu_id as input
argument, and save it in the new field: vlapic->vapic_id, which will
never be changed.
- simplify vlapic_get_apicid() by returning te saved vapid_id directly.
- remove vlapic_build_id().
- vlapic_init() is only called once, merge it into vlapic_create().
Tracked-On: #4268
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]),
one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property
and use the containing VM's vm_id as the index to the vCPU array:
In create_vcpu(), we simply do:
per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu;
In offline_vcpu():
per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL;
so basically we use the containing VM's vm_id as the index to the vCPU array,
as well as the index of posted interrupt IRQ/vector pair that are assigned
to this vCPU:
0: first vCPU and first posted interrupt IRQs/vector pair
(POSTED_INTR_IRQ/POSTED_INTR_VECTOR)
...
CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair
((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U)
In the posted interrupt handler, it will do the following:
Translate the IRQ into a zero based index of where the vCPU
is located in the vCPU list for current pCPU. Once the
vCPU is found, we wake up the waiting thread and record
this request as ACRN_REQUEST_EVENT
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Add platform_caps.c to maintain platform related information
Set platform_caps.pi to true if all iommus are posted interrupt capable, false
otherwise
If lapic passthru is not configured and platform_caps.pi is true, the vm
may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is
single-destination
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
EPT table can be changed concurrently by more than one vcpus.
This patch add a lock to protect the add/modify/delete operations
from different vcpus concurrently.
Tracked-On: #4253
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
We could use container_of to get vcpu structure pointer from vmtrr. So vcpu
structure pointer is no need in vmtrr structure.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
We could use container_of to get vcpu/vm structure pointer from vlapic. So vcpu/vm
structure pointer is no need in vlapic structure.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
The posted interrupt descriptor is more of a vmx/vmcs concept than a vlapic
concept. struct acrn_vcpu_arch stores the vmx/vmcs info, so put struct pi_desc
in struct acrn_vcpu_arch.
Remove the function apicv_get_pir_desc_paddr()
A few coding style/typo fixes
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Rename struct vlapic_pir_desc to pi_desc
Rename struct member and local variable pir_desc to pid
pir=posted interrupt request, pi=posted interrupt
pid=posted interrupt descriptor
pir is part of pi descriptor, so it is better to use pi instead of pir
struct pi_desc will be moved to vmx.h in subsequent commit.
Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
For SOS VM, when the target platform has multiple IO-APICs, there
should be equal number of virtual IO-APICs.
This patch adds support for emulating multiple vIOAPICs per VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
As ACRN prepares to support platforms with multiple IO-APICs,
GSI is a better way to represent physical and virtual INTx interrupt
source.
1) This patch replaces usage of "pin" with "gsi" whereever applicable
across the modules.
2) PIC pin to gsi is trickier and needs to consider the usage of
"Interrupt Source Override" structure in ACPI for the corresponding VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Reverts 538ba08c: hv:Add vpin to ptdev entry mapping for vpic/vioapic
ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin
and an array of size per VM to store ptirq entries against the vPIC pin.
This is done to speed up "ptirq entry" lookup at runtime for Level triggered
interrupts in API ptirq_intx_ack used on EOI.
This patch switches the lookup API for INTx interrupts to the API,
ptirq_lookup_entry_by_sid
This could add delay to processing EOI for Level triggered interrupts.
Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes
per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for
SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around
10 such arrays per VM.
Removes the need of "pic_pin" except for the APIs facing the hypercalls
hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
ve820.c is a common file in arch/x86/guest/ now, so move function of
create_sos_vm_e820() to this file to make code structure clear;
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched
VM specific e820 entries according to memory configuration of customer.
It should be a scenario based configurations but we had to put it in per
board foler because of different board memory settings. This brings concerns
to customer on configuration orgnization.
Currently the file provides same e820 layout for all pre-launched VMs, but
they should have different e820 when their memory are configured differently.
Although we have acrn-config tool to generate ve802.c automatically, it
is not friendly to modify hardcoded ve820 layout manually, so the patch
changes the entries initialization method by calculating each entry item
in C code.
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
VM needs to check if it owns this device before deiniting it.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can
be IOAPIC or PIC
2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt
controller
2. Changes the type of src member of source_id.intx_id from uint32_t to
enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC
Tracked-On: #4447
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
1. Rename BOOT_CPU_ID to BSP_CPU_ID
2. Repace hardcoded value with BSP_CPU_ID when
ID of BSP is referenced.
Tracked-On: #4420
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Now we split passthrough PCI device from DM to HV, we could remove all the passthrough
PCI device unused code.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
is_polling_ioreq is more straightforward. Rename it.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Sometimes HV wants to know if there are pending interrupts of one vcpu.
Add .has_pending_intr interface in acrn_apicv_ops and return the pending
interrupts status by check IRRs of apicv.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Introduce two kinds of events for each vcpu,
VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion
VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events
vcpu can wait for such events, and resume to run when the
event get signalled.
This patch also change IO request waiting/notifying to this way.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving
INIT. The local APIC ID register should also be preserved.
Tracked-On: #4267
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The patch abstract a vcpu_reset_internal() api for internal usage, the
function would not touch any vcpu state transition and just do vcpu reset
processing. It will be called by create_vcpu() and reset_vcpu().
The reset_vcpu() will act as a public api and should be called
only when vcpu receive INIT or vm reset/resume from S3. It should not be
called when do shutdown_vm() or hcall_sos_offline_cpu(), so the patch remove
reset_vcpu() in shutdown_vm() and hcall_sos_offline_cpu().
The patch also introduced reset_mode enum so that vcpu and vlapic could do
different context operation according to different reset mode;
Tracked-On: #4267
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In current architecutre, the maximum vCPUs number per VM could not
exceed the pCPUs number. Given the MAX_PCPU_NUM macro is provided
in board configurations, so remove the MAX_VCPUS_PER_VM from Kconfig
and add a macro of MAX_VCPUS_PER_VM to reference MAX_PCPU_NUM directly.
Tracked-On: #4230
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
remove 'guest_init_pml4' and 'tmp_pg_array' in vm_arch
since they are not used.
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the
smp_call handler should switch the vmcs to the target vcpu's vmcs. Then
get the info.
dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise,
there will be #GP error.
Renaming:
vcpu_dumpreg -> dump_vcpu_reg
switch_vmcs -> load_vmcs
Tracked-On: #4178
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
After changing init_vmcs to smp call approach and do it before
launch_vcpu, it could work with noop scheduler. On real sharing
scheudler, it has problem.
pcpu0 pcpu1 pcpu1
vmBvcpu0 vmAvcpu1 vmBvcpu1
vmentry
init_vmcs(vmBvcpu1) vmexit->do_init_vmcs
corrupt current vmcs
vmentry fail
launch_vcpu(vmBvcpu1)
This patch mark a event flag when request vmcs init for specific vcpu. When
it is running and checking pending events, will do init_vmcs firstly.
Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
xsave area:
legacy region: 512 bytes
xsave header: 64 bytes
extended region: < 3k bytes
So, pre-allocate 4k area for xsave. Use certain instruction to save or
restore the area according to hardware xsave feature set.
Tracked-On: #4166
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry
does. ptirq_msix_remap only did the preparation.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
add shell command to support dump dump guest memory
e.g.
dump_guest_mem vm_id, gva, length
Tracked-On: #4144
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
After adding PCI BAR remap support, mmio_node may unregister when there's others
access it. This patch add a lock to protect mmio_node access.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Since guest could re-program PCI device MSI-X table BAR, we should add mmio
emulation handler unregister.
However, after add unregister_mmio_emulation_handler API, emul_mmio_regions
is no longer accurate. Just replace it with max_emul_mmio_regions which records
the max index of the emul_mmio_node.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
In theory, guest could re-program PCI BAR address to any address. However, ACRN
hypervisor only support [0, top_address_space) EPT memory mapping. So we need to
check whether the PCI BAR re-program address is within this scope.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>