Commit Graph

1068 Commits

Author SHA1 Message Date
Mingqiang Chi
aa89eb3541 hv:add per-vm lock for vm & vcpu state change
-- replace global hypercall lock with per-vm lock
-- add spinlock protection for vm & vcpu state change

v1-->v2:
   change get_vm_lock/put_vm_lock parameter from vm_id to vm
   move lock obtain before vm state check
   move all lock from vmcall.c to hypercall.c

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-20 11:22:17 +08:00
Li Fei1
82f9233d4a hv: vpci: a minor fix about is_zombie_vf
Now we check whether a device is zombie by the ->user != NULL.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-06-21 12:07:15 +08:00
Mingqiang Chi
1b84741a56 rename vm_lock/vlapic_state in VM structure
rename:
   vlapic_state-->vlapic_mode
   vm_lock -->  vlapic_mode_lock
   check_vm_vlapic_state --> check_vm_vlapic_mode

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-06-19 16:13:20 +08:00
Mingqiang Chi
d0a4052518 remove dead code in io.h
remove thess APIs:
   set64
   set32
   set16
   set8

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-06-19 16:13:20 +08:00
Conghui Chen
2a4c59db74 hv: add check for BASIC VMX INFORMATION
Check bit 48 in IA32_VMX_BASIC MSR, if it is 1, return error, as we only
support Intel 64 architecture.

SDM:
Appendix A.1 BASIC VMX INFORMATION

Bit 48 indicates the width of the physical addresses that may be used for the
VMXON region, each VMCS, anddata structures referenced by pointers in a
VMCS (I/O bitmaps, virtual-APIC page, MSR areas for VMX transitions). If
the bit is 0, these addresses are limited to the processor’s
physical-address width.2 If the bit is 1, these addresses are limited to
32 bits. This bit is always 0 for processors that support Intel 64
architecture.

Tracked-On: #4956
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
2020-06-18 14:05:56 +08:00
Binbin Wu
da1788c9a3 hv: vtd: add an API to reserve continuous irtes
dmar_reserve_irte is added to reserve N coutinuous IRTEs.
N could be 1, 2, 4, 8, 16, or 32.

The reserved IRTEs will not be freed.

Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Binbin Wu
7bfcc673a6 hv: ptirq: associate an irte with ptirq_remapping_info entry
For a ptirq_remapping_info entry, when build IRTE:
- If the caller provides a valid IRTE, use the IRET
- If the caller doesn't provide a valid IRTE, allocate a IRET when the
entry doesn't have a valid IRTE, in this case, the IRET will be freed
when free the entry.

Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Binbin Wu
2fe4280cfa hv: vtd: add two paramters for dmar_assign_irte
idx_in:
- If the caller of dmar_assign_irte passes a valid IRTE index, it will
be resued;
- If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index,
the function will allocate a new IRTE.

idx_out:
This paramter return the actual index of IRTE used. The caller need to
check whether the return value is valid or not.

Also this patch adds an internal function alloc_irte.
The function takes count as input paramter to allocate continuous IRTEs.
The count can only be 1, 2, 4, 8, 16 or 32.
This is prepared for multiple MSI vector support.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Li Fei1
65e4a16e6a hv: mmu: release 1GB cpu side support constrain
There're some platforms still doesn't support 1GB large page on CPU side.
Such as lakefield, TNT and EHL platforms on which have some silicon bug and
this case CPU don't support 1GB large page.

This patch tries to release this constrain to support more hardware platform.

Note this patch doesn't release the constrain on IOMMU side.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-06-15 15:16:34 +08:00
Binbin Wu
c907a820df hv: config: add msix emulation support
The information needed to enable MSI-x emulation.
Only enable MSI-x emuation for the devices in msix_emul_devs array.
Currently, only EHL has the need to enable MSI-x emulation for TSN
devices.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-10 14:32:15 +08:00
Victor Sun
80262f0602 HV: rename append_seed_arg to fill_seed_arg
Previously append_seed_arg() just do fill in seed arg to dest cmd buffer,
so rename the api name to fill_seed_arg().

Since fill_seed_arg() will be called in SOS VM path only, the param of
bool vm_is_sos is not needed and will be replaced by dest buffer size.

The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero,
so memset() is not needed in fill_seed_arg(), buffer pointer check
and strncpy_s() are not needed also.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Victor Sun
47d20f37e1 HV: replace merge_cmdline api with strncat_s
Add a standard string api strncat_s() to replace merge_cmdline() to make code
more readable.

Another change is that the multiboot cmdline will be appended to the end of
configured SOS bootargs instead of the beginning, this would enable a feature
that some kernel cmdline paramter items could be overriden by multiboot cmdline
since the later one would win if same parameters configured in kernel cmdline.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Li Fei1
ae4fa40adc hv: vpci: hv: vpci: refine pci device assignment logic
Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config.
So For UOS, we always emualte Host Bridge and PCI Bridge for it and assign PCI device
to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI
Bridge to it directly, otherwise, we will emulate them same as UOS.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-03 22:00:43 +08:00
Li Fei1
b8f151a55f hv: pci: check whether a PCI device is host bridge or not by class
According PCI Code and ID Assignment Specification Revision 1.11, a PCI device
whose Base Class is 06h and Sub-Class is 00h is a Host bridge.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-03 22:00:43 +08:00
Vijay Dhanraj
d03df0c7e2 HV: Fix MP Init sequence hang by adding a delay
As per the BWG a delay should be provided between the
INIT IPI and Startup IPI. Without the delay observe hangs
on certain platforms during MP Init sequence. So Setting
a delay of 10us between assert INIT IPI and Startup IPI.

Also, as per SDM section 10.7 the the de-assert INIT IPI is
only used for Pentium and P6 processors. This is not applicable
for Pentium4 and Xeon processors so removing this sequence.

Tracked-On: #4835
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-27 13:34:59 +08:00
Binbin Wu
3009d9399f hv: vtd: cleanup snoop control related code
Snoop control will not be turned on by hypervisor, delete snoop control
related code.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-27 11:27:42 +08:00
Shuo A Liu
9a15ea82ee hv: pause all other vCPUs in same VM when do wbinvd emulation
Invalidate cache by scanning and flushing the whole guest memory is
inefficient which might cause long execution time for WBINVD emulation.
A long execution in hypervisor might cause a vCPU stuck phenomenon what
impact Windows Guest booting.

This patch introduce a workaround method that pausing all other vCPUs in
the same VM when do wbinvd emulation.

Tracked-On: #4703
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-21 15:21:29 +08:00
Mingqiang Chi
f994b5ffaf hv:cleanup vcpu state
-- remove VCPU_PAUSED and resume_vcpu
-- remove vcpu->prev_state in vcpu structure
-- rename pause_vcpu to zombie_vcpu

Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-05-21 15:08:49 +08:00
Yonghua Huang
3391bffb27 hv:fix rtvm hang with maxcpus=0/1 in bootargs
RTVM (with lapic PT) boots hang when maxcpus is
 assigned a value less than the CPU number configured
 in hypervisor.

 In this case, vlapic_state(per VM) is left in TRANSITION
 state after BSP boot, which blocks interupts to be injected
 to this UOS.

Tracked-On: #4803
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-15 10:09:13 +08:00
Li Fei1
27a66acd0e hv: ptdev: refine look up MSI ptirq entry
There's no need to look up MSI ptirq entry by virtual SID any more since the MSI
ptirq entry would be removed before the device is assigned to a VM.

Now the logic of MSI interrupt remap could simplify as:
1. Add the MSI interrupt remap first;
2. If step is already done, just do the remap part.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
2020-05-13 14:31:01 +08:00
Li Fei1
15e3062631 hv: vpci: remove is_own_device()
Now we could know a device status by 'user' filed, like

---------------------------------------------------------------------------
           | NULL              | == vdev           | != NULL && != vdev
vdev->user | device is de-init | used by itself VM | assigned to another VM
---------------------------------------------------------------------------

So we don't need to modify 'vpci' field accordingly.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-13 14:31:01 +08:00
Zide Chen
0a956c34c7 hv: add a new field cpu_affinity in struct acrn_vm
For post-launched VMs, the configured CPU affinity could be different
from the actual running CPU affinity. This new field acrn_vm->cpu_affinity
recognizes this difference so that it's possible that CREATE_VM
hypercall won't overwrite the configured CPU afifnity.

Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity.
This is read-only in run time, never overwritten by acrn-dm.

Remove vm_config->vcpu_num, which means the number of vCPUs of the
configured CPU affinity. This is not to be confused with the actual
running vCPU number: vm->hw.created_vcpus.

Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less
confusion.

Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-08 11:04:31 +08:00
Yan, Like
869ccb7ba8 HV: RDT: add CDP support in ACRN
CDP is an extension of CAT. It enables isolation and separate prioritization of
code and data fetches to the L2 or L3 cache in a software configurable manner,
depending on hardware support.

This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED".
CDP will be enabled if the capability available and "CDP_ENABLED" is selected.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-08 08:50:13 +08:00
Yan, Like
277c668b04 HV: RDT: clean up RDT code
This commit makes some RDT code cleanup, mainling including:
- remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build;
- rename platform_clos_num to valid_clos_num, which is set as the minimal clos_mas of all enabled RDT resouces;
- init the platform_clos_array in the res_cap_info[] definition;
- remove the unnecessary return values and return value check.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
2020-05-08 08:50:13 +08:00
Yan, Like
f774ee1fba HV: RDT: merge struct rdt_cache and rdt_membw in to a union
A RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw
would be used at a time. They should be a union.
This commit merge struct rdt_cache and struct rdt_membw in to a union res.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com
2020-05-08 08:50:13 +08:00
Li Fei1
0c6b3e57d6 hv: ptdev: minor refine about ptirq_build_physical_msi
The virtual MSI information could be included in ptirq_remapping_info structrue,
there's no need to pass another input paramater for this puepose. So we could
remove the ptirq_msi_info input.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-05-06 11:51:11 +08:00
Li Fei1
067b439e69 hv: irq: minor refine about structure idt_64_descriptor
The 'value' field in structure idt_64_descriptor is no one used. We could remove it.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-04-26 10:48:49 +08:00
Li Fei1
907a0f7c04 hv: vioapic: minor refine about vioapic_init
Most code in the if ... else is duplicated. We could put it out of the
conditional statement.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-04-24 15:35:38 +08:00
Zide Chen
9150284ca7 hv: replace vcpu_affinity array with cpu_affinity_bitmap
Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping.
While the new cpu_affinity_bitmap doesn't explicitly sepcify this
mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU
with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and
so on.

This makes it possible for post-launched VM to run vCPUs on a subset of
these pCPUs only, and not all of them.

acrn-dm may launch post-launched VMs with the current approach: indicate
VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked
in cpu_affinity_bitmap.

Also acrn-dm can choose to launch the VM on a subset of PCPUs that is
defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the
subset of PCPUs in the CREATE_VM hypercall.

Additionally, with this change, a guest's vcpu_num can be easily calculated
from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c.

Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-23 09:38:54 +08:00
Victor Sun
9264f51456 HV: refine usage of idle=halt in sos cmdline
The parameter of "idle=halt" for SOS cmdline is only needed when cpu sharing
is enabled, otherwise it will impact SOS power.

Tracked-On: #4329

Signed-off-by: Victor Sun <victor.sun@intel.com>
2020-04-22 14:49:04 +08:00
Victor Sun
7282b933fb HV: merge sos_pci_dev config to sos macro
The pci_dev config settings of SOS are same so move the config interface
from vm_configurations.c to CONFIG_SOS_VM macro;

Tracked-On: #4616

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-17 13:45:18 +08:00
Victor Sun
55b50f408f HV: init vm uuid and severity in macro
Currently the vm uuid and severity is initilized separately in
vm_config struct, developer need to take care both items carefuly
otherwise hypervisor would have trouble with the configurations.

Given the vm loader_order/uuid and severity are binded tightly, the
patch merged these tree settings in one macro so that developer will
have a simple interface to configure in vm_config struct.

Tracked-On: #4616

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-17 13:45:18 +08:00
yuhong.tao@intel.com
7c80acee95 HV: emulate MSR_TEST_CTL
If CPU has MSR_TEST_CTL, show an emulaued one to VCPU

Tracked-On: #4496
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com
dd3fa8ed75 HV: enable #AC for Splitlock Access
If CPU support rise #AC for Splitlock Access, then enable this
feature at each CPU.

Tracked-On: #4496
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com
ea1bce0cbf HV: enumerate capability of #AC for Splitlock Access
When the destination of an atomic memory operation located in 2
cache lines, it is called a Splitlock Access. LOCK# bus signal is
asserted for splitlock access which may lead to long latency. #AC
for Splitlock Access is a CPU feature, it allows rise alignment
check exception #AC(0) instead of asserting LOCK#, that is helpful
to detect Splitlock Access.

This feature is enumerated by MSR(0xcf) IA32_CORE_CAPABILITIES[bit5]
Add helper function:
    bool has_core_cap(uint32_t bitmask)

Tracked-On: #4496
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-17 09:53:59 +08:00
Mingqiang Chi
f90100e382 hv: add pre-condition for vcpu APIs
remove unnecessary state check and
add pre-condition for vcpu APIs.

Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-16 21:59:03 +08:00
Jason Chen CJ
0584981c03 hv:add pre-condition for vm APIs
check the vm state in hypercall api,
add pre-condition for vm api.

Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-16 21:59:03 +08:00
Xiaoguang Wu
d4f789f47e hv: iommu: remove snoop related code
ACRN disables Snoop Control in VT-d DMAR engines for simplifing the
implementation. Also, since the snoop behavior of PCIE transactions
can be controlled by guest drivers, some devices may take the advantage
of the NO_SNOOP_ATTRIBUTE of PCIE transactions for better performance
when snoop is not needed. No matter ACRN enables or disables Snoop
Control, the DMA operations of passthrough devices behave correctly
from guests' point of view.

This patch is used to clean all the snoop related code.

Tracked-On: #4509
Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2020-04-16 08:40:17 +08:00
Conghui Chen
84ad340898 hv: fix for waag 2 core reboot issue
Waag will send NMIs to all its cores during reboot. But currently,
NMI cannot be injected to vcpu which is in HLT state.
To fix the problem, need to wakeup target vcpu, and inject NMI through
interrupt-window.

Tracked-On: #4620
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-15 14:42:00 +08:00
Zide Chen
6040d8f6a2 hv: fix SOS vapic_id assignment issue
Currently vlapic_build_id() uses vcpu_id to retrieve the lapic_id
per_cpu variable:

  vlapic_id = per_cpu(lapic_id, vcpu->vcpu_id);

SOS vcpu_id may not equal to pcpu_id, and in that case it runs into
problems. For example, if any pre-launched VMs are launched on PCPUs
whose IDs are smaller than any PCPU IDs that are used by SOS.

This patch fixes the issue and simplify the code to create or get
vapic_id by:

- assign vapic_id in create_vlapic(), which now takes pcpu_id as input
  argument, and save it in the new field: vlapic->vapic_id, which will
  never be changed.
- simplify vlapic_get_apicid() by returning te saved vapid_id directly.
- remove vlapic_build_id().
- vlapic_init() is only called once, merge it into vlapic_create().

Tracked-On: #4268
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-15 14:34:15 +08:00
dongshen
00ad3863a1 hv: maintain a per-pCPU array of vCPUs and handle posted interrupt IRQs
Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]),
one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property
and use the containing VM's vm_id as the index to the vCPU array:

 In create_vcpu(), we simply do:
   per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu;

 In offline_vcpu():
   per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL;

so basically we use the containing VM's vm_id as the index to the vCPU array,
as well as the index of posted interrupt IRQ/vector pair that are assigned
to this vCPU:
  0: first vCPU and first posted interrupt IRQs/vector pair
  (POSTED_INTR_IRQ/POSTED_INTR_VECTOR)
  ...
  CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair
  ((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U)

In the posted interrupt handler, it will do the following:
 Translate the IRQ into a zero based index of where the vCPU
 is located in the vCPU list for current pCPU. Once the
 vCPU is found, we wake up the waiting thread and record
 this request as ACRN_REQUEST_EVENT

Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-04-15 13:47:22 +08:00
dongshen
14fa9c563c hv: define posted interrupt IRQs/vectors
This is a preparation patch for adding support for VT-d PI
related vCPU scheduling.

ACRN does not support vCPU migration, one vCPU always runs on
the same pCPU, so PI's ndst is never changed after startup.

VCPUs of a VM won’t share same pCPU. So the maximum possible number
of VCPUs that can run on a pCPU is CONFIG_MAX_VM_NUM.

Allocate unique Activation Notification Vectors (ANV) for each vCPU
that belongs to the same pCPU, the ANVs need only be unique within each
pCPU, not across all vCPUs. This reduces # of pre-allocated ANVs for
posted interrupts to CONFIG_MAX_VM_NUM, and enables ACRN to avoid
switching between active and wake-up vector values in the posted
interrupt descriptor on vCPU scheduling state changes.

A total of CONFIG_MAX_VM_NUM consecutive IRQs/vectors are reserved
for posted interrupts use.

The code first initializes vcpu->arch.pid.control.bits.nv dynamically
(will be added in subsequent patch), the other code shall use
vcpu->arch.pid.control.bits.nv instead of the hard-coded notification vectors.

Rename some functions:
  apicv_post_intr --> apicv_trigger_pi_anv
  posted_intr_notification --> handle_pi_notification
  setup_posted_intr_notification --> setup_pi_notification

Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
2020-04-15 13:47:22 +08:00
dongshen
f7be985a23 hv: check if the IRQ is intended for a single destination vCPU
Given the vcpumask, check if the IRQ is single destination
and return the destination vCPU if so, the address of associated PI
descriptor for this vCPU can then be passed to dmar_assign_irte() to
set up the posted interrupt IRTE for this device.

For fixed mode interrupt delivery, all vCPUs listed in vcpumask should
service the interrupt requested. But VT-d PI cannot support multicast/broadcast
IRQs, it only supports single CPU destination. So the number of vCPUs
shall be 1 in order to handle IRQ in posted mode for this device.

Add pid_paddr to struct intr_source. If platform_caps.pi is true and
the IRQ is single-destination, pass the physical address of the destination
vCPU's PID to ptirq_build_physical_msi and dmar_assign_irte

Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
2020-04-15 13:47:22 +08:00
dongshen
6496da7c56 hv: add function to check if using posted interrupt is possible for vm
Add platform_caps.c to maintain platform related information

Set platform_caps.pi to true if all iommus are posted interrupt capable, false
otherwise

If lapic passthru is not configured and platform_caps.pi is true, the vm
may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is
single-destination

Tracked-On: #4506
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@Intel.com>
2020-04-15 13:47:22 +08:00
Jian Jun Chen
159c9ec759 hv: add lock for ept add/modify/del
EPT table can be changed concurrently by more than one vcpus.
This patch add a lock to protect the add/modify/delete operations
from different vcpus concurrently.

Tracked-On: #4253
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
2020-04-13 11:38:55 +08:00
Li Fei1
366214e567 hv: virq: refine pending event inject sequence
Inject pending exception prior pending interrupt to complete the previous instruction.

Tracked-On: #1842
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-09 09:40:00 +08:00
Sainath Grandhi
5958d6f65f hv: Fix issues with the patch to reserve EPT 4K pages after boot
This patch fixes couple of minor issues with patch 8ffe6fc6

Tracked-On: #4563
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
2020-04-03 11:06:14 +08:00
Yan, Like
2997c4b570 HV: CAT: support cache allocation for each vcpu
This commit allows hypervisor to allocate cache to vcpu by assigning different clos
to vcpus of a same VM.
For example, we could allocate different cache to housekeeping core and real-time core
of an RTVM in order to isolate the interference of housekeeping core via cache hierarchy.

Tracked-On: #4566
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Chen, Zide <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-02 13:55:35 +08:00
Sainath Grandhi
8ffe6fc67a hv: Reserve space for VMs' EPT 4k pages after boot
As ACRN prepares to support servers with large amounts of memory
current logic to allocate space for 4K pages of EPT at compile time
will increase the size of .bss section of ACRN binary.

Bootloaders could run into a situation where they cannot
find enough contiguous space to load ACRN binary under 4GB,
which is typically heavily fragmented with E820 types Reserved,
ACPI data, 32-bit PCI hole etc.

This patch does the following
1) Works only for "direct" mode of vboot
2) reserves space for 4K pages of EPT, after boot by parsing
platform E820 table, for all types of VMs.

Size comparison:

w/o patch
Size of DRAM            Size of .bss
48 GB                   0xe1bbc98 (~226 MB)
128 GB                  0x222abc98 (~548 MB)

w/ patch
Size of DRAM            Size of .bss
48 GB                   0x1991c98 (~26 MB)
128 GB                  0x1a81c98 (~28 MB)

Tracked-On: #4563
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-01 21:13:37 +08:00
Qian Wang
b55f414a9d HV: Removed unused member variable of iommu_domain and related code
hv: vtd: removed is_host (always false) and is_tt_ept (always true) member
variables of struct iommu_domain and related codes since the values are
always determined.

Tracked-On: #4535
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-01 10:43:54 +08:00