acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-07-12 23:00:35 +00:00

Author	SHA1	Message	Date
Binbin Wu	da1788c9a3	hv: vtd: add an API to reserve continuous irtes dmar_reserve_irte is added to reserve N continuous IRTEs. N could be 1, 2, 4, 8, 16, or 32. The reserved IRTEs will not be freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	7bfcc673a6	hv: ptirq: associate an irte with ptirq_remapping_info entry For a ptirq_remapping_info entry, when building an IRTE: - If the caller provides a valid IRTE, use that IRTE - If the caller doesn't provide a valid IRTE, allocate an IRTE when the entry doesn't have a valid one; in this case, the IRTE will be freed when the entry is freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	2fe4280cfa	hv: vtd: add two parameters for dmar_assign_irte idx_in: - If the caller of dmar_assign_irte passes a valid IRTE index, it will be reused; - If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index, the function will allocate a new IRTE. idx_out: This parameter returns the actual index of the IRTE used. The caller needs to check whether the return value is valid or not. Also this patch adds an internal function alloc_irte. The function takes count as an input parameter to allocate continuous IRTEs. The count can only be 1, 2, 4, 8, 16 or 32. This is prepared for multiple MSI vector support. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Li Fei1	65e4a16e6a	hv: mmu: release 1GB cpu side support constraint Some platforms still don't support 1GB large pages on the CPU side, such as Lakefield, TNT and EHL platforms, which have a silicon bug that leaves the CPU without 1GB large page support. This patch releases this constraint to support more hardware platforms. Note this patch doesn't release the constraint on the IOMMU side. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Binbin Wu	c907a820df	hv: config: add msix emulation support The information needed to enable MSI-x emulation. Only enable MSI-x emulation for the devices in msix_emul_devs array. Currently, only EHL has the need to enable MSI-x emulation for TSN devices. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-10 14:32:15 +08:00
Victor Sun	80262f0602	HV: rename append_seed_arg to fill_seed_arg Previously append_seed_arg() just do fill in seed arg to dest cmd buffer, so rename the api name to fill_seed_arg(). Since fill_seed_arg() will be called in SOS VM path only, the param of bool vm_is_sos is not needed and will be replaced by dest buffer size. The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero, so memset() is not needed in fill_seed_arg(), buffer pointer check and strncpy_s() are not needed also. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	47d20f37e1	HV: replace merge_cmdline api with strncat_s Add a standard string api strncat_s() to replace merge_cmdline() to make code more readable. Another change is that the multiboot cmdline will be appended to the end of configured SOS bootargs instead of the beginning; this enables some kernel cmdline parameter items to be overridden by the multiboot cmdline, since the later one wins if the same parameters are configured in the kernel cmdline. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Li Fei1	ae4fa40adc	hv: vpci: refine pci device assignment logic Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config. So for UOS, we always emulate Host Bridge and PCI Bridge for it and assign PCI device to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI Bridge to it directly, otherwise, we will emulate them same as UOS. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	b8f151a55f	hv: pci: check whether a PCI device is host bridge or not by class According PCI Code and ID Assignment Specification Revision 1.11, a PCI device whose Base Class is 06h and Sub-Class is 00h is a Host bridge. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Vijay Dhanraj	d03df0c7e2	HV: Fix MP Init sequence hang by adding a delay As per the BWG a delay should be provided between the INIT IPI and Startup IPI. Without the delay, hangs are observed on certain platforms during the MP Init sequence. So set a delay of 10us between the assert INIT IPI and Startup IPI. Also, as per SDM section 10.7, the de-assert INIT IPI is only used for Pentium and P6 processors. This is not applicable for Pentium4 and Xeon processors, so remove this sequence. Tracked-On: #4835 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 13:34:59 +08:00
Binbin Wu	3009d9399f	hv: vtd: cleanup snoop control related code Snoop control will not be turned on by hypervisor, delete snoop control related code. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Shuo A Liu	9a15ea82ee	hv: pause all other vCPUs in same VM when do wbinvd emulation Invalidating the cache by scanning and flushing the whole guest memory is inefficient and might cause a long execution time for WBINVD emulation. A long execution in the hypervisor might cause a vCPU stuck phenomenon that impacts Windows guest booting. This patch introduces a workaround that pauses all other vCPUs in the same VM when doing wbinvd emulation. Tracked-On: #4703 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 15:21:29 +08:00
Mingqiang Chi	f994b5ffaf	hv:cleanup vcpu state -- remove VCPU_PAUSED and resume_vcpu -- remove vcpu->prev_state in vcpu structure -- rename pause_vcpu to zombie_vcpu Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-05-21 15:08:49 +08:00
Yonghua Huang	3391bffb27	hv:fix rtvm hang with maxcpus=0/1 in bootargs RTVM (with lapic PT) hangs at boot when maxcpus is assigned a value less than the CPU number configured in hypervisor. In this case, vlapic_state (per VM) is left in TRANSITION state after BSP boot, which blocks interrupts from being injected to this UOS. Tracked-On: #4803 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-15 10:09:13 +08:00
Li Fei1	27a66acd0e	hv: ptdev: refine look up MSI ptirq entry There's no need to look up MSI ptirq entry by virtual SID any more since the MSI ptirq entry would be removed before the device is assigned to a VM. Now the logic of MSI interrupt remap could be simplified as: 1. Add the MSI interrupt remap first; 2. If step 1 is already done, just do the remap part. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	15e3062631	hv: vpci: remove is_own_device() Now we can tell a device's status from the 'user' field: if vdev->user is NULL, the device is de-initialized; if vdev->user == vdev, the device is used by its own VM; if vdev->user is neither NULL nor vdev, the device is assigned to another VM. So we don't need to modify the 'vpci' field accordingly. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Zide Chen	0a956c34c7	hv: add a new field cpu_affinity in struct acrn_vm For post-launched VMs, the configured CPU affinity could be different from the actual running CPU affinity. This new field acrn_vm->cpu_affinity recognizes this difference so that it's possible that CREATE_VM hypercall won't overwrite the configured CPU affinity. Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity. This is read-only in run time, never overwritten by acrn-dm. Remove vm_config->vcpu_num, which means the number of vCPUs of the configured CPU affinity. This is not to be confused with the actual running vCPU number: vm->hw.created_vcpus. Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less confusion. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 11:04:31 +08:00
Yan, Like	869ccb7ba8	HV: RDT: add CDP support in ACRN CDP is an extension of CAT. It enables isolation and separate prioritization of code and data fetches to the L2 or L3 cache in a software configurable manner, depending on hardware support. This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED". CDP will be enabled if the capability available and "CDP_ENABLED" is selected. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	277c668b04	HV: RDT: clean up RDT code This commit makes some RDT code cleanup, mainly including: - remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build; - rename platform_clos_num to valid_clos_num, which is set as the minimal clos_max of all enabled RDT resources; - init the platform_clos_array in the res_cap_info[] definition; - remove the unnecessary return values and return value check. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	f774ee1fba	HV: RDT: merge struct rdt_cache and rdt_membw into a union An RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw would be used at a time. They should be a union. This commit merges struct rdt_cache and struct rdt_membw into a union res. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Li Fei1	0c6b3e57d6	hv: ptdev: minor refine about ptirq_build_physical_msi The virtual MSI information could be included in the ptirq_remapping_info structure, so there's no need to pass another input parameter for this purpose. So we could remove the ptirq_msi_info input. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
Li Fei1	067b439e69	hv: irq: minor refine about structure idt_64_descriptor The 'value' field in structure idt_64_descriptor is no one used. We could remove it. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-26 10:48:49 +08:00
Li Fei1	907a0f7c04	hv: vioapic: minor refine about vioapic_init Most code in the if ... else is duplicated. We could put it out of the conditional statement. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-24 15:35:38 +08:00
Zide Chen	9150284ca7	hv: replace vcpu_affinity array with cpu_affinity_bitmap Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping. While the new cpu_affinity_bitmap doesn't explicitly specify this mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and so on. This makes it possible for post-launched VM to run vCPUs on a subset of these pCPUs only, and not all of them. acrn-dm may launch post-launched VMs with the current approach: indicate VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked in cpu_affinity_bitmap. Also acrn-dm can choose to launch the VM on a subset of PCPUs that is defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the subset of PCPUs in the CREATE_VM hypercall. Additionally, with this change, a guest's vcpu_num can be easily calculated from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-23 09:38:54 +08:00
Victor Sun	9264f51456	HV: refine usage of idle=halt in sos cmdline The parameter of "idle=halt" for SOS cmdline is only needed when cpu sharing is enabled, otherwise it will impact SOS power. Tracked-On: #4329 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-22 14:49:04 +08:00
Victor Sun	7282b933fb	HV: merge sos_pci_dev config to sos macro The pci_dev config settings of SOS are same so move the config interface from vm_configurations.c to CONFIG_SOS_VM macro; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	55b50f408f	HV: init vm uuid and severity in macro Currently the vm uuid and severity are initialized separately in vm_config struct, so developers need to take care of both items carefully, otherwise the hypervisor would have trouble with the configurations. Given the vm loader_order/uuid and severity are bound tightly, the patch merges these three settings in one macro so that developers have a simple interface to configure in vm_config struct. Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 13:45:18 +08:00
yuhong.tao@intel.com	7c80acee95	HV: emulate MSR_TEST_CTL If CPU has MSR_TEST_CTL, show an emulated one to VCPU Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	dd3fa8ed75	HV: enable #AC for Splitlock Access If the CPU supports raising #AC for Splitlock Access, then enable this feature on each CPU. Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	ea1bce0cbf	HV: enumerate capability of #AC for Splitlock Access When the destination of an atomic memory operation is located in 2 cache lines, it is called a Splitlock Access. The LOCK# bus signal is asserted for a splitlock access, which may lead to long latency. #AC for Splitlock Access is a CPU feature that allows raising the alignment check exception #AC(0) instead of asserting LOCK#, which is helpful for detecting Splitlock Access. This feature is enumerated by MSR(0xcf) IA32_CORE_CAPABILITIES[bit5] Add helper function: bool has_core_cap(uint32_t bitmask) Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
Mingqiang Chi	f90100e382	hv: add pre-condition for vcpu APIs remove unnecessary state check and add pre-condition for vcpu APIs. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Jason Chen CJ	0584981c03	hv:add pre-condition for vm APIs check the vm state in hypercall api, add pre-condition for vm api. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Xiaoguang Wu	d4f789f47e	hv: iommu: remove snoop related code ACRN disables Snoop Control in VT-d DMAR engines to simplify the implementation. Also, since the snoop behavior of PCIE transactions can be controlled by guest drivers, some devices may take advantage of the NO_SNOOP_ATTRIBUTE of PCIE transactions for better performance when snoop is not needed. Whether ACRN enables or disables Snoop Control, the DMA operations of passthrough devices behave correctly from the guests' point of view. This patch cleans up all the snoop related code. Tracked-On: #4509 Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-16 08:40:17 +08:00
Conghui Chen	84ad340898	hv: fix for waag 2 core reboot issue Waag will send NMIs to all its cores during reboot. But currently, NMI cannot be injected to vcpu which is in HLT state. To fix the problem, need to wakeup target vcpu, and inject NMI through interrupt-window. Tracked-On: #4620 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:42:00 +08:00
Zide Chen	6040d8f6a2	hv: fix SOS vapic_id assignment issue Currently vlapic_build_id() uses vcpu_id to retrieve the lapic_id per_cpu variable: vlapic_id = per_cpu(lapic_id, vcpu->vcpu_id); SOS vcpu_id may not be equal to pcpu_id, and in that case it runs into problems. For example, if any pre-launched VMs are launched on PCPUs whose IDs are smaller than any PCPU IDs that are used by SOS. This patch fixes the issue and simplifies the code to create or get vapic_id by: - assign vapic_id in create_vlapic(), which now takes pcpu_id as input argument, and save it in the new field: vlapic->vapic_id, which will never be changed. - simplify vlapic_get_apicid() by returning the saved vapic_id directly. - remove vlapic_build_id(). - vlapic_init() is only called once, merge it into vlapic_create(). Tracked-On: #4268 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:34:15 +08:00
dongshen	00ad3863a1	hv: maintain a per-pCPU array of vCPUs and handle posted interrupt IRQs Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]), one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property and use the containing VM's vm_id as the index to the vCPU array: In create_vcpu(), we simply do: per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu; In offline_vcpu(): per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL; so basically we use the containing VM's vm_id as the index to the vCPU array, as well as the index of posted interrupt IRQ/vector pair that are assigned to this vCPU: 0: first vCPU and first posted interrupt IRQs/vector pair (POSTED_INTR_IRQ/POSTED_INTR_VECTOR) ... CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair ((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U) In the posted interrupt handler, it will do the following: Translate the IRQ into a zero based index of where the vCPU is located in the vCPU list for current pCPU. Once the vCPU is found, we wake up the waiting thread and record this request as ACRN_REQUEST_EVENT Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-04-15 13:47:22 +08:00
dongshen	14fa9c563c	hv: define posted interrupt IRQs/vectors This is a preparation patch for adding support for VT-d PI related vCPU scheduling. ACRN does not support vCPU migration, one vCPU always runs on the same pCPU, so PI's ndst is never changed after startup. VCPUs of a VM won’t share same pCPU. So the maximum possible number of VCPUs that can run on a pCPU is CONFIG_MAX_VM_NUM. Allocate unique Activation Notification Vectors (ANV) for each vCPU that belongs to the same pCPU, the ANVs need only be unique within each pCPU, not across all vCPUs. This reduces # of pre-allocated ANVs for posted interrupts to CONFIG_MAX_VM_NUM, and enables ACRN to avoid switching between active and wake-up vector values in the posted interrupt descriptor on vCPU scheduling state changes. A total of CONFIG_MAX_VM_NUM consecutive IRQs/vectors are reserved for posted interrupts use. The code first initializes vcpu->arch.pid.control.bits.nv dynamically (will be added in subsequent patch), the other code shall use vcpu->arch.pid.control.bits.nv instead of the hard-coded notification vectors. Rename some functions: apicv_post_intr --> apicv_trigger_pi_anv posted_intr_notification --> handle_pi_notification setup_posted_intr_notification --> setup_pi_notification Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	f7be985a23	hv: check if the IRQ is intended for a single destination vCPU Given the vcpumask, check if the IRQ is single destination and return the destination vCPU if so, the address of associated PI descriptor for this vCPU can then be passed to dmar_assign_irte() to set up the posted interrupt IRTE for this device. For fixed mode interrupt delivery, all vCPUs listed in vcpumask should service the interrupt requested. But VT-d PI cannot support multicast/broadcast IRQs, it only supports single CPU destination. So the number of vCPUs shall be 1 in order to handle IRQ in posted mode for this device. Add pid_paddr to struct intr_source. If platform_caps.pi is true and the IRQ is single-destination, pass the physical address of the destination vCPU's PID to ptirq_build_physical_msi and dmar_assign_irte Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	6496da7c56	hv: add function to check if using posted interrupt is possible for vm Add platform_caps.c to maintain platform related information Set platform_caps.pi to true if all iommus are posted interrupt capable, false otherwise If lapic passthru is not configured and platform_caps.pi is true, the vm may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is single-destination Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
Jian Jun Chen	159c9ec759	hv: add lock for ept add/modify/del EPT table can be changed concurrently by more than one vcpus. This patch add a lock to protect the add/modify/delete operations from different vcpus concurrently. Tracked-On: #4253 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2020-04-13 11:38:55 +08:00
Li Fei1	366214e567	hv: virq: refine pending event inject sequence Inject pending exception prior to pending interrupt to complete the previous instruction. Tracked-On: #1842 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-09 09:40:00 +08:00
Sainath Grandhi	5958d6f65f	hv: Fix issues with the patch to reserve EPT 4K pages after boot This patch fixes couple of minor issues with patch `8ffe6fc6` Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-03 11:06:14 +08:00
Yan, Like	2997c4b570	HV: CAT: support cache allocation for each vcpu This commit allows hypervisor to allocate cache to vcpu by assigning different clos to vcpus of a same VM. For example, we could allocate different cache to housekeeping core and real-time core of an RTVM in order to isolate the interference of housekeeping core via cache hierarchy. Tracked-On: #4566 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Chen, Zide <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-02 13:55:35 +08:00
Sainath Grandhi	8ffe6fc67a	hv: Reserve space for VMs' EPT 4k pages after boot As ACRN prepares to support servers with large amounts of memory current logic to allocate space for 4K pages of EPT at compile time will increase the size of .bss section of ACRN binary. Bootloaders could run into a situation where they cannot find enough contiguous space to load ACRN binary under 4GB, which is typically heavily fragmented with E820 types Reserved, ACPI data, 32-bit PCI hole etc. This patch does the following 1) Works only for "direct" mode of vboot 2) reserves space for 4K pages of EPT, after boot by parsing platform E820 table, for all types of VMs. Size comparison: w/o patch Size of DRAM Size of .bss 48 GB 0xe1bbc98 (~226 MB) 128 GB 0x222abc98 (~548 MB) w/ patch Size of DRAM Size of .bss 48 GB 0x1991c98 (~26 MB) 128 GB 0x1a81c98 (~28 MB) Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 21:13:37 +08:00
Qian Wang	b55f414a9d	HV: Removed unused member variable of iommu_domain and related code hv: vtd: removed is_host (always false) and is_tt_ept (always true) member variables of struct iommu_domain and related codes since the values are always determined. Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Li Fei1	2b7168da9e	hv: vmtrr: remove vcpu structure pointer from vmtrr We could use container_of to get the vcpu structure pointer from vmtrr. So the vcpu structure pointer is not needed in the vmtrr structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	a7768fdb6a	hv: vlapic: remove vcpu/vm structure pointer from vlapic We could use container_of to get the vcpu/vm structure pointer from vlapic. So the vcpu/vm structure pointer is not needed in the vlapic structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
dongshen	1328dcb205	hv: extend union dmar_ir_entry to support VT-d posted interrupts Extend union dmar_ir_entry to support VT-d posted interrupts. Rename some fields of union dmar_ir_entry: entry --> value sw_bits --> avail Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	016c1a5073	hv: pass pointer to functions Pass intr_src and dmar_ir_entry irte as pointers to dmar_assign_irte(), which fixes the "Attempt to change parameter passed by value" MISRA C violation. A few coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	0f3c876a91	hv: extend struct pi_desc to support VT-d posted interrupts For CPU side posted interrupts, it only uses bit 0 (ON) of the PI's 64-bit control , other bits are don't care. This is not the case for VT-d posted interrupts, define more bit fields for the PI's 64-bit control. Use bitmap functions to manipulate the bit fields atomically. Some MISRA-C violation and coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	8f732f2809	hv: move pi_desc related code from vlapic.h/vlapic.c to vmx.h/vmx.c/vcpu.h The posted interrupt descriptor is more of a vmx/vmcs concept than a vlapic concept. struct acrn_vcpu_arch stores the vmx/vmcs info, so put struct pi_desc in struct acrn_vcpu_arch. Remove the function apicv_get_pir_desc_paddr() A few coding style/typo fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	b384d04ad1	hv: rename vlapic_pir_desc to pi_desc Rename struct vlapic_pir_desc to pi_desc Rename struct member and local variable pir_desc to pid pir=posted interrupt request, pi=posted interrupt pid=posted interrupt descriptor pir is part of pi descriptor, so it is better to use pi instead of pir struct pi_desc will be moved to vmx.h in subsequent commit. Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
Li Fei1	4512ef7ec9	hv: cpuid: remove cpuid() cpuid() can be replaced with cpuid_subleaf, which is clearer. Having both APIs makes reading difficult. Tracked-On: #4526 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-25 13:26:58 +08:00
Sainath Grandhi	fe5a108c7b	hv: vioapic init for SOS VM on platforms with multiple IO-APICs For SOS VM, when the target platform has multiple IO-APICs, there should be equal number of virtual IO-APICs. This patch adds support for emulating multiple vIOAPICs per VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	f67ac09141	hv: Handle holes in GSI i.e. Global System Interrupt for multiple IO-APICs MADT is used to specify the GSI base for each IO-APIC and the number of interrupt pins per IO-APIC is programmed into Max. Redir. Entry register of that IO-APIC. On platforms with multiple IO-APICs, there can be holes in the GSI space. For example, on a platform with 2 IO-APICs, the following configuration has a hole (from 24 to 31) in the GSI space. IO-APIC 1: GSI base - 0, number of pins - 24 IO-APIC 2: GSI base - 32, number of pins - 8 This patch also adjusts the size for variables used to represent the total number of IO-APICs on the system from uint16_t to uint8_t as the ACPI MADT uses only 8-bits to indicate the unique IO-APIC IDs. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	85217e362f	hv: Introduce Global System Interrupt (GSI) into INTx Remapping As ACRN prepares to support platforms with multiple IO-APICs, GSI is a better way to represent physical and virtual INTx interrupt source. 1) This patch replaces usage of "pin" with "gsi" wherever applicable across the modules. 2) PIC pin to gsi is trickier and needs to consider the usage of "Interrupt Source Override" structure in ACPI for the corresponding VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	06b59e0bc1	hv: Use ptirq_lookup_entry_by_sid to lookup virtual source id in IOAPIC irq entries Reverts `538ba08c`: hv:Add vpin to ptdev entry mapping for vpic/vioapic ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin and an array of size per VM to store ptirq entries against the vPIC pin. This is done to speed up "ptirq entry" lookup at runtime for Level triggered interrupts in API ptirq_intx_ack used on EOI. This patch switches the lookup API for INTx interrupts to the API, ptirq_lookup_entry_by_sid This could add delay to processing EOI for Level triggered interrupts. Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around 10 such arrays per VM. Removes the need of "pic_pin" except for the APIs facing the hypercalls hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Li Fei1	e5c7a96513	hv: vpci: sos could access low severity guest pci cfg space There're some cases the SOS (higher severity guest) needs to access the post-launched VM (lower severity guest) PCI CFG space: 1. The SR-IOV PF needs to reset the VF 2. Some pass through device still need DM to handle some quirk. In the case a device is assigned to a UOS and is not in a zombie state, the SOS is able to access, if and only if the SOS has higher severity than the UOS. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-20 10:08:43 +08:00
Mingqiang Chi	14692ef60c	hv:Rename two VM states Rename: VM_STARTED --> VM_RUNNING VM_POWERING_OFF --> VM_READY_TO_POWEROFF Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-13 10:34:29 +08:00
Victor Sun	a68f655a11	HV: update ept address range for pre-launched VM For a pre-launched VM, a region from PTDEV_HI_MMIO_START is used to store 64bit vBARs of PT devices whose addresses are higher than 4G. The region should be located after all user memory space and be covered by guest EPT address. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	e74553492a	HV: move create_sos_vm_e820 to ve820.c ve820.c is a common file in arch/x86/guest/ now, so move function of create_sos_vm_e820() to this file to make code structure clear; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	d7eac3fe6a	HV: decouple prelaunch VM ve820 from board configs hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched VM specific e820 entries according to memory configuration of customer. It should be a scenario based configuration but we had to put it in the per board folder because of different board memory settings. This brings concerns to customers on configuration organization. Currently the file provides the same e820 layout for all pre-launched VMs, but they should have different e820 when their memory is configured differently. Although we have acrn-config tool to generate ve820.c automatically, it is not friendly to modify hardcoded ve820 layout manually, so the patch changes the entries initialization method by calculating each entry item in C code. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	4c0965d89e	HV: correct ept page array usage Currently ept_pages_info[] is initialized with first element only that force VM of id 0 using SOS EPT pages. This is incorrect for logical partition and hybrid scenario. Considering SOS_RAM_SIZE and UOS_RAM_SIZE are configured separately, we should use different ept pages accordingly. So, the PRE_VM_NUM/SOS_VM_NUM and MAX_POST_VM_NUM macros are introduced to resolve this issue. The macros would be generated by acrn-config tool when user configure ACRN for their specific scenario. One more thing, that when UOS_RAM_SIZE is less then 2GB, the EPT address range should be (4G + PLATFORM_HI_MMIO_SIZE). Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Mingqiang Chi	790614e952	hv:rename several variables and api for ioapic rename: ioapic_get_gsi_irq_addr --> gsi_to_ioapic_base ioapic_addr -->ioapic_base Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-03-11 13:26:15 +08:00
Yuan Liu	696f6c7ba4	hv: the VM can only deinit its own devices VM needs to check if it owns this device before deiniting it. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Sainath Grandhi	460e7ee5b1	hv: Variable/macro renaming for intr handling of PT devices using IO-APIC/PIC 1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can be IOAPIC or PIC 2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt controller 3. Changes the type of src member of source_id.intx_id from uint32_t to enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC Tracked-On: #4447 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-03-06 11:29:02 +08:00
Zide Chen	49ffe168af	hv: fixup relocation delta for symbols belong to entry section This is to enable relocation for code32. - RIP relative addressing is available in x86-64 only so we manually add relocation delta to the target symbols to fixup code32. - both code32 and code64 need to load GDT hence both need to fixup GDT pointer. This patch declares separate GDT pointer cpu_primary64_gdt_ptr for code64 to avoid double fixup. - manually fixup cpu_primary64_gdt_ptr in code64, but not rely on relocate() to do that. Otherwise it's very confusing that symbols from same file could be fixed up externally by relocate() or self-relocated. - to make it clear, define a new symbol ld_entry_end representing the end of the boot code that needs manually fixup, and use this symbol in relocate() to filter out all symbols belong to the entry sections. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Vijay Dhanraj	92ee33b035	HV: Add MBA support in ACRN This patch adds RDT MBA support to detect, configure and set up MBA throttle registers based on VM configuration. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Yuan Liu	320ed6c238	hv: refine init_one_dev_config The init_one_dev_config is used to initialize a acrn_vm_pci_dev_config SRIOV needs a explicit acrn_vm_pci_dev_config to create a VF vdev,so refine it to return acrn_vm_pci_dev_config. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Conghui Chen	595cefe3f2	hv: xsave: move assembler to individual function Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors assembler to individual functions. Tracked-On: #4436 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 17:55:06 +08:00
Yuan Liu	5e989f13c6	hv: check if there is enough room for all SRIOV VFs. Make the SRIOV-Capable device invisible from SOS if there is no room for its all virtual functions. v2: fix a issue that if a PF has been dropped, the subsequent PF will be dropped too even there is room for its VFs. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Conghui Chen	c246d1c9b8	hv: xsave: bugfix for init value The init value for XCR0 and XSS should be the same with spec: In SDM Vol1 13.3: XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is always 1. The other bits in XCR0 are all 0 coming out of RESET. The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET. The previous code try to fix the xsave area leak to other VMs during init phase, but bring the error to linux. Besides, it cannot avoid the possible leak in running phase. Need find a better solution. Tracked-On: #4430 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 09:19:29 +08:00
Vijay Dhanraj	eaad91fd71	HV: Remove RDT code if CONFIG_RDT_ENABLED flag is not set This patch does the following, 1. Removes RDT code if CONFIG_RDT_ENABLED flag is not set. 2. Set the CONFIG_RDT_ENABLED flag only on platforms that support RDT so that build scripts will automatically reflect the config. Tracked-On: #3715 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	d0665fe220	HV: Generalize RDT infrastructure and fix RDT cache configuration. This patch creates a generic infrastructure for RDT resources instead of just L2 or L3 cache. This patch also fixes L3 CAT config overwrite by L2 in cases where both L2 and L3 CAT are supported. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	887e3813bc	HV: Add both HW and SW checks for RDT support There can be times when the user unknowingly enables the CONFIG_CAT_ENABLED SW flag, but the hardware might not support L3 or L2 CAT. In such a case software can end up writing to the CAT MSRs, which can cause undefined results. The patch fixes the issue by enabling CAT only when both HW and software (via CONFIG_CAT_ENABLED) support CAT. The patch also addresses a typo in the "clos2prq_msr" function name. It should be "clos2pqr_msr" instead. PQR stands for platform qos register. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	b8a021d658	HV: split L2 and L3 cache resource MSR Upcoming intel platforms can support both L2 and L3 but our current code only supports either L2 or L3 CAT. So split the MSRs so that we can support allocation for both L2 and L3. This patch does the following, 1. splits programming of L2 and L3 cache resource based on the resource ID. 2. Replace generic platform_clos_array struct with resource specific struct in all the existing board.c files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	2597429903	HV: Rename cat.c/.h files to rdt.c/.h As part of rdt cat refactoring, goal is to combine all rdt specific features such as CAT under one module. So renaming rdt resource specific files such as cat.c/.h to generic rdt.c/.h files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Victor Sun	da3d181f62	HV: init efi info with multiboot2 Initialize efi info of acrn mbi when boot from multiboot2 protocol, with this patch hypervisor could get host efi info and pass it to Linux zeropage, then make guest Linux possible to boot with efi environment; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	520a0222d3	HV: re-arch boot component header The patch re-arch boot component header files by: - moving multiboot.h from include/arch/x86/ to boot/include/ and keep this header for multiboot1 protocol data struct only; - moving multiboot related MACROs in cpu_primary.S to multiboot.h; - creating an independent boot.h to store acrn specific boot information for other files' reference; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Conghui Chen	a7563cb9bd	hv: sched_bvt: add BVT scheduler BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU. It has the concept of virtual time, vCPU with earliest virtual time is dispatched first. Main concepts: tick timer: a periodic tick is used to measure the physical time in units of MCU (minimum charging unit). runqueue: thread in the runqueue is ordered by virtual time. weight: each thread receives a share of the pCPU in proportion to its weight. context switch allowance: the physical time by which the current thread is allowed to advance beyond the next runnable thread. warp: a thread with warp enabled will have a chance to subtract a value (Wi) from virtual time to achieve higher priority. virtual time: AVT: actual virtual time, advance in proportional to weight. EVT: effective virtual time. EVT <- AVT - ( warp ? Wi : 0 ) SVT: scheduler virtual time, the minimum AVT in the runqueue. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Yonghua Huang	64b874ce4c	hv: rename BOOT_CPU_ID to BSP_CPU_ID 1. Rename BOOT_CPU_ID to BSP_CPU_ID 2. Replace hardcoded value with BSP_CPU_ID when ID of BSP is referenced. Tracked-On: #4420 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-02-25 09:08:14 +08:00
Li Fei1	e8479f84cd	hv: vPCI: remove passthrough PCI device unuse code Now we split passthrough PCI device from DM to HV, we could remove all the passthrough PCI device unused code. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 16:17:38 +08:00
Junming Liu	1303861d26	hv:enable gpu iommu except APL platforms To enable gvt-d,need to allow the GPU IOMMU. While gvt-d hasn't been enabled on APL yet, so let APL disable GPU IOMMU. v2 -> v3: * let APL platforms disable GPU IOMMU. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com>	2020-02-24 11:47:10 +08:00
Zide Chen	cc6f094926	hv: CAT is supposed to be enabled in the system level In platforms that support CAT, when it is enabled by ACRN, i.e. IA32_resourceType_MASK_n registers are programmed with customized values, it has impacts to the whole system. The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be enabled in some guests, but not in others who don't have this flag, which is conceptually incorrect. This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[] defines a set of system-wide Class of Service (COS, or CLOS), and the per guest vm_configs[].clos associates the guest with particular CLOS. Tracked-On: #2462 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-17 08:51:59 +08:00
Yonghua Huang	fd4775d044	hv: rename VECTOR_XXX and XXX_IRQ Macros 1. Align the coding style for these MACROs 2. Align the values of fixed VECTORs Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Yonghua Huang	b90862921e	hv: rename the ACRN_DBG_XXX Refine this MACRO 'ACRN_DBG_XXX' to 'DBG_LEVEL_XXX' Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Shuo A Liu	db708fc3e8	hv: rename is_completion_polling to is_polling_ioreq is_polling_ioreq is more straightforward. Rename it. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Li Fei1	65f3751ea3	hv: pci: add hide pci devices configuration for apl-up2 Other Platforms are not added for now. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Shuo A Liu	a8f6bdd479	hv: Add vlapic_has_pending_intr of apicv to check pending interrupts Sometimes HV wants to know if there are pending interrupts of one vcpu. Add .has_pending_intr interface in acrn_apicv_ops and return the pending interrupts status by check IRRs of apicv. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	e3c303363b	hv: vcpu: wait and signal vcpu event support Introduce two kinds of events for each vcpu, VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events vcpu can wait for such events, and resume to run when the event get signalled. This patch also change IO request waiting/notifying to this way. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Li Fei1	26670d7ab3	hv: vpci: revert do FLR and BAR restore Since we restore BAR values when writing Command Register if necessary. We don't need to trap FLR and do the BAR restore then. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Victor Sun	c6f7803f06	HV: restore lapic state and apic id upon INIT Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving INIT. The local APIC ID register should also be preserved. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	ab13228591	HV: ensure valid vcpu state transition The vcpu state machine transition should follow the rules below (old vcpu state --- api --> new vcpu state): VCPU_OFFLINE --- create_vcpu --> VCPU_INIT; VCPU_INIT --- launch_vcpu --> VCPU_RUNNING; VCPU_RUNNING --- pause_vcpu --> VCPU_PAUSED; VCPU_PAUSED --- resume_vcpu --> VCPU_RUNNING; VCPU_RUNNING/PAUSED --- pause_vcpu --> VCPU_ZOMBIE; VCPU_INIT --- pause_vcpu --> VCPU_ZOMBIE; VCPU_ZOMBIE --- reset_vcpu --> VCPU_INIT; VCPU_ZOMBIE --- offline_vcpu --> VCPU_OFFLINE. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	a5158e2c16	HV: refine reset_vcpu api The patch abstract a vcpu_reset_internal() api for internal usage, the function would not touch any vcpu state transition and just do vcpu reset processing. It will be called by create_vcpu() and reset_vcpu(). The reset_vcpu() will act as a public api and should be called only when vcpu receive INIT or vm reset/resume from S3. It should not be called when do shutdown_vm() or hcall_sos_offline_cpu(), so the patch remove reset_vcpu() in shutdown_vm() and hcall_sos_offline_cpu(). The patch also introduced reset_mode enum so that vcpu and vlapic could do different context operation according to different reset mode; Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	9ecac8629a	HV: clean up redundant macro in lapic.h Some MACROs in lapic.h are duplicated with apicreg.h, and some MACROs are never referenced, remove them. Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Yin Fengwei	e5117bf19a	vm: add severity for vm_config Add severity definitions for different scenarios. The static guest severity is defined according to guest configurations. Also add sanity check to make sure the severity for all guests are correct. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	bfa19e9104	pm: S5: update the system shutdown logical in ACRN For system S5, ACRN had assumption that SOS shutdown will trigger system shutdown. So the system shutdown logical is: 1. Trap SOS shutdown 2. Wait for all other guest shutdown 3. Shutdown system The new logical is refined as: If all guest is shutdown, shutdown whole system Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Li Fei1	a90e0f6c84	hv: vpci: restore PCI BARs when doing PCIe FLR ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Device Capabilities Register write operation if the device supports PCI Express Capability and check whether it wants to do device FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Kaige Fu	5f9d1379bc	HV: Remove INIT signal notification related code We don't use INIT signal notification method now. This patch removes them. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	a13909cedc	HV: Use NMI-window exiting to address req missing issue There is a window where we may miss the current request in the notification period when the work flow is as the following: CPUr handles its pending requests; CPUx then sets the req flag and sends an NMI to CPUr; CPUr handles the NMI and then enters the vCPU. So, this patch enables the NMI-window exiting to trigger the next vmexit once there is no "virtual-NMI blocking" after vCPU enter into VMX non-root mode. Then we can process the pending request on time. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	40ba7e8686	HV: Don't make NMI injection req when notifying vCPU The NMI for notification should not be injected into the guest. So, this patch drops the NMI injection request when we use NMI to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well and there is no well-designed way to check if the NMI is for notification or for guest now. So, we take all the NMIs as notification NMI for hard rtvm temporarily. It means that the hard rtvm will never receive NMI with this patch applied. TODO: vNMI support is not ready yet. we will add it later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Shiqing Gao	3cee259583	hv: msr: remove redundant check in write_pat_msr Reserved bits in a 8-bit PAT field has been checked in pat_mem_type_invalid. Remove this redundant check "(PAT_FIELD_RSV_BITS & field) != 0UL" in write_pat_msr. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-16 14:32:42 +08:00
Kaige Fu	2777f23075	HV: Add helper function send_single_nmi This patch adds a helper function send_single_nmi. The first caller will soon come with the following patch. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	525d4d3cd0	HV: Install a NMI handler in acrn IDT This patch installs a NMI handler in acrn IDT to handle NMIs out of dispatch_exception. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Victor Sun	a44c1c900c	HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig In current architecture, the maximum vCPUs number per VM could not exceed the pCPUs number. Given the MAX_PCPU_NUM macro is provided in board configurations, so remove the MAX_VCPUS_PER_VM from Kconfig and add a macro of MAX_VCPUS_PER_VM to reference MAX_PCPU_NUM directly. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Victor Sun	ea3476d22d	HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM rename the macro since MAX_PCPU_NUM could be parsed from board file and it is not a configurable item anymore. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Mingqiang Chi	b6bffd01ff	hv:remove 2 unused variables in vm_arch structure remove 'guest_init_pml4' and 'tmp_pg_array' in vm_arch since they are not used. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-12-12 10:13:11 +08:00
Vijay Dhanraj	c8a4ca6c78	HV: Extend non-contiguous HPA for hybrid scenario This patch extends non-contiguous HPA allocations for pre-launched VMs in hybrid scenario. Tracked-On: #4217 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 10:12:46 +08:00
Shuo A Liu	ed4008630d	hv: sched_iorr: Add IO sensitive Round-robin scheduler IO sensitive Round-robin scheduler aim to schedule threads with round-robin policy. Meanwhile, we also enhance it with some fairness configuration, such as thread will be scheduled out without properly timeslice. IO request on thread will be handled in high priority. This patch only add a skeleton for the sched_iorr scheduler. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Vijay Dhanraj	6e8b413689	HV: Add support to assign non-contiguous HPA regions for pre-launched VM On some platforms, HPA regions for Virtual Machine can not be contiguous because of E820 reserved type or PCI hole. In such cases, pre-launched VMs need to be assigned non-contiguous memory regions and this patch addresses it. To keep things simple, current design has the following assumptions, 1. HPA2 always will be placed after HPA1 2. HPA1 and HPA2 don’t share a single ve820 entry. (Create multiple entries if needed but not shared) 3. Only support 2 non-contiguous HPA regions (can extend at a later point for multiple non-contiguous HPA) Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4195 Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:28:38 +08:00
Conghui Chen	d48da2af3a	hv: bugfix for debug commands with smp_call With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the smp_call handler should switch the vmcs to the target vcpu's vmcs. Then get the info. dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise, there will be #GP error. Renaming: vcpu_dumpreg -> dump_vcpu_reg switch_vmcs -> load_vmcs Tracked-On: #4178 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Binbin Wu	3d412266bc	hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC Deterministic is important for RTVM. The mitigation for MCE on Page Size Change converts a large page to 4KB pages runtimely during the vmexit triggered by the instruction fetch in the large page. These vmexits increase nondeterminacy, which should be avoided for RTVM. This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	192859ee02	hv: ept: apply MCE on page size change mitigation conditionally Only apply the software workaround on the models that might be affected by MCE on page size change. For these models that are known immune to the issue, the mitigation is turned off. Atom processors are not affected by the issue. Also check the CPUID & MSR to check whether the model is immune to the issue: CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1. Other cases not listed above, CPU may be vulnerable. This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U since the MSR is 64bit. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Shuo A Liu	3cb32bb6e3	hv: make init_vmcs as a event of VCPU After changing init_vmcs to smp call approach and do it before launch_vcpu, it could work with noop scheduler. On real sharing scheduler, it has problem. pcpu0 pcpu1 pcpu1 vmBvcpu0 vmAvcpu1 vmBvcpu1 vmentry init_vmcs(vmBvcpu1) vmexit->do_init_vmcs corrupt current vmcs vmentry fail launch_vcpu(vmBvcpu1) This patch mark a event flag when request vmcs init for specific vcpu. When it is running and checking pending events, will do init_vmcs firstly. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:43 +08:00
Conghui Chen	e61412981d	hv: support xsave in context switch xsave area: legacy region: 512 bytes xsave header: 64 bytes extended region: < 3k bytes So, pre-allocate 4k area for xsave. Use certain instruction to save or restore the area according to hardware xsave feature set. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Li Fei1	6ee076f7df	hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry does. ptirq_msix_remap only did the preparation. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-29 08:53:07 +08:00
Alexander Merritt	ea131eea41	HV: add DRHD index to pci_pdev We add new member pci_pdev.drhd_idx associating the DRHD (IOMMU) with this pdev, and a method to convert a pbdf of a device to this index by searching the pdev list. Partial patch: drhd_index initialization handled in subsequent patch. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Mingqiang Chi	bd0dbd274d	hv:add dump_guest_mem add shell command to support dump guest memory e.g. dump_guest_mem vm_id, gva, length Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-26 10:58:19 +08:00
Sainath Grandhi	22a1bd6948	hv: Fix the definition of struct representing interrupt hw frame In 64-bit mode, processor pushes SS and RSP onto stack unconditionally. Also when dumping the exception info, it makes more sense to dump the RSP at the point of interrupt, rather than the RSP after pushing context (including GPRs) Tracked-On: #4102 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 16:06:35 +08:00
Binbin Wu	fa3888c12a	hv: ept: disable execute right on large pages Issue description: ----------------- Machine Check Error on Page Size Change Instruction fetch may cause machine check error if page size and memory type was changed without invalidation on some processors[1][2]. Malicious guest kernel could trigger this issue. This issue applies to both primary page table and extended page tables (EPT), however the primary page table is controlled by hypervisor only. This patch mitigates the situation in EPT. Mitigation details: ------------------ Implement non-execute huge pages in EPT. This patch series clears the execute permission (bit 2) in the EPT entries for large pages. When EPT violation is triggered by guest instruction fetch, hypervisor converts the large page to smaller 4 KB pages and restore the execute permission, and then re-execute the guest instruction. The current patch turns on the mitigation by default. The follow-up patches will conditionally turn on/off the feature per processor model. [1] Refer to erratum KBL002 in "7th Generation Intel Processor Family and 8th Generation Intel Processor Family for U Quad Core Platforms Specification Update" https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf [2] Refer to erratum SKL002 in "6th Generation Intel Processor Family Specification Update" https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 08:00:36 +08:00
Victor Sun	3411f00b5b	HV: fix misra violation on platform clos array MISRA C requires specified bounds for arrays declaration, previous declaration of platform_clos_array in board.h does not meet the requirement. Tracked-On: #3987 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	9e92f3cdf5	HV: move dmar info definition to board.c The DMAR info is board specific so move the structure definition to board.c. As a configuration file, the whole board.c could be generated by acrn-config tool for each board. Please note we only provide DMAR info MACROs for nuc7i7dnb board. For other boards, ACPI_PARSE_ENABLED must be set to y in Kconfig to let hypervisor parse DMAR info, or use acrn-config tool to generate DMAR info MACROs if user won't enable ACPI parse code for FuSa consideration. The patch also moves the function of get_dmar_info() to vtd.c, so dmar_info.c could be removed. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	589be88cf6	HV: link CONFIG_MAX_IOMMU_NUM and MAX_DRHDS to DRHD_COUNT The values of CONFIG_MAX_IOMMU_NUM and MAX_DRHDS are identical to DRHD_COUNT, which is defined in the platform ACPI table, so remove CONFIG_MAX_IOMMU_NUM from Kconfig and link these three MACROs together. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Li Fei1	620a1c5215	hv: mmu: rename e820 to hv_e820 Now the e820 structure stores ACRN HV memory layout, not the physical memory layout. Rename e820 to hv_e820 to show this explicitly. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Li, Fei1	9d26dab6d6	hv: mmio: add a lock to protect mmio_node access After adding PCI BAR remap support, an mmio_node may be unregistered while others are accessing it. This patch adds a lock to protect mmio_node access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	2c158d5ad4	hv: io: add unregister_mmio_emulation_handler API Since guest could re-program PCI device MSI-X table BAR, we should add mmio emulation handler unregister. However, after add unregister_mmio_emulation_handler API, emul_mmio_regions is no longer accurate. Just replace it with max_emul_mmio_regions which records the max index of the emul_mmio_node. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-29 14:49:55 +08:00
Li, Fei1	dc1e2adaec	hv: vpci: add PCI BAR re-program address check In theory, guest could re-program PCI BAR address to any address. However, ACRN hypervisor only support [0, top_address_space) EPT memory mapping. So we need to check whether the PCI BAR re-program address is within this scope. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-29 14:49:55 +08:00
Shuo A Liu	5f8e7a6cb7	hv: sched: add kick_thread to support notification kick means to notify one thread_object. If the target thread object is running, send a IPI to notify it; if the target thread object is runnable, make reschedule on it. Also add kick_vcpu API in vcpu layer to notify vcpu. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	f04c491259	hv: sched: decouple scheduler from schedule framework This patch decouples some scheduling logic and abstracts it into a scheduler. Then we have a scheduler and a schedule framework. From a modularization perspective, the schedule framework provides some APIs for other layers to use, and also interacts with the scheduler through scheduler interfaces. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Yonghua Huang	2e62ad9574	hv[v2]: remove registration of default port IO and MMIO handlers - The default behaviors of PIO & MMIO handlers are the same for all VMs, so there is no need to expose dedicated APIs to register default handlers for SOS and prelaunched VM. Tracked-On: #3904 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2019-10-24 13:21:19 +08:00
Mingqiang Chi	d81872ba18	hv:Change the function parameter for init_ept_mem_ops Currently the parameter of init_ept_mem_ops is 'struct acrn_vm vm' for this api,change it to 'struct memory_ops mem_ops' and 'vm_id' to avoid the reversed dependency, page.c is hardware layer and vm structure is its upper-layer stuff. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:48:30 +08:00
Shuo A Liu	dadcdcefa0	hv: sched: support vcpu context switch on one pcpu To support cpu sharing, multiple vcpu can run on same pcpu. We need do necessary vcpu context switch. This patch add below actions in context switch. 1) fxsave/fxrstor; 2) save/restore MSRs: MSR_IA32_STAR, MSR_IA32_LSTAR, MSR_IA32_FMASK, MSR_IA32_KERNEL_GS_BASE; 3) switch vmcs. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	7e66c0d4fa	hv: sched: use get_running_vcpu to replace per_cpu vcpu with cpu sharing With cpu sharing enabled, per_cpu vcpu cannot work properly as we might has multiple vcpus running on one pcpu. Add a schedule API sched_get_current to get current thread_object on specific pcpu, also add a vcpu API get_running_vcpu to get corresponding vcpu of the thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	891e46453d	hv: sched: move pcpu_id from acrn_vcpu to thread_object With cpu sharing enabled, we will map acrn_vcpu to thread_object in scheduling. From modulization perspective, we'd better hide the pcpu_id in acrn_vcpu and move it to thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Jian Jun Chen	1d194ede61	hv: support reference time enlightenment Two time related synthetic MSRs are implemented in this patch. Both of them are partition wide MSR. - HV_X64_MSR_TIME_REF_COUNT is read only and it is used to return the partition's reference counter value in 100ns units. - HV_X64_MSR_REFERENCE_TSC is used to set/get the reference TSC page, a sequence number, an offset and a multiplier are defined in this page by hypervisor and guest OS can use them to calculate the normalized reference time since partition creation, in 100ns units. Tracked-On: #3831 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
wenwumax	048155d3d6	hv: support minimum set of TLFS This patch implements the minimum set of TLFS functionality. It includes 6 vCPUID leaves and 3 vMSRs. - 0x40000001 Hypervisor Vendor-Neutral Interface Identification - 0x40000002 Hypervisor System Identity - 0x40000003 Hypervisor Feature Identification - 0x40000004 Implementation Recommendations - 0x40000005 Hypervisor Implementation Limits - 0x40000006 Implementation Hardware Features - HV_X64_MSR_GUEST_OS_ID Reporting the guest OS identity - HV_X64_MSR_HYPERCALL Establishing the hypercall interface - HV_X64_MSR_VP_INDEX Retrieve the vCPU ID from hypervisor Tracked-On: #3832 Signed-off-by: wenwumax <wenwux.ma@intel.com> Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
Mingqiang Chi	292d1a15f9	hv:Wrap some APIs related with guest pm -- change some APIs to static -- combine two APIs to init_guest_pm Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-21 10:13:02 +08:00
Shuo A Liu	837e4d8788	hv: sched: rename schedule related structs and vars prepare_switch_out -> switch_out prepare_switch_in -> switch_in prepare_switch -> do_switch run_thread_t -> thread_entry_t sched_object -> thread_object sched_object.thread -> thread_object.thread_entry sched_obj -> thread_obj sched_context -> sched_control sched_ctx -> sched_ctl Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 10:25:53 +08:00
Binbin Wu	d19592a33e	hv: vmsr: disable prmrr related msrs in vm PRMRR related MSRs need to be configured by platform BIOS / bootloader. These settings are not allowed to be changed by guest. VMs currently have no requirement to access these MSRs even when vSGX is enabled. So, this patch disables PRMRR related MSRs in VM. Tracked-On: #3739 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-10-15 15:13:11 +08:00
Mingqiang Chi	de0a5a48d6	hv:remove some unnecessary includes --remove unnecessary includes --remove unnecessary forward-declaration for 'struct vhm_request' Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-15 14:40:39 +08:00
Shuo A Liu	1c526e6d16	hv: use vcpu_affinity[] in vm_config to support vcpu assignment Add this vcpu_affinity[] for each VM to indicate the assignment policy. With it, pcpu_bitmap is not needed, so remove it from vm_config. Instead, vcpu_affinity is a must for each VM. This patch also add some sanitize check of vcpu_affinity[]. Here are some rules: 1) only one bit can be set for each vcpu_affinity of vcpu. 2) two vcpus in same VM cannot be set with same vcpu_affinity. 3) vcpu_affinity cannot be set to the pcpu which used by pre-launched VM. v4: config SDC with CONFIG_MAX_KATA_VM_NUM v5: config SDC with CONFIG_MAX_PCPU_NUM Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	ca2540fe8c	hv: return pre-defined vcpu_num from HV to upper layer There is plan that define each VM configuration statically in HV and let DM just do VM creating and destroying. So DM need get vcpu_num information when VM creating. This patch return the vcpu_num via the API param. And also initial the VMs' cpu_num for existing scenarios. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	d588703976	hv: Add a helper to account bitmap weight Sometimes we need know the number of 1 in one bitmap. This patch provide a inline function bitmap_weight for it. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
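The bit-counting helper described in the entry above can be sketched as follows. This is a minimal illustration only, not the code from the commit: the name `bitmap_weight_sketch` and the return type are assumptions, and the loop uses the common clear-lowest-set-bit technique rather than whatever the patch actually implements.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of a "bitmap weight" helper: returns the number of
 * bits set to 1 in a 64-bit bitmap. Each iteration clears the lowest set
 * bit, so the loop runs once per set bit.
 */
static inline uint16_t bitmap_weight_sketch(uint64_t bits)
{
	uint16_t count = 0U;

	while (bits != 0UL) {
		bits &= (bits - 1UL);	/* clear the lowest set bit */
		count++;
	}
	return count;
}
```

A caller could use such a helper, for example, to count how many pcpus are selected in an affinity bitmap.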
Mingqiang Chi	489937f7b8	hv:check pcpu numbers during init_pcpu_pre it will panic if phys_cpu_num > CONFIG_MAX_PCPU_NUM during init_pcpu_pre,after that no need to check it again. Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-24 09:02:05 +08:00
Qi Yadong	3ebeecf060	hv: save/restore TSC in host's suspend/resume path TSC would be reset to 0 when enter suspend state on some platform. This will fail the secure timer checking in secure world because secure world leverage the TSC as source of secure timer which should be increased monotonously. This patch save/restore TSC in host suspend/resume path to guarantee the mono increasing TSC. Note: There should no timer setup before TSC resumed. Tracked-On: #3697 Signed-off-by: Qi Yadong <yadong.qi@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-19 13:50:50 +08:00
Mingqiang Chi	60adef33d3	hv:move down structures run_context and ext_context Now the structures(run_context & ext_context) are defined in vcpu.h,and they are used in the lower-layer modules(wakeup.S), this patch move down the structures from vcpu.h to cpu.h to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Mingqiang Chi	4f98cb03a7	hv:move down the structure intr_source Now the structures(union source & struct intr_source) are defined in ptdev.h,they are used in vtd.c and assign.c, vtd is the hardware layer and ptdev is the upper-layer module from the modularization perspective, this patch move down these structures to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Shuo A Liu	4742d1c747	hv: ptdev: move softirq_dev_entry_list from vm structure to per_cpu region Using per_cpu list to record ptdev interrupts is more reasonable than recording them per-vm. It makes dispatching such interrupts more easier as we now do it in softirq which happens following interrupt context of each pcpu. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 09:36:52 +08:00
Shuo A Liu	2cc45534d6	hv: move pcpu offline request and vm shutdown request from schedule From modulization perspective, it's not suitable to put pcpu and vm related request operations in schedule. So move them to pcpu and vm module respectively. Also change need_offline return value to bool. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-16 09:36:52 +08:00
Yin Fengwei	6b6aa80600	hv: pm: fix coding style issue This patch fix the coding style issue introduced by previous two patches. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-09-11 17:30:24 +08:00
Yin Fengwei	f039d75998	hv: pm: enhancement of platform S5 entering operation Until now, we have assumed that SOS controls whether the platform should enter S5 or not. So when SOS tries to enter S5, we just forward the S5 request to the native port, which makes sure platform S5 is totally aligned with SOS S5. With higher severity guests introduced, this assumption is no longer true. We need to extend the platform S5 process to handle higher severity guests: - For DM launched RTVM, we need to make sure these guests are off before putting the whole platform into S5. - For pre-launched VM, there are two cases: * if the os running in it supports S5, we wait for the guest to be off. * if the os running in it doesn't support S5, we expect it to invoke one hypercall to notify HV to shut it down. NOTE: this case is not supported yet. Will add it in the future. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-09-11 17:30:24 +08:00
Mingqiang Chi	c691c5bd3c	hv:add volatile keyword for some variables pcpu_active_bitmap was read continuously in wait_pcpus_offline(), acrn_vcpu->running was read continuously in pause_vcpu(), add volatile keyword to ensure that such accesses are not optimised away by the compiler. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-10 11:26:35 +08:00
Mingqiang Chi	cd40980d5f	hv:change function parameter for invept change the input parameter from vcpu to eptp in order to let this api more generic, no need to care normal world or secure world. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-09-05 16:32:30 +08:00
Binbin Wu	cd1ae7a89e	hv: cat: isolate hypervisor from rtvm Currently, the clos id of the cpu cores in vmx root mode is the same as non-root mode. For RTVM, if hypervisor share the same clos id with non-root mode, the cacheline may be polluted due to the hypervisor code execution when vmexit. The patch adds hv_clos in vm_configurations.c Hypervisor initializes clos setting according to hv_clos during physical cpu cores initialization. For RTVM, MSR auto load/store areas are used to switch different settings for VMX root/non-root mode for RTVM. Tracked-On: #2462 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:59:13 +08:00
Mingqiang Chi	38ca8db19f	hv:tiny cleanup -- remove some unnecessary includes -- fix a typo -- remove unnecessary void before launch_vms Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-05 09:58:47 +08:00
Yan, Like	3f84acda09	hv: add "invariant TSC" cap detection ACRN HV is designed/implemented with "invariant TSC" capability, which wasn't checked at boot time. This commit adds the "invariant TSC" detection; ACRN fails to boot if there is no "invariant TSC" capability. Tracked-On: #3636 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:58:16 +08:00
dongshen	295701cc55	hv: remove mptable code for pre-launched VMs Now that ACPI is enabled for pre-launched VMs, we can remove all mptable code. Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
dongshen	b447ce3d86	hv: add ACPI support for pre-launched VMs Statically define the per vm RSDP/XSDT/MADT ACPI template tables in vacpi.c, RSDP/XSDT tables are copied to guest physical memory after checksum is calculated. For MADT table, first fix up processor id/lapic id in its lapic subtable, then the MADT table's checksum is calculated before it is copied to guest physical memory. Add 8-bit checksum function in util.h Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
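The 8-bit checksum mentioned in the entry above follows the usual ACPI convention: all bytes of a table, including the checksum byte itself, must sum to zero modulo 256. A minimal sketch is given below; the function name `checksum8_sketch` is an assumption for illustration and is not taken from util.h, and the caller is assumed to zero the checksum field before computing it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of an 8-bit ACPI-style checksum.
 * Returns the byte that makes the sum of all bytes in the buffer equal to
 * zero modulo 256, assuming the checksum field inside the buffer is
 * currently zero.
 */
static inline uint8_t checksum8_sketch(const void *buf, size_t len)
{
	const uint8_t *p = (const uint8_t *)buf;
	uint8_t sum = 0U;

	for (size_t i = 0U; i < len; i++) {
		sum = (uint8_t)(sum + p[i]);
	}
	/* Two's complement of the running sum. */
	return (uint8_t)(0U - sum);
}
```

After any fix-up of table contents (such as the lapic id patching described above), the checksum has to be recomputed, which is why the entry calculates it last.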
Binbin Wu	5c81659713	hv: ept: flush cache for modified ept entries EPT tables are shared by MMU and IOMMU. Some IOMMUs don't support page-walk coherency, the cpu cache of EPT entries should be flushed to memory after modifications, so that the modifications are visible to the IOMMUs. This patch adds a new interface to flush the cache of modified EPT entries. There are different implementations for EPT/PPT entries: - For PPT, there is no need to flush the cpu cache after update. - For EPT, need to call iommu_flush_cache to make the modifications visible to IOMMUs. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Binbin Wu	2abd8b34ef	hv: vtd: export iommu_flush_cache VT-d shares the EPT tables as the second level translation tables. For the IOMMUs that don't support page-walk coherency, cpu cache should be flushed for the IOMMU EPT entries that are modified. For the current implementation, EPT tables for translating from GPA to HPA for EPT/IOMMU are not modified after VM is created, so cpu cache invalidation is done once per VM before starting execution of VM. However, this may change, since runtime EPT modification is possible. When the cpu cache of EPT entries is invalidated on modification, there is no need to invalidate the cpu cache globally per VM. This patch exports iommu_flush_cache for EPT entry cache invalidation operations. - IOMMUs share the same copy of EPT table, cpu cache should be flushed if any of the active IOMMUs doesn't support page-walk coherency. - In the context of ACRN, GPA to HPA mapping relationship is not changed after VM created, skip flushing iotlb to avoid potential performance penalty. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Mingqiang Chi	2310d99ebf	hv: cleanup vmcs.h -- move 'RFLAGS_AC' to cpu.h -- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST' -- move 'get_vcpu_mode' to vcpu.h -- remove deadcode 'vmx_eoi_exit()' Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-22 14:13:15 +08:00
Mingqiang Chi	bd09f471a6	hv:move some APIs related host reset to pm.c move some data structures and APIs related host reset from vm_reset.c to pm.c, these are not related with guest. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-22 14:09:18 +08:00
Yin Fengwei	6beb34c3cb	vm_load: update init gdt preparation Now, we use native gdt saved in boot context for guest and assume it could be put to same address of guest. But it may not be true after the pre-launched VM is introduced. The gdt for guest could be overwritten by guest images. This patch make 32bit protect mode boot not use saved boot context. Insteadly, we use predefined vcpu_regs value for protect guest to initialize the guest bsp registers and copy pre-defined gdt table to a safe place of guest memory to avoid gdt table overwritten by guest images. Tracked-On: #3532 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-20 09:22:20 +08:00
huihuang.shi	f147c388a5	hv: fix Violations touched ACRN Coding Guidelines fix violations touched below: 1.Cast operation on a constant value 2.signed/unsigned implicit conversion 3.return value unused. V1->V2: 1.bitmap api will return boolean type, no need to check "!= 0", deleted. 2.The behaviors of ~(uint32_t)X and (uint32_t)~X are not defined in ACRN hypervisor Coding Guidelines, removed the change of it. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2019-08-15 09:47:11 +08:00
Shiqing Gao	062fe19800	hv: move vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c This patch moves vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c, so that these two functions would become internal functions inside vmsr.c. This approach improves the modularity. v1 -> v2: * remove 'vmx_rdmsr_pat' * rename 'vmx_wrmsr_pat' with 'write_pat_msr' Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-14 10:51:35 +08:00
Li, Fei1	4c8e60f1d0	hv: vpci: add each vdev_ops for each emulated PCI device Add a field (vdev_ops) in struct acrn_vm_pci_dev_config to configure a PCI CFG operation for an emulated PCI device. Use pci_pt_dev_ops for PCI_DEV_TYPE_PTDEV by default if there's no such configure. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-09 14:19:49 +08:00
Li, Fei1	ff54fa2325	hv: vpci: add emulated PCI device configure for SOS Add emulated PCI device configure for SOS to prepare for add support for customizing special pci operations for each emulated PCI device. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-08-09 14:19:49 +08:00
Li, Fei1	eb21f205e4	hv: vm_config: build pci device configure for SOS Align SOS pci device configure with pre-launched VM and filter pre-launched VM's PCI PT device from SOS pci device configure. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-08-06 11:51:02 +08:00
Li, Fei1	adbaaaf6cb	hv: vpci: rename ptdev_config to pci_dev_config pci_dev_config in VM configure stores all the PCI devices for a VM. Besides PT devices, there're other type devices, like virtual host bridge. So rename ptdev to pci_dev for these configure. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-06 11:51:02 +08:00
Victor Sun	363daf6aa2	HV: return extended info in vCPUID leaf 0x40000001 In some cases, the guest needs to get more information under a virtual environment, like guest capabilities. Basically this could be done by hypercalls, but hypercalls are designed for trusted VM/SOS VM. We need a mechanism to report this information for normal VMs. In this patch, vCPUID leaf 0x40000001 will be used to satisfy this need by reporting some extended information to the guest via CPUID. Tracked-On: #3498 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Zhao Yakui <yakui.zhao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-31 14:13:39 +08:00
Victor Sun	555a03db99	HV: add board specific cpu state table to support Px Cx Currently the Px Cx supported SoCs which listed in cpu_state_tbl.c is limited, and it is not a wise option to build a huge state table data base to support Px/Cx for other SoCs. This patch give a alternative solution that build a board specific cpu state table in board.c which could be auto-generated by offline tool, then the CPU Px/Cx of customer board could be enabled; Hypervisor will search the cpu state table in cpu_state_tbl[] first, if not found then go check board_cpu_state_tbl. If no matched cpu state table is found then Px/Cx will not be supported; Tracked-On: #3477 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-07-29 20:25:16 +08:00
Li, Fei1	4a27d08360	hv: schedule: schedule to idle after SOS resume from S3 After "commit `f0e1c5e` init vcpu host stack when reset vcpu", SOS resume from S3 wants to schedule to vcpu_thread, not to the point where SOS entered S3. So we should schedule to idle first, then reschedule to execute vcpu_thread. Tracked-On: #3387 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-29 09:53:18 +08:00
Zhao Yakui	baf7d90fdf	HV: Refine the usage of monitor/mwait to avoid the possible lockup Based on SDM Vol2 the monitor uses the RAX register to setup the address monitored by HW. The mwait uses the rax/rcx as the hints that the process will enter. It is incorrect that the same value is used for monitor/mwait. The ecx in mwait specifies the optional extensions. At the same time it needs to check whether the value of monitored addr is already expected before entering mwait. Otherwise it will have possible lockup. V1->V2: Add the asm wrapper of monitor/mwait to avoid the mixed usage of inline assembly in wait_sync_change V2->V3: Remove the unnecessary line break in asm_monitor/asm_mwait. Follow Fei's comment to remove the mwait ecx hint setting that treats the interrupt as break event. It only needs to check whether the value of psync_change is already expected. Tracked-On: #3442 Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-26 10:55:58 +08:00
Li, Fei1	11cf9a4a8a	hv: mmu: add hpa2hva_early API for early boot When hpa and hva translation is needed before init_paging, we need hpa2hva_early and hva2hpa_early since init_paging may modify hva2hpa to no longer be an identity mapping. Tracked-On: #2987 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-26 09:10:06 +08:00
Yin, Fengwei	11e67f1c4a	softirq: move softirq from hv_main to interrupt context softirq shouldn't be bound to the vcpu thread. One issue with this is that the shell (based on timer) can't work if we don't start any guest. This change also tries its best to make the softirq handler run with irq enabled. Also update the irq disable/enable in vmexit handler to align with the usage in vcpu_thread. Tracked-On: #3387 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-22 09:55:06 +08:00
Yan, Like	97f6097f04	hv: add ops to vlapic structure This commit adds ops to vlapic structure, and add an *ops parameter to vlapic_reset(). At vlapic reset, the ops is set to the global apicv_ops, and may be assigned to other ops later. Tracked-On: #3227 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-19 16:47:06 +08:00
Victor Sun	600aa8ea5a	HV: change param type of init_pcpu_pre When initialize secondary pcpu, pass INVALID_CPU_ID as param of init_pcpu_pre() looks weird, so change the param type to bool to represent whether the pcpu is a BSP or AP. Tracked-On: #3420 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-17 13:48:00 +08:00
Li, Fei1	e352553e1a	hv: atomic: remove atomic load/store and set/clear In x86 architecture, word/doubleword/quadword aligned read/write on its boundary is atomic, so we may remove atomic load/store. As for atomic set/clear, using bitmap_set/clear seems more reasonable. After replacing them all, we could remove them too. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-17 09:20:54 +08:00
Li, Fei1	b39526f759	hv: schedule: vCPU schedule state setting don't need to be atomic vCPU schedule state change is under schedule lock protection. So there's no need to be atomic. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-17 09:20:54 +08:00
Huihuang Shi	2ec1694901	HV: fix sbuf "Casting operation to a pointer" ACRN Coding guidelines requires two different types pointer can't convert to each other, except void *. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 13:57:21 +08:00
Huihuang Shi	9063504bc8	HV: ve820 fix "Casting operation to a pointer" ACRN Coding guidelines requires two different types pointer can't convert to each other, except void *. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-07-11 13:57:21 +08:00
Huihuang Shi	714162fb8b	HV: fix violations touched type conversion ACRN Coding guidelines require that type conversion shall be explicit. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 09:16:09 +08:00
Li, Fei1	5d6c9c33ca	hv: vlapic: clear up where needs atomic operation in vLAPIC In almost case, vLAPIC will only be accessed by the related vCPU. There's no synchronization issue in this case. However, other vCPUs could deliver interrupts to the current vCPU, in this case, the IRR (for APICv base situation) or PIR (for APICv advanced situation) and TMR for both cases could be accessed by more than one vCPUS simultaneously. So operations on IRR or PIR should be atomical and visible to other vCPUs immediately. In another case, vLAPIC could be accessed by another vCPU when create vCPU or reset vCPU which could be supposed to be consequently. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-11 09:15:47 +08:00
Li, Fei1	5930e96d12	hv: io_req: refine vhm_req status setting In spite of vhm_req status could be updated in HV and DM on different CPUs, they only change vhm_req status when they detect vhm_req status has been updated by each other. So vhm_req status will not been misconfigured. However, before HV sets vhm_req status to REQ_STATE_PENDING, vhm_req buffer filling should be visible to DM. Add a write memory barrier to guarantee this. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-11 09:15:47 +08:00
Yonghua Huang	1ea3052f80	HV: check security mitigation support for SSBD Hypervisor exposes mitigation technique for Speculative Store Bypass(SSB) to guests and allows a guest to determine whether to enable SSBD mitigation by providing direct guest access to IA32_SPEC_CTRL. Before that, hypervisor should check the SSB mitigation support on underlying processor, this patch is to add this capability check. Tracked-On: #3385 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-07-10 10:55:34 +08:00
Shuo A Liu	4129b72b2e	hv: remove unnecessary cancel_event_injection related stuff cancel_event_injection is not needed any more if we do 'schedule' prior to acrn_handle_pending_request. Commit "921288a6672: hv: fix interrupt lost when do acrn_handle_pending_request twice" brings 'schedule' forward, so remove cancel_event_injection related stuff. Tracked-On: #3374 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-07-09 09:23:12 +08:00
Huihuang Shi	9a7043e83f	HV: remove instr_emul.c dead code ACRN Coding guidelines requires no dead code. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-07-09 09:22:53 +08:00
Yonghua Huang	3164f3976a	hv: Mitigation for CPU MDS vulnerabilities. Microarchitectural Data Sampling (MDS) is a hardware vulnerability which allows unprivileged speculative access to data which is available in various CPU internal buffers. 1. Mitigation on ACRN: 1) Microcode update is required. 2) Clear CPU internal buffers (store buffer, load buffer and load port) if current CPU is affected by MDS, when VM entry to avoid any information leakage to guest thru above buffers. 3) Mitigation is not needed if ARCH_CAP_MDS_NO bit (bit5) is set in IA32_ARCH_CAPABILITIES MSR (10AH), in this case, current processor is not affected by MDS vulnerability, in other cases mitigation for MDS is required. 2. Methods to clear CPU buffers (microcode update is required): 1) L1D cache flush 2) VERW instruction Either of above operations will trigger clearing all CPU internal buffers if this CPU is affected by MDS. Above mechanism is enumerated by: CPUID.(EAX=7H, ECX=0):EDX[MD_CLEAR=10]. 3. Mitigation details on ACRN: if (processor is affected by MDS) if (processor is not affected by L1TF OR L1D flush is not launched on VM Entry) execute VERW instruction when VM entry. endif endif 4. Reference: Deep Dive: Intel Analysis of Microarchitectural Data Sampling https://software.intel.com/security-software-guidance/insights/deep-dive-intel-analysis-microarchitectural-data-sampling Deep Dive: CPUID Enumeration and Architectural MSRs https://software.intel.com/security-software-guidance/insights/deep-dive-cpuid-enumeration-and-architectural-msrs Tracked-On: #3317 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>	2019-07-05 15:17:27 +08:00
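The nested condition in item 3 of the entry above reduces to a single boolean expression. The sketch below only restates that pseudocode in C; the function and parameter names are hypothetical, and how the three inputs are detected (CPUID and MSR reads) is deliberately left out.

```c
#include <stdbool.h>

/*
 * Hypothetical restatement of the commit message's decision logic:
 * VERW is executed on VM entry only when the processor is affected by MDS
 * and an L1D flush is not already clearing the buffers, i.e. when the
 * processor is not affected by L1TF or no L1D flush runs on VM entry.
 */
static inline bool need_verw_on_vmentry_sketch(bool affected_by_mds,
					       bool affected_by_l1tf,
					       bool l1d_flush_on_vmentry)
{
	return affected_by_mds &&
	       (!affected_by_l1tf || !l1d_flush_on_vmentry);
}
```

The inner condition reflects item 2 of the entry: either an L1D flush or VERW clears the buffers, so VERW is skipped only when the L1D flush is already performed.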
Yonghua Huang	076a30b555	hv: refine security capability detection function. ACRN hypervisor always print CPU microcode update warning message on KBL NUC platform, even after BIOS was updated to the latest. 'check_cpu_security_cap()' returns false if no ARCH_CAPABILITIES MSR support on current platform, but this MSR may not be available on some platforms. This patch is to remove this pre-condition. Tracked-On: #3317 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>	2019-07-05 15:17:27 +08:00
Li, Fei1	09a63560f4	hv: vm_manage: minor fix about triple_fault_shutdown_vm The current implement will trigger shutdown vm request on the BSP VCPU on the VM, not the VCPU will trap out because triple fault. However, if the BSP VCPU on the VM is handling another IO emulation, it may overwrite the triple fault IO request on the vhm_request_buffer in function acrn_insert_request. The atomic operation of get_vhm_req_state can't guarantee the vhm_request_buffer will not access by another IO request if it is not running on the corresponding VCPU. So it should trigger triple fault shutdown VM IO request on the VCPU which trap out because of triple fault exception. Besides, rt_vm_pm1a_io_write will do the right thing which we shouldn't do it in triple_fault_shutdown_vm. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-03 17:44:45 +08:00
Binbin Wu	4a22801dd1	hv: ept: mask EPT leaf entry bit 52 to bit 63 in gpa2hpa According to SDM, bit N (physical address width) to bit 63 should be masked when calculating the host page frame number. Currently, the hypervisor doesn't set any of these bits, so gpa2hpa works as expected. However, if any of these bits is set, gpa2hpa returns a wrong value. The hypervisor never sets bit N to bit 51 (reserved bits), so for simplicity, just mask bit 52 to bit 63. Tracked-On: #3352 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-07-03 09:39:41 +08:00
dongshen	0247c0b942	Hv: minor cosmetic fix Define/Use variable in place of code to improve readability: Define new local variable struct pci_bar *vbar, and use vbar-> in place of vdev->bar[idx]. Define new local variable uint64_t vbar_base in init_vdev_pt Rename uint64_t vbar[PCI_BAR_COUNT] of struct acrn_vm_pci_ptdev_config to uint64_t vbar_base[PCI_BAR_COUNT] Tracked-On: #3241 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-01 09:57:05 +08:00
Huihuang Shi	198e01716a	HV:fix vcpu violations vcpu was never scanned because the scan tool would crash. After modularization, vcpu can be scanned by the scan tool. Clean up the violations in vcpu.c. Fix the violations: 1.No brackets to then/else. 2.Function return value not checked. 3.Signed/unsigned conversion without cast. V1->V2: change the type of "vcpu->arch.irq_window_enabled" to bool. V2->V3: add "void *" prefix on the 1st parameter of memset. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-28 13:28:26 +08:00
Li, Fei1	e793b5d091	hv: vlapic: remove ISR vector stack The current implementation caches each ISR vector in the ISR vector stack and checks the ISR vector stack when updating PPR. However, there is no need to do this because: 1) We will not touch vlapic->isrvec_stk[0] except when doing vlapic_reset: So we don't need to check vlapic->isrvec_stk[0]. 2) We only deliver higher priority interrupts from IRR to ISR: So we don't need to check whether the vlapic->isrvec_stk interrupts are always increasing. 3) There are only 15 different interrupt priorities, so it cannot happen that more than 15 interrupts are delivered to ISR: So we don't need to check whether vlapic->isrvec_stk_top becomes larger than ISRVEC_STK_SIZE, which is 16. This patch removes the ISR vector stack and uses isrv to cache the vector number for the highest priority bit that is set in the ISR. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-06-27 15:27:37 +08:00
Sainath Grandhi	0a8bf6cee4	hv: Avoid run-time buffer overflows with IOAPIC data structures Remove couple of run-time ASSERTs in ioapic module by checking for the number of interrupt pins per IO-APICs against the configured MAX_IOAPIC_LINES in the initialization flow. Also remove the need for two MACROs specifying the max. number of interrupt lines per IO-APIC and add a config item MAX_IOAPIC_LINES for the same. Tracked-On: #3299 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-24 11:41:10 +08:00
Yuan Liu	ea699af861	HV: Add has_rt_vm API The has_rt_vm walk through all VMs to check RT VM flag and if there is no any RT VM, then return false otherwise return true. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	7018a13cb6	HV: Add ept_flush_leaf_page API The ept_flush_leaf_page API is used to flush address space from a ept page entry, user can use it to match walk_ept_mr to flush VM address space. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	f320130d58	HV: Add walk_ept_table and get_ept_entry APIs The walk_ept_table API is used to walk through the EPT table to get all present pages; the user can get each page entry and its size from the walk_ept_table callback. The get_ept_entry API is used to get the EPT pointer of the VM: if the current context of the VM is secure world, return the secure world EPT pointer, otherwise return the normal world EPT pointer. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	f81585eb3d	HV: Add flush_address_space API. flush_address_space is used to flush address space by clflushopt instruction. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	6fd397e82b	HV: Add CLFLUSHOPT instruction. CLFLUSHOPT is used to invalidate from every level of the cache hierarchy in the cache coherence domain the cache line that contains the linear address specified with memory operand. If that cache line contains modified data at any level of the cache hierarchy, that data is written back to memory. If the platform does not support CLFLUSHOPT instruction, boot will fail. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Li, Fei1	0e046c7a0a	hv: vlapic: clear which access type we support for APIC-Access VM Exit The current implement doesn't clear which access type we support for APIC-Access VM Exit: 1) linear access for an instruction fetch -- APIC-access page is mapped as UC which doesn't support fetch 2) linear access (read or write) during event delivery -- Which is not happened in normal case except the guest went wrong, such as, set the IDT table in APIC-access page. In this case, we don't need to support. 3) guest-physical access during event delivery; guest-physical access for an instruction fetch or during instruction execution -- Do we plan to support enable APIC in real mode ? I don't think so. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-06-20 08:53:25 +08:00
Li, Fei1	9960ff98c5	hv: ept: unify EPT API name to verb-object style Rename ept_mr_add to ept_add_mr Rename ept_mr_modify to ept_modify_mr Rename ept_mr_del to ept_del_mr Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-06-14 14:40:25 +08:00
Sainath Grandhi	7d44cd5c28	hv: Introduce check_vm_vlapic_state API This patch introduces check_vm_vlapic_state API instead of is_lapic_pt_enabled to check if all the vCPUs of a VM are using x2APIC mode and LAPIC pass-through is enabled on all of them. When the VM is in VM_VLAPIC_TRANSITION or VM_VLAPIC_DISABLED state, following conditions apply. 1) For pass-thru MSI interrupts, interrupt source is not programmed. 2) For DM emulated device MSI interrupts, interrupt is not delivered. 3) For IPIs, it will work only if the sender and destination are both in x2APIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	f3627d4839	hv: Add update_vm_vlapic_state API to sync the VM vLAPIC state This patch introduces vLAPIC state for a VM. The VM vLAPIC state can be one of the following * VM_VLAPIC_X2APIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this VM use x2APIC mode * VM_VLAPIC_XAPIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this VM use xAPIC mode * VM_VLAPIC_DISABLED - All the vCPUs/vLAPICs of this VM are in Disabled mode * VM_VLAPIC_TRANSITION - Some of the vCPUs/vLAPICs of this VM (Except for those in Disabled mode) are in xAPIC and the others in x2APIC Upon a vCPU updating the IA32_APIC_BASE MSR to switch LAPIC mode, this API is called to sync the vLAPIC state of the VM. Upon VM creation and reset, vLAPIC state is set to VM_VLAPIC_XAPIC, as ACRN starts the vCPUs vLAPIC in XAPIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	a3fdc7a496	hv: Add is_xapic_enabled API to check vLAPIC mode is_xapic_enabled API returns true if vLAPIC is in xAPIC mode. In all other cases, it returns false. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	7cb71a317e	hv: Make is_x2apic_enabled API visible across source code Remove static and inline attributes to the API is_x2apic_enabled and declare a prototype in vlapic.h. Also fix the check performed on guest APICBASE_MSR value to query vLAPIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Victor Sun	f83ddd393f	HV: introduce relative vm id for hcall api On SDC scenario, SOS VM id is fixed to 0 so some hypercalls from guest are using hardcoded "0" to represent SOS VM, this would bring issues for HYBRID scenario which SOS VM id is non-zero. Now introducing a new VM id concept for DM/VHM hypercall APIs, that return a relative VM id which is from SOS view when create VM for post- launched VMs. DM/VHM could always treat their own vm id is "0". When they make hypercalls, hypervisor will convert the VM id to the absolute id when dispatch the hypercalls. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2019-06-12 11:00:40 +08:00
Yin Fengwei	6b7233446f	xsave: inject GP when guest tries to write 1 to XCR0 reserved bit According to SDM vol1 13.3: Write 1 to reserved bit of XCR0 will trigger GP. This patch make ACRN behavior align with SDM definition. Tracked-On: #3239 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-12 08:28:53 +08:00
Victor Sun	31aa37d349	HV: remove unused INVALID_VM_ID The VM IDs which is high or equal then CONFIG_MAX_VM_NUM are all invalid VM IDs, the MACRO has never been referenced in code, so remove it; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	50e09c41b4	HV: remove cpu_num from vm configurations The vcpu num could be calculated based on pcpu_bitmap when prepare_vcpu() is done, so remove this redundant configuration item; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	1906def29e	HV: enable load zephyr kernel Zephyr kernel is stripped ram image, its entry and load address are explicitly defined in vm configurations, hypervisor will load Zephyr directly based on these configurations. Currently we only support boot Zephyr from protected mode. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	ea7ca8595c	HV: use tag to specify multiboot module Previously multiboot mods[0] is designed for kernel module for all pre-launched VMs including SOS VM, and mods[0].mm_string is used to store kernel cmdline. This design could not satisfy the requirement of hybrid mode scenarios that each VM might use their own kernel image also ramdisk image. To resolve this problem, we will use a tag in mods mm_string field to specify the module type. If the tag could be matched with os_config of VM configurations, the corresponding module would be loaded; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	bb55489e5c	HV: make vm kernel type configurable Different kernel has different load method, it should be configurable in vm configurations; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	0f00a4b0da	HV: refine sw_linux struct The guest OS of ACRN will not be limited to Linux, so refine the struct of sw_linux to more generic sw_module_info. Currently bootargs and ramdisk are only supported modules but we can include more modules in future; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Binbin Wu	7a915dc397	hv: vmsr: present sgx related msr to guest Present SGX related MSRs to guest if SGX is supported. - MSR_IA32_SGXLEPUBKEYHASH0 ~ MSR_IA32_SGXLEPUBKEYHASH3: SGX Launch Control is not supported, so these MSRs are read only. - MSR_IA32_SGX_SVN_STATUS: read only - MSR_IA32_FEATURE_CONTROL: If SGX is support in VM, opt-in SGX in this MSR. - MSR_SGXOWNEREPOCH0 ~ MSR_SGXOWNEREPOCH1: The two MSRs' scope is package level, not allow guest to change them. Still leave them in unsupported_msrs array. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00
Binbin Wu	1724996bc5	hv: vcpuid: present sgx capabilities to guest If sgx is supported in guest, present SGX capabilities to guest. There will be only one EPC section presented to guest, even if EPC memory for a guest is from multiple physical EPC sections. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00
Binbin Wu	c078f90d77	hv: vm_config: add epc info in vm config Add EPC information in vm configuration structure. EPC information contains the EPC base and size allocated to a VM. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00
Binbin Wu	245a732055	hv: sgx: add basic support to init sgx resource for vm Get the platform EPC resource and partition the EPC resource for VMs according to VM configurations. Don't support sgx capability in SOS VM. init_sgx is called during platform bsp initialization. If init_sgx() fails, consider it as configuration error, panic the system. init_sgx() fails if one of the following happens when at least one VM requests EPC resource if no enough EPC resource for all VMs. No further check if sgx is not supported by platform or not opted-in in BIOS, just disable SGX support for VMs. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00
Vijay Dhanraj	517707dee4	DM/HV: Increase VM name len VM Name length is restricted to 32 characters. kata creates a VM name with GUID added as a part of VM name making it around 80 characters. So increasing this size to 128. v1->v2: It turns out that MAX_VM_OS_NAME_LEN usage in DM and HV are for different use cases. So removing the macro from acrn_common.h. Defined macro MAX_VMNAME_LEN for DM purposes in dm.h. Retaining original macro name MAX_VM_OS_NAME_LEN for HV purposes but defined in vm_config.h. Tracked-On: #3138 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-27 12:13:51 +08:00
Victor Sun	f2fe35472b	HV: remove mptable in vm_config Define a static mptable array and each VM could index its vmptable by vm id, then mptable is not needed in vm configurations; Tracked-On: #2291 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-27 12:13:37 +08:00
Zide Chen	bfc08c2812	hv: move msr_bitmap from acrn_vm to acrn_vcpu_arch At the time the guest is switching to X2APIC mode, different VCPUs in the same VM could expect the setting of the per VM msr_bitmap differently, which could cause problems. Considering different approaches to address this issue: 1) only guest BSP can update msr_bitmap when switching to X2APIC. 2) carefully re-write the update_msr_bitmap_x2apic_xxx() functions to make sure any bit in the bitmap won't be toggled by the VCPUs. 3) make msr_bitmap as per VCPU. We chose option 3) because it's simple and clean, though it takes more memory than other options. BTW, need to remove const modifier from update_msr_bitmap_x2apic_xxx() functions to get it compiled. Tracked-On: #3166 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-24 11:37:13 +08:00
Zide Chen	5a23f7b664	hv: initial host reset implementation - add the GUEST_FLAG_HIGHEST_SEVERITY flag to indicate that the guest has privilege to reboot the host system. - this flag is statically assigned to guest(s) in vm_configurations.c in different scenarios. - implement reset_host() function to reset the host. First try the ACPI reset register if available, then try the 0xcf9 PIO. Tracked-On: #3145 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-23 18:24:17 +08:00
Sainath Grandhi	536c69b9ff	hv: distinguish between LAPIC_PASSTHROUGH configured vs enabled ACRN supports LAPIC emulation for guests using x86 APICv. When guest OS/BIOS switches from xAPIC to x2APIC mode of operation, ACRN also supports switching from LAPIC emulation to LAPIC passthrough to guest. User/developer needs to configure GUEST_FLAG_LAPIC_PASSTHROUGH for guest_flags in the corresponding VM's config for ACRN to enable LAPIC passthrough. This patch does the following 1)Fixes a bug in the abovementioned feature. For a guest that is configured with GUEST_FLAG_LAPIC_PASSTHROUGH, during the time period guest is using xAPIC mode of LAPIC, virtual interrupts are not delivered. This can be manifested as guest hang when it does not receive virtual timer interrupts. 2)ACRN exposes physical topology via CPUID leaf 0xb to LAPIC PT VMs. This patch removes that condition and exposes virtual topology via CPUID leaf 0xb. Tracked-On: #3136 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-05-23 11:15:31 +08:00
Shiqing Gao	474496fc0e	doc: remove hard-coded interfaces in .rst files This patch removes hard-coded interfaces in .rst files and refers to the definition via doxygen style comments. This patch mainly focuses on the Hypervisor part. Other parts will be covered in separate patches. Tracked-On: #1595 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-05-22 12:40:52 -07:00
yliu79	fe4fcf491f	xHV: remove unused function is_dbg_uart_enabled Change-Id: I64b3e08818f1cb15ec7c41557900d6e462c4e107 Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
yliu79	c5391d2592	HV: remove unused function vcpu_inject_ac Change-Id: I4b139e78c372d0941923fc8260db7a3578a894f2 Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
yliu79	26de86d761	HV: remove unused function copy_to_gva Change-Id: I18a6c860ba4bcec4e5915fa6a2a18ed1ecb20fff Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
yliu79	163c63d21f	HV: remove unused function resume_vm Change-Id: Ia6b6617e55044b5555a2d80c26a4ef7d7e56b7fa Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
yliu79	c68c6e4af2	HV: remove unused function shutdown_vcpu Change-Id: Ia6f9aa4d2d603d23bc0cb9c3b12032d1a96504db Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
yliu79	83012a5a0a	HV: remove unused function disable_iommu Change-Id: Ia2347008082991d56cdbfab9f9940cfccc473702 Tracked-On: #3123 Signed-off-by: yliu79 <ying2.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:36:03 +08:00
Minggui Cao	fc1cbebe31	HV: remove vcpu arch lock, not needed. the pcpu just write its own vmcs, not need spinlock. and the arch.lock not used other places, remove it too. Tracked-On: #3130 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 16:35:26 +08:00
Victor Sun	90f3ce442d	HV: remove unused UNDEFINED_VM The enum of UNDEFINED_VM has never been used, remove it; Tracked-On: #2291 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-05-22 10:01:20 +08:00
dongshen	73cff9ef08	HV: predefine pci vbar's base address for pre-launched VMs in vm_config For pre-launched VMs, currently we set all vbars to 0s initially in bar emulation code, guest OS will reprogram the bars when it sees the bars are uninited (0s). We consider this is not the right solution, change to populate the vbars (to non zero valid pci hole address) based on the vbar base addresses predefined in vm_config. Store a pointer to acrn_vm_pci_ptdev_config in struct pci_vdev Tracked-On: #3022 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-22 10:00:15 +08:00
Jason Chen CJ	238d8bbaa2	reshuffle init_vm_boot_info now only SOS need decide boot with de-privilege or direct boot mode, while for other pre-launched VMs, they should use direct boot mode. this patch merge boot/guest/direct_boot_info.c & boot/guest/deprivilege_boot_info.c into boot/guest/vboot_info.c, and change init_direct_vboot_info() function name to init_general_vm_boot_info(). in init_vm_boot_info(), depend on get_sos_boot_mode(), SOS may choose to init vm boot info by setting the vm_sw_loader to deprivilege specific one; for SOS using DIRECT_BOOT_MODE and all other VMS, they will use general_sw_loader as vm_sw_loader and go through init_general_vm_boot_info() for virtual boot vm info filling. this patch also move spurious handler initilization for de-privilege mode from boot/guest/deprivilege_boot.c to boot/guest/vboot_info.c, and just set it in deprivilege sw_loader before irq enabling. Changes to be committed: modified: Makefile modified: arch/x86/guest/vm.c modified: boot/guest/deprivilege_boot.c deleted: boot/guest/deprivilege_boot_info.c modified: boot/guest/direct_boot.c renamed: boot/guest/direct_boot_info.c -> boot/guest/vboot_info.c modified: boot/guest/vboot_wrapper.c modified: boot/include/guest/deprivilege_boot.h modified: boot/include/guest/direct_boot.h modified: boot/include/guest/vboot.h new file: boot/include/guest/vboot_info.h modified: common/vm_load.c modified: include/arch/x86/guest/vm.h Tracked-On: #1842 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-05-20 18:49:59 +08:00
Mingqiang Chi	517cff1bbe	hv:remove some unnecessary includes remove some unnecessary includes, some can cause reverse dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> modified: acpi_parser/acpi_ext.c modified: acpi_parser/dmar_parse.c modified: boot/acpi_base.c modified: boot/guest/direct_boot_info.c modified: include/arch/x86/per_cpu.h	2019-05-16 10:33:01 +08:00
Zide Chen	865ee2956e	hv: emulate ACPI reset register for Service OS guest Handle the PIO reset register that is defined in host ACPI: Parse host FADT table to get the host reset register info, and emulate it for Service OS: - return all '1' for guest reads because the read behavior is not defined in ACPI. - ignore guest writes with the reset value to stop it from resetting host; if guest writes other values, passthru it to hardware in case the reset register supports other functionalities. Tracked-On: #2700 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-15 11:20:12 +08:00
Zide Chen	26f08680eb	hv: shutdown guest VM upon triple fault exceptions This patch implements triple fault vmexit handler and base on VM types: - post-launched VMs: shutdown_target_vm() injects S5 PIO write to request DM to shut down the target VM. - pre-launched VMs: shut down the guest. - SOS: similarly, but shut down all the non real-time post-launched VMs that depend to SOS before shutting down SOS. Tracked-On: #2700 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-15 11:20:12 +08:00
Zide Chen	9aa3fe646b	hv: emulate reset register 0xcf9 and 0x64 - post-launched RTVM: intercept both PIO ports so that hypervisor has a chance to set VM_POWERING_OFF flag. - all other type of VMs: deny these 2 ports from guest access so that guests are not able to reset host. Tracked-On: #2700 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-15 11:20:12 +08:00
Zide Chen	8ad0fd98a3	hv: implement NEED_SHUTDOWN_VM request to idle thread For pre-launched VMs and SOS, VM shutdown should not be executed in the current VM context. - implement NEED_SHUTDOWN_VM request so that the BSP of the target VM can shut down the guest in idle thread. - implement shutdown_vm_from_idle() to shut down target VM. Tracked-On: #2700 Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-15 11:20:12 +08:00
Victor Sun	db952315c5	HV: fix MISRA violation of host_pm.h The header need struct pm_s_state_data info which declared in acrn_common.h; Tracked-On: #1842 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-05-15 09:31:43 +08:00
Victor Sun	8afbdb7505	HV: enable Kconfig of ACPI_PARSE_ENABLED Previously we use Kconfig of DMAR_PARSE_ENABLED to choose pre-defined DMAR info or parse it at runtime, at the same time we use MACRO of CONFIG_CONSTANT_ACPI to decide whether parse PM related ACPI info at runtime. This looks redundant so use a unified ACPI_PARSE_ENABLED Kconfig to replace them. Tracked-On: #3107 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-14 11:53:02 +08:00
Mingqiang Chi	795d6de0fb	hv:move several files related X86 for lib modified: Makefile renamed: lib/memory.c -> arch/x86/lib/memory.c renamed: include/lib/atomic.h -> include/arch/x86/lib/atomic.h renamed: include/lib/bits.h -> include/arch/x86/lib/bits.h renamed: include/lib/spinlock.h -> include/arch/x86/lib/spinlock.h Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-05-13 10:12:20 +08:00
Mingqiang Chi	350d6a9eb6	hv:Move BUS_LOCK to atomic.h now this MACRO is used in atomic.h and bits.h, move it from cpu.h to atomic.h to avoid reverse dependency(i.e. from lower layer to upper one) Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-05-13 10:12:20 +08:00
Shiqing Gao	773889bb65	hv: dmar_parse: remove dynamic memory allocation This patch removes the dynamic memory allocation in dmar_parse.c. v1 -> v2: - rename 'const_dmar.c' to 'dmar_info.c' and move it to 'boot' directory - add CONFIG_DMAR_PARSE_ENABLED check for function declaration Tracked-On: #861 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-05-10 11:33:37 +08:00
Binbin Wu	a581f50600	hv: vmsr: enable msr ia32_misc_enable emulation Add MSR_IA32_MISC_ENABLE to emulated_guest_msrs to enable the emulation. Init MSR_IA32_MISC_ENABLE for guest. Tracked-On: #2834 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-09 16:35:15 +08:00
Binbin Wu	8e310e6ea1	hv: vcpuid: modify vcpuid according to msr ia32_misc_enable According to SDM Vol4 2.1, modify vcpuid according to msr ia32_misc_enable: - Clear CPUID.01H: ECX[3] if guest disabled monitor/mwait. - Clear CPUID.80000001H: EDX[20] if guest set XD Bit Disable. - Limit the CPUID leave maximum value to 2 if guest set Limit CPUID MAXVal. Tracked-On: #2834 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-09 16:35:15 +08:00
Binbin Wu	f0d06165d3	hv: vmsr: handle guest msr ia32_misc_enable read/write Guest MSR_IA32_MISC_ENABLE read simply returns the value set by guest. Guest MSR_IA32_MISC_ENABLE write: - Clear EFER.NXE if MSR_IA32_MISC_ENABLE_XD_DISABLE set. - MSR_IA32_MISC_ENABLE_MONITOR_ENA: Allow guest to control this feature when HV doesn't use this feature and hw has no bug. vcpuid update according to the change of the msr will be covered in following patch. Tracked-On: #2834 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-09 16:35:15 +08:00
Binbin Wu	a0a6eb43c4	hv: msr: use UL since ia32_misc_enable is 64bit Merge two parts of different definitions for MSR_IA32_MISC_ENABLE fields. - use the prefix "MSR_IA32_" to align with others - Change MSR_IA32_MISC_ENABLE_XD to MSR_IA32_MISC_ENABLE_XD_DISABLE to align the meaning of the field since it is "XD bit disable" Use UL instead of U as the field bit mask because MSR_IA32_MISC_ENABLE is 64-bit. Tracked-On: #2834 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-09 16:35:15 +08:00
Jason Chen CJ	20f97f7559	restruct boot and bsp dir for firmware stuff currently, ACRN hypervisor can either boot from sbl/abl or uefi, that's why we have different firmware method under bsp & boot dirs. but the fact is that we actually have two different operations based on different guest boot mode: 1. de-privilege-boot: ACRN hypervisor will boot VM0 in the same context as native(before entering hypervisor) - it means hypervisor will co-work with ACRN UEFI bootloader, restore the context env and de-privilege this env to VM0 guest. 2. direct-boot: ACRN hypervisor will directly boot different pre-launched VM(including SOS), it will setup guest env by pre-defined configuration, and prepare guest kernel image, ramdisk which fetch from multiboot modules. this patch is trying to: - rename files related with firmware, change them to guest vboot related - restruct all guest boot stuff in boot & bsp dirs into a new boot/guest dir - use de-privilege & direct boot to distinguish two different boot operations this patch is pure file movement, the rename of functions based on old assumption will be in the following patch. 
Changes to be committed: modified: ../efi-stub/Makefile modified: ../efi-stub/boot.c modified: Makefile modified: arch/x86/cpu.c modified: arch/x86/guest/vm.c modified: arch/x86/init.c modified: arch/x86/irq.c modified: arch/x86/trampoline.c modified: boot/acpi.c renamed: bsp/cmdline.c -> boot/cmdline.c renamed: bsp/firmware_uefi.c -> boot/guest/deprivilege_boot.c renamed: boot/uefi/uefi_boot.c -> boot/guest/deprivilege_boot_info.c renamed: bsp/firmware_sbl.c -> boot/guest/direct_boot.c renamed: boot/sbl/multiboot.c -> boot/guest/direct_boot_info.c renamed: bsp/firmware_wrapper.c -> boot/guest/vboot_wrapper.c modified: boot/include/acpi.h renamed: bsp/include/firmware_uefi.h -> boot/include/guest/deprivilege_boot.h renamed: bsp/include/firmware_sbl.h -> boot/include/guest/direct_boot.h renamed: bsp/include/firmware.h -> boot/include/guest/vboot.h modified: include/arch/x86/multiboot.h Tracked-On: #1842 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-09 16:33:44 +08:00
Yin Fengwei	8626e5aa3a	vm_state: Update vm state VM_STATE_INVALID to VM_POWERED_OFF Replace the vm state VM_STATE_INVALID to VM_POWERED_OFF. Also replace is_valid_vm() with is_poweroff_vm(). Add API is_created_vm() to identify VM created state. Tracked-On: #3082 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-05-08 16:58:41 +08:00
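
The decision logic spelled out in the MDS mitigation commit (3164f3976a) can be sketched in C roughly as follows. This is a minimal illustration, not the hypervisor's actual code: the struct `mds_platform_state` and the helpers `is_mds_affected` and `need_verw_on_vmentry` are hypothetical names, and the sketch assumes that a platform without the IA32_ARCH_CAPABILITIES MSR reports its value as 0. Only the bit position of ARCH_CAP_MDS_NO (bit 5) comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of IA32_ARCH_CAPABILITIES (MSR 10AH): set when the CPU is not affected by MDS. */
#define ARCH_CAP_MDS_NO (UINT64_C(1) << 5)

/* Hypothetical snapshot of the platform state the decision depends on. */
struct mds_platform_state {
	uint64_t arch_capabilities;   /* IA32_ARCH_CAPABILITIES value; assumed 0 if the MSR is absent */
	bool affected_by_l1tf;        /* processor is affected by L1TF */
	bool l1d_flush_on_vmentry;    /* an L1D flush is launched on VM entry */
};

/* The processor is treated as affected unless ARCH_CAP_MDS_NO is set. */
static bool is_mds_affected(const struct mds_platform_state *s)
{
	return (s->arch_capabilities & ARCH_CAP_MDS_NO) == 0U;
}

/*
 * Mirrors the pseudocode in the commit message: VERW is needed on VM entry
 * only when the CPU is affected by MDS and an L1D flush is not already
 * clearing the buffers (the L1D flush clears them as a side effect).
 */
static bool need_verw_on_vmentry(const struct mds_platform_state *s)
{
	bool need = false;

	if (is_mds_affected(s)) {
		if (!s->affected_by_l1tf || !s->l1d_flush_on_vmentry) {
			need = true;
		}
	}
	return need;
}
```

With this sketch, a CPU that is affected by both MDS and L1TF and already flushes L1D on VM entry needs no extra VERW, while the same CPU without the L1D flush does.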

1263 Commits
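
The masking described in the gpa2hpa commit (4a22801dd1) can be illustrated with a short C sketch. It is an assumption-laden example rather than the real `gpa2hpa`: the macro names `EPT_PFN_HIGH_MASK` and `PAGE_SIZE_4K` and the helper `ept_entry_to_hpa_4k` are hypothetical, and only a 4 KiB leaf entry is handled. The point it shows is that bits 52 to 63 of the leaf entry, along with the low attribute bits, are cleared before the page offset is merged in.

```c
#include <stdint.h>

/* Bits 52..63 of an EPT leaf entry; they do not belong to the page frame number. */
#define EPT_PFN_HIGH_MASK UINT64_C(0xFFF0000000000000)
/* 4 KiB page size, the only leaf size this sketch handles. */
#define PAGE_SIZE_4K      UINT64_C(0x1000)

/*
 * Translate a guest-physical address using a 4 KiB EPT leaf entry:
 * drop bits 52..63 and the low 12 attribute bits of the entry to get the
 * host frame base, then add the page offset taken from the GPA.
 */
static uint64_t ept_entry_to_hpa_4k(uint64_t entry, uint64_t gpa)
{
	uint64_t frame = entry & ~EPT_PFN_HIGH_MASK & ~(PAGE_SIZE_4K - 1U);

	return frame | (gpa & (PAGE_SIZE_4K - 1U));
}
```

Without the high mask, an entry with bit 63 set would carry that bit into the returned host-physical address, which is the wrong value the commit message warns about.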