acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-07-08 04:49:45 +00:00

Author	SHA1	Message	Date
Li Fei1	4367657771	hv: vpci: add a global CFG header configuration access handler Add cfg_header_read_cfg and cfg_header_write_cfg to handle the 1st 64B CFG Space header PCI configuration space. Only Command and Status Registers are pass through; Only Command and Status Registers and Base Address Registers are writable. In order to implement this, we add two types of bit mask for each 4B register: pass through mask and read-only mask. When pass through bit mask is set, this means this bit of this 4B register is pass through, otherwise, it is virtualized; When read-only mask is set, this means this bit of this 4B register is read-only, otherwise, it's writable. We should write it to physical CFG space or virtual CFG space based on whether the pass through bit mask is set or not. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-06 14:08:04 +08:00
Sainath Grandhi	460e7ee5b1	hv: Variable/macro renaming for intr handling of PT devices using IO-APIC/PIC 1. Renames DEFINE_IOAPIC_SID to DEFINE_INTX_SID as the virtual source can be IOAPIC or PIC 2. Renames the src member of source_id.intx_id to ctlr to indicate interrupt controller 3. Changes the type of src member of source_id.intx_id from uint32_t to enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC Tracked-On: #4447 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-03-06 11:29:02 +08:00
Minggui Cao	ad4d14e37f	HV: enable ARI if PCI bridge support it For SRIOV needs ARI support, so enable it in HV if the PCI bridge support it. TODO: need check all the PCI devices under this bridge can support ARI, if not, it is better not enable it as PCIe spec. That check will be done when scanning PCI devices. Tracked-On: #3381 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-06 08:47:46 +08:00
Zide Chen	49ffe168af	hv: fixup relocation delta for symbols belong to entry section This is to enable relocation for code32. - RIP relative addressing is available in x86-64 only so we manually add relocation delta to the target symbols to fixup code32. - both code32 and code64 need to load GDT hence both need to fixup GDT pointer. This patch declares separate GDT pointer cpu_primary64_gdt_ptr for code64 to avoid double fixup. - manually fixup cpu_primary64_gdt_ptr in code64, but not rely on relocate() to do that. Otherwise it's very confusing that symbols from same file could be fixed up externally by relocate() or self-relocated. - to make it clear, define a new symbol ld_entry_end representing the end of the boot code that needs manually fixup, and use this symbol in relocate() to filter out all symbols belong to the entry sections. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Binbin Wu	667639b591	doc: fix a missing argument in the function description One argument is missing for the function ptirq_alloc_entry. This patch fixes the doc generation error. Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-03-05 13:08:57 +08:00
Binbin Wu	76f2e28e13	doc: update hv device passthrough document Fixed misspellings and rst formatting issues. Added ptdev.h to the list of include file for doxygen Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Signed-off-by: David B. Kinder <david.b.kinder@intel.com>	2020-03-04 18:05:15 -05:00
Binbin Wu	b05c1afa0b	doc: add doxygen style comments to ptdev Add doxygen style comments to ptdev public APIs. Add these API descriptions to group acrn_passthrough. Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-03-04 18:05:15 -05:00
Vijay Dhanraj	92ee33b035	HV: Add MBA support in ACRN This patch adds RDT MBA support to detect, configure and set up MBA throttle registers based on VM configuration. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Yuan Liu	176cb31c31	hv: refine vpci_init_vdev function Add a new parameter pf_vdev for function vpci_init_vdev to support SRIOV VF vdev initialization. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	320ed6c238	hv: refine init_one_dev_config The init_one_dev_config is used to initialize an acrn_vm_pci_dev_config. SRIOV needs an explicit acrn_vm_pci_dev_config to create a VF vdev, so refine it to return acrn_vm_pci_dev_config. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	87e7d79112	hv: refine init_pdev function Due to SRIOV VF physical device needs to be initialized when VF_ENABLE is set and a SRIOV VF physical device initialization is same with standard PCIe physical device, so expose the init_pdev for SRIOV VF physical device initialization. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	abbdef4f5d	hv: implement SRIOV VF_BAR initialization All SRIOV VF physical devices don't have bars in configuration space, they are from the VF associated PF's VF_BAR registers of SRIOV capability. Adding a vbars data structure in pci_cap_sriov data structure to store SRIOV VF_BAR information, so that each VF's bars can be initialized directly through the vbars instead of accessing the PF VF_BAR registers multiple times. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Conghui Chen	595cefe3f2	hv: xsave: move assembler to individual function Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors assembler to individual functions. Tracked-On: #4436 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 17:55:06 +08:00
Yuan Liu	2f7483065b	hv: introduce SRIOV interception VF_ENABLE is one field of SRIOV capability that is used to create or remove VF physical devices. If VF_ENABLE is set, hv can detect if the VF physical devices are ready after waiting 100 ms. v2: Add sanity check for writing NumVFs register, add precondition and application constraints when VF_ENABLE is set and refine code style. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	14931d11e0	hv: add SRIOV capability read/write entries Introduce SRIOV capability field for pci_vdev and add SRIOV capability interception entries. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	5e989f13c6	hv: check if there is enough room for all SRIOV VFs. Make the SRIOV-Capable device invisible from SOS if there is no room for all its virtual functions. v2: fix an issue that if a PF has been dropped, the subsequent PF will be dropped too even if there is room for its VFs. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	ac1477956c	hv: implement SRIOV-Capable device detection. if the device has PCIe capability, walks all PCIe extended capabilities for SRIOV discovery. v2: avoid type casting and refine naming. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Minggui Cao	bd92304dcf	HV: add vpci bridge operations support add vpci bridge operations in hypervisor, to avoid SOS mis-operations to affect other VM's PCI devices. assumption: before hypervisor bootup, the physical pci-bridge shall be configured correctly by BIOS or other bootloader; for ACS (Access Control Service) capability, it is configured by BIOS to support the devices under it to be isolated and allocated to different VMs. to simplify the emulations of vpci bridge, set limitations as following: 1. expose all configure space registers, but readonly 2. BIST not support; by default is 0 3. not support interrupt, including INTx and MSI. TODO: 1. configure tool can select whether a PCI bridge is emulated or pass through. Open: 1. SOS how to reset PCI device under the PCI bridge? Tracked-On: #3381 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-02-28 09:24:51 +08:00
Conghui Chen	c246d1c9b8	hv: xsave: bugfix for init value The init value for XCR0 and XSS should be the same with spec: In SDM Vol1 13.3: XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is always 1. The other bits in XCR0 are all 0 coming out of RESET. The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET. The previous code try to fix the xsave area leak to other VMs during init phase, but bring the error to linux. Besides, it cannot avoid the possible leak in running phase. Need find a better solution. Tracked-On: #4430 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 09:19:29 +08:00
Vijay Dhanraj	eaad91fd71	HV: Remove RDT code if CONFIG_RDT_ENABLED flag is not set This patch does the following, 1. Removes RDT code if CONFIG_RDT_ENABLED flag is not set. 2. Set the CONFIG_RDT_ENABLED flag only on platforms that support RDT so that build scripts will automatically reflect the config. Tracked-On: #3715 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	d0665fe220	HV: Generalize RDT infrastructure and fix RDT cache configuration. This patch creates a generic infrastructure for RDT resources instead of just L2 or L3 cache. This patch also fixes L3 CAT config overwrite by L2 in cases where both L2 and L3 CAT are supported. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	887e3813bc	HV: Add both HW and SW checks for RDT support There can be times when user unknowingly enables CONFIG_CAT_ENABLED SW flag, but the hardware might not support L3 or L2 CAT. In such case software can end up writing to the CAT MSRs which can cause undefined results. The patch fixes the issue by enabling CAT only when both HW as well as software via the CONFIG_CAT_ENABLED supports CAT. The patch also addresses a typo in the "clos2prq_msr" function name. It should be "clos2pqr_msr" instead. PQR stands for platform qos register. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	b8a021d658	HV: split L2 and L3 cache resource MSR Upcoming intel platforms can support both L2 and L3 but our current code only supports either L2 or L3 CAT. So split the MSRs so that we can support allocation for both L2 and L3. This patch does the following, 1. splits programming of L2 and L3 cache resource based on the resource ID. 2. Replace generic platform_clos_array struct with resource specific struct in all the existing board.c files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	2597429903	HV: Rename cat.c/.h files to rdt.c/.h As part of rdt cat refactoring, goal is to combine all rdt specific features such as CAT under one module. So renaming rdt resource specific files such as cat.c/.h to generic rdt.c/.h files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Victor Sun	da3d181f62	HV: init efi info with multiboot2 Initialize efi info of acrn mbi when boot from multiboot2 protocol, with this patch hypervisor could get host efi info and pass it to Linux zeropage, then make guest Linux possible to boot with efi environment; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	520a0222d3	HV: re-arch boot component header The patch re-arch boot component header files by: - moving multiboot.h from include/arch/x86/ to boot/include/ and keep this header for multiboot1 protocol data struct only; - moving multiboot related MACROs in cpu_primary.S to multiboot.h; - creating an independent boot.h to store acrn specific boot information for other files' reference; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Conghui Chen	a7563cb9bd	hv: sched_bvt: add BVT scheduler BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU. It has the concept of virtual time, vCPU with earliest virtual time is dispatched first. Main concepts: tick timer: a periodic tick is used to measure the physical time in units of MCU (minimum charging unit). runqueue: thread in the runqueue is ordered by virtual time. weight: each thread receives a share of the pCPU in proportion to its weight. context switch allowance: the physical time by which the current thread is allowed to advance beyond the next runnable thread. warp: a thread with warp enabled will have a chance to subtract a value (Wi) from virtual time to achieve higher priority. virtual time: AVT: actual virtual time, advance in proportional to weight. EVT: effective virtual time. EVT <- AVT - ( warp ? Wi : 0 ) SVT: scheduler virtual time, the minimum AVT in the runqueue. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Yonghua Huang	64b874ce4c	hv: rename BOOT_CPU_ID to BSP_CPU_ID 1. Rename BOOT_CPU_ID to BSP_CPU_ID 2. Replace hardcoded value with BSP_CPU_ID when ID of BSP is referenced. Tracked-On: #4420 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-02-25 09:08:14 +08:00
Li Fei1	4adad73cfc	hv: mmio: refine mmio access handle lock granularity Now only PCI MSI-X BAR access need dynamic register/unregister. Others don't need unregister once it's registered. So we don't need to lock the vm level emul_mmio_lock when we handle the MMIO access. Instead, we could use finer granularity lock in the handler to protect the shared resource. This patch fixed the deadlock issue when OVMF tries to size the BAR: Because OVMF uses ECAM to access the PCI configuration space, it will first hold vm emul_mmio_lock, then call vpci_handle_mmconfig_access. While this tries to size a BAR which is also a MSI-X Table BAR, it will call register_mmio_emulation_handler to register the MSI-X Table BAR MMIO access handler. This will cause the emul_mmio_lock deadlock. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	e8479f84cd	hv: vPCI: remove passthrough PCI device unused code Now we split passthrough PCI device from DM to HV, we could remove all the passthrough PCI device unused code. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	fe3182ea05	hv: vPCI: add assign/deassign PCI device HC APIs Add assign/deassign PCI device hypercall APIs to assign a PCI device from SOS to post-launched VM or deassign a PCI device from post-launched VM to SOS. This patch is prepared for splitting passthrough PCI device from DM to HV. The old assign/deassign ptdev APIs will be discarded. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Junming Liu	1303861d26	hv:enable gpu iommu except APL platforms To enable gvt-d,need to allow the GPU IOMMU. While gvt-d hasn't been enabled on APL yet, so let APL disable GPU IOMMU. v2 -> v3: * let APL platforms disable GPU IOMMU. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com>	2020-02-24 11:47:10 +08:00
Zide Chen	cc6f094926	hv: CAT is supposed to be enabled in the system level In platforms that support CAT, when it is enabled by ACRN, i.e. IA32_resourceType_MASK_n registers are programmed with customized values, it has impacts to the whole system. The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be enabled in some guests, but not in others who don't have this flag, which is conceptually incorrect. This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[] defines a set of system-wide Class of Service (COS, or CLOS), and the per guest vm_configs[].clos associates the guest with particular CLOS. Tracked-On: #2462 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-17 08:51:59 +08:00
Yonghua Huang	fd4775d044	hv: rename VECTOR_XXX and XXX_IRQ Macros 1. Align the coding style for these MACROs 2. Align the values of fixed VECTORs Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Yonghua Huang	b90862921e	hv: rename the ACRN_DBG_XXX Refine this MACRO 'ACRN_DBG_XXX' to 'DBG_LEVEL_XXX' Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Shuo A Liu	db708fc3e8	hv: rename is_completion_polling to is_polling_ioreq is_polling_ioreq is more straightforward. Rename it. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Yonghua Huang	ddebefb9b4	hv: remove deprecated code for hc_assign/deassign_ptdev 'param' is BDF value instead of GPA when VHM driver issues below 2 hypercalls: - HC_ASSIGN_PTDEV - HC_DEASSIGN_PTDEV This patch is to remove related code in hc_assign/deassign() functions. Tracked-On: #4334 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-01-08 11:54:49 +08:00
Li Fei1	65ed6c3529	hv: vpci: trap PCIe ECAM access for SOS SOS will use PCIe ECAM access PCIe external configuration space. HV should trap this access for security(Now pre-launched VM doesn't want to support PCI ECAM; post-launched VM trap PCIe ECAM access in DM). Besides, update PCIe MMCONFIG region to be owned by hypervisor and expose and pass through platform hide PCI devices by BIOS to SOS. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	1e50ec8899	hv: pci: use ECAM to access PCIe Configuration Space Use Enhanced Configuration Access Mechanism (MMIO) instead of PCI-compatible Configuration Mechanism (IO port) to access PCIe Configuration Space PCI-compatible Configuration Mechanism (IO port) access is used for UART in debug version. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	65f3751ea3	hv: pci: add hide pci devices configuration for apl-up2 Other Platforms are not added for now. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Shuo A Liu	a8f6bdd479	hv: Add vlapic_has_pending_intr of apicv to check pending interrupts Sometimes HV wants to know if there are pending interrupts of one vcpu. Add .has_pending_intr interface in acrn_apicv_ops and return the pending interrupts status by check IRRs of apicv. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	e3c303363b	hv: vcpu: wait and signal vcpu event support Introduce two kinds of events for each vcpu, VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events vcpu can wait for such events, and resume to run when the event get signalled. This patch also change IO request waiting/notifying to this way. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	1f23fe3fd8	hv: sched: simple event implementation This simple event implementation can only support exclusive waiting at same time. It is mainly used by a thread that wants to wait for a special event to happen. Thread A who want to wait for some events calls wait_event(struct sched_event ); Thread B who can give the event signal calls signal_event(struct sched_event ); Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Li Fei1	21b405d109	hv: vpci: an assign PT device should support FLR or PM reset Before we assign a PT device to post-launched VM, we should reset the PCI device first. However, ACRN hypervisor doesn't plan to support PCIe hot-plug and doesn't support PCIe bridge Secondary Bus Reset. So the PT device must support FLR or PM reset. This patch do this check when assigning a PT device to post-launched VM. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	e74a9f397d	hv: pci: add PCIe PM reset check Add PCIe PM reset capability check. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	26670d7ab3	hv: vpci: revert do FLR and BAR restore Since we restore BAR values when writing Command Register if necessary. We don't need to trap FLR and do the BAR restore then. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	6c549d48a8	hv: vpci: restore physical BARs when writing Command Register if necessary When PCIe does Conventional Reset or FLR, almost all PCIe configurations and states will be lost. So we should save the configurations and states before doing the reset and restore them after the reset. This was done well by BIOS or Guest now. However, ACRN will trap these access and handle them properly for security. Almost all of these configurations and states will be written to physical configuration space at last except for BAR values for now. So we should do the restore for BAR values. One way is to do restore after one type reset is detected. This will be too complex. Another way is to do the restore when BIOS or guest tries to write the Command Register. This could work because: 1. The I/O Space Enable bit and Memory Space Enable bits in Command Register will reset to zero. 2. Before BIOS or guest wants to enable these bits, the BAR couldn't be accessed. 3. So we could restore the BAR values before enabling these bits if reset is detected. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Victor Sun	c6f7803f06	HV: restore lapic state and apic id upon INIT Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving INIT. The local APIC ID register should also be preserved. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	ab13228591	HV: ensure valid vcpu state transition The vcpu state machine transition should follow below rule: old vcpu state new vcpu state ============== ============== VCPU_OFFLINE --- create_vcpu --> VCPU_INIT VCPU_INIT --- launch_vcpu --> VCPU_RUNNING VCPU_RUNNING --- pause_vcpu --> VCPU_PAUSED VCPU_PAUSED --- resume_vcpu --> VCPU_RUNNING VCPU_RUNNING/PAUSED --- pause_vcpu --> VCPU_ZOMBIE VCPU_INIT --- pause_vcpu --> VCPU_ZOMBIE VCPU_ZOMBIE --- reset_vcpu --> VCPU_INIT VCPU_ZOMBIE --- offline_vcpu--> VCPU_OFFLINE Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	a5158e2c16	HV: refine reset_vcpu api The patch abstract a vcpu_reset_internal() api for internal usage, the function would not touch any vcpu state transition and just do vcpu reset processing. It will be called by create_vcpu() and reset_vcpu(). The reset_vcpu() will act as a public api and should be called only when vcpu receive INIT or vm reset/resume from S3. It should not be called when do shutdown_vm() or hcall_sos_offline_cpu(), so the patch remove reset_vcpu() in shutdown_vm() and hcall_sos_offline_cpu(). The patch also introduced reset_mode enum so that vcpu and vlapic could do different context operation according to different reset mode; Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	9ecac8629a	HV: clean up redundant macro in lapic.h Some MACROs in lapic.h are duplicated with apicreg.h, and some MACROs are never referenced, remove them. Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Li Fei1	58b3a05863	hv: vpci: rename pci_bar to pci_vbar Structure pci_vbar is used to define the virtual BAR rather than physical BAR. It's better to name as pci_vbar. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-26 08:54:23 +08:00
Yin Fengwei	e5117bf19a	vm: add severity for vm_config Add severity definitions for different scenarios. The static guest severity is defined according to guest configurations. Also add sanity check to make sure the severity for all guests are correct. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	f7df43e7cd	reset: detect highest severity guest dynamically For guest reset, if the highest severity guest reset will reset system. There is vm flag to call out the highest severity guest in specific scenario which is a static guest severity assignment. There is case that the static highest severity guest is shutdown and the highest severity guest should be transfer to other guest. For example, in ISD scenario, if RTVM (static highest severity guest) is shutdown, SOS should be highest severity guest instead. The is_highest_severity_vm() is updated to detect highest severity guest dynamically. And promote the highest severity guest reset to system reset. Also remove the GUEST_FLAG_HIGHEST_SEVERITY definition. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	bfa19e9104	pm: S5: update the system shutdown logic in ACRN For system S5, ACRN had the assumption that SOS shutdown will trigger system shutdown. So the system shutdown logic is: 1. Trap SOS shutdown 2. Wait for all other guests to shut down 3. Shutdown system The new logic is refined as: If all guests are shut down, shut down the whole system Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Li Fei1	1fddf943d8	hv: vpci: restore PCI BARs when doing AF FLR ACRN hypervisor should trap guest doing PCI AF FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Conventional PCI Advanced Features Control Register write operation if the device supports Conventional PCI Advanced Features Capability and check whether it wants to do device AF FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Li Fei1	a90e0f6c84	hv: vpci: restore PCI BARs when doing PCIe FLR ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Device Capabilities Register write operation if the device supports PCI Express Capability and check whether it wants to do device FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Kaige Fu	5f9d1379bc	HV: Remove INIT signal notification related code We don't use INIT signal notification method now. This patch removes them. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	6d1f63aef0	HV: Use NMI to replace INIT signal for lapic-pt VMs S5 We have implemented a new notification method using NMI. So replace the INIT notification method with the NMI one. Then we can remove INIT notification related code later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	a13909cedc	HV: Use NMI-window exiting to address req missing issue There is a window where we may miss the current request in the notification period when the work flow is as the following: CPUx + + CPUr \| \| \| +--+ \| \| \| Handle pending req \| <--+ +--+ \| \| \| Set req flag \| <--+ \| +------------------>---+ \| Send NMI \| \| Handle NMI \| <--+ \| \| \| \| \| +--> vCPU enter \| \| + + So, this patch enables the NMI-window exiting to trigger the next vmexit once there is no "virtual-NMI blocking" after vCPU enter into VMX non-root mode. Then we can process the pending request on time. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	40ba7e8686	HV: Don't make NMI injection req when notifying vCPU The NMI for notification should not be injected to guest. So, this patch drops NMI injection request when we use NMI to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well and there is no well-designed way to check if the NMI is for notification or for guest now. So, we take all the NMIs as notification NMI for hard rtvm temporarily. It means that the hard rtvm will never receive NMI with this patch applied. TODO: vNMI support is not ready yet. we will add it later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	72f7f69c47	HV: Use NMI to kick lapic-pt vCPU's thread ACRN hypervisor needs to kick vCPU off VMX non-root mode to do some operations in hypervisor, such as interrupt/exception injection, EPT flush etc. For non lapic-pt vCPUs, we can use IPI to do so. But, it doesn't work for lapic-pt vCPUs as the IPI will be injected to VMs directly without vmexit. Without the way to kick the vCPU off VMX non-root mode to handle pending request on time, there may be fatal errors triggered. 1). Certain operation may not be carried out on time which may further lead to fatal errors. Taking the EPT flush request as an example, once we don't flush the EPT on time and the guest access the out-of-date EPT, fatal error happens. 2). ACRN now will send an IPI with vector 0xF0 to target vCPU to kick the vCPU off VMX non-root mode if it wants to do some operations on target vCPU. However, this way doesn't work for lapic-pt vCPUs. The IPI will be delivered to the guest directly without vmexit and the guest will receive a unexpected interrupt. Consequently, if the guest can't handle this interrupt properly, fatal error may happen. The NMI can be used as the notification signal to kick the vCPU off VMX non-root mode for lapic-pt vCPUs. So, this patch uses NMI as notification signal to address the above issues for lapic-pt vCPUs. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Shiqing Gao	3cee259583	hv: msr: remove redundant check in write_pat_msr Reserved bits in a 8-bit PAT field has been checked in pat_mem_type_invalid. Remove this redundant check "(PAT_FIELD_RSV_BITS & field) != 0UL" in write_pat_msr. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-16 14:32:42 +08:00
Kaige Fu	2777f23075	HV: Add helper function send_single_nmi This patch adds a helper function send_single_nmi. The first caller will soon come with the following patch. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	525d4d3cd0	HV: Install a NMI handler in acrn IDT This patch installs a NMI handler in acrn IDT to handle NMIs out of dispatch_exception. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Victor Sun	a44c1c900c	HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig In the current architecture, the maximum number of vCPUs per VM cannot exceed the number of pCPUs. Since the MAX_PCPU_NUM macro is provided in board configurations, remove MAX_VCPUS_PER_VM from Kconfig and add a MAX_VCPUS_PER_VM macro that references MAX_PCPU_NUM directly. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Victor Sun	ea3476d22d	HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM rename the macro since MAX_PCPU_NUM could be parsed from board file and it is not a configurable item anymore. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Mingqiang Chi	b6bffd01ff	hv:remove 2 unused variables in vm_arch structure remove 'guest_init_pml4' and 'tmp_pg_array' in vm_arch since they are not used. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-12-12 10:13:11 +08:00
Vijay Dhanraj	c8a4ca6c78	HV: Extend non-contiguous HPA for hybrid scenario This patch extends non-contiguous HPA allocations for pre-launched VMs in hybrid scenario. Tracked-On: #4217 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 10:12:46 +08:00
Shuo A Liu	6a144e6e3e	hv: sched: add yield support Add yield support for schedule, which can give up pcpu proactively. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	ed4008630d	hv: sched_iorr: Add IO sensitive Round-robin scheduler IO sensitive Round-robin scheduler aim to schedule threads with round-robin policy. Meanwhile, we also enhance it with some fairness configuration, such as thread will be scheduled out without properly timeslice. IO request on thread will be handled in high priority. This patch only add a skeleton for the sched_iorr scheduler. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Vijay Dhanraj	6e8b413689	HV: Add support to assign non-contiguous HPA regions for pre-launched VM On some platforms, HPA regions for Virtual Machine can not be contiguous because of E820 reserved type or PCI hole. In such cases, pre-launched VMs need to be assigned non-contiguous memory regions and this patch addresses it. To keep things simple, current design has the following assumptions, 1. HPA2 always will be placed after HPA1 2. HPA1 and HPA2 don’t share a single ve820 entry. (Create multiple entries if needed but not shared) 3. Only support 2 non-contiguous HPA regions (can extend at a later point for multiple non-contiguous HPA) Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4195 Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:28:38 +08:00
Li Fei1	5f5ba1d647	hv: vmsi: refine write_vmsi_cfg implementation 1. disable physical MSI before writing the virtual MSI CFG space 2. do the remap_vmsi if the guest wants to enable MSI or update MSI address or data 3. disable INTx and enable MSI after step 2. The previous Message Control check depends on the guest write MSI Message Control Register at the offset of Message Control Register. However, the guest could access this register at the offset of MSI Capability ID register. This patch remove this constraint. Also, The previous implementation didn't really disable MSI when guest wanted to disable MSI. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-05 16:43:22 +08:00
Conghui Chen	d48da2af3a	hv: bugfix for debug commands with smp_call With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the smp_call handler should switch the vmcs to the target vcpu's vmcs. Then get the info. dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise, there will be #GP error. Renaming: vcpu_dumpreg -> dump_vcpu_reg switch_vmcs -> load_vmcs Tracked-On: #4178 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Binbin Wu	3d412266bc	hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC Deterministic is important for RTVM. The mitigation for MCE on Page Size Change converts a large page to 4KB pages runtimely during the vmexit triggered by the instruction fetch in the large page. These vmexits increase nondeterminacy, which should be avoided for RTVM. This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	192859ee02	hv: ept: apply MCE on page size change mitigation conditionally Only apply the software workaround on the models that might be affected by MCE on page size change. For models that are known to be immune to the issue, the mitigation is turned off. Atom processors are not affected by the issue. Also check the CPUID & MSR to determine whether the model is immune to the issue: CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1. In other cases not listed above, the CPU may be vulnerable. This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U since the MSR is 64bit. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
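The immunity check in the commit above can be sketched as a pure function. This is a minimal illustration, not ACRN's code: the function name, the `known_immune_model` flag (standing in for the Atom model check), and the macro names are assumptions. The CPUID bit comes from the commit text; bit 6 for IF_PSCHANGE_MC_NO is assumed from Intel's IA32_ARCH_CAPABILITIES definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative macro names. The MSR mask is 64-bit, as the commit notes. */
#define CPUID7_EDX_ARCH_CAP        (1U << 29U)        /* IA32_ARCH_CAPABILITIES is enumerated */
#define ARCH_CAP_IF_PSCHANGE_MC_NO (UINT64_C(1) << 6) /* assumed bit position (bit 6) */

/* Returns true when the large-page workaround should stay enabled. */
static bool mce_on_psc_mitigation_needed(bool known_immune_model,
					 uint32_t cpuid7_edx, uint64_t arch_caps)
{
	bool needed = true;

	if (known_immune_model) {
		/* e.g. a model on the known-immune list */
		needed = false;
	} else if (((cpuid7_edx & CPUID7_EDX_ARCH_CAP) != 0U) &&
		   ((arch_caps & ARCH_CAP_IF_PSCHANGE_MC_NO) != 0U)) {
		/* both bits are 1: the CPU reports that it is not vulnerable */
		needed = false;
	}
	return needed;
}

/* Usage examples; every other combination falls back to "may be vulnerable". */
static void mce_on_psc_examples(void)
{
	assert(mce_on_psc_mitigation_needed(false, 0U, 0U));
	assert(!mce_on_psc_mitigation_needed(true, 0U, 0U));
	assert(!mce_on_psc_mitigation_needed(false, CPUID7_EDX_ARCH_CAP,
					     ARCH_CAP_IF_PSCHANGE_MC_NO));
}
```

The MSR bit alone is not trusted unless CPUID reports that the MSR exists, which is why both conditions are combined.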
Shuo A Liu	3cb32bb6e3	hv: make init_vmcs as a event of VCPU After changing init_vmcs to smp call approach and do it before launch_vcpu, it could work with noop scheduler. On real sharing scheduler, it has problem. pcpu0 pcpu1 pcpu1 vmBvcpu0 vmAvcpu1 vmBvcpu1 vmentry init_vmcs(vmBvcpu1) vmexit->do_init_vmcs corrupt current vmcs vmentry fail launch_vcpu(vmBvcpu1) This patch mark a event flag when request vmcs init for specific vcpu. When it is running and checking pending events, will do init_vmcs firstly. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:43 +08:00
Victor Sun	15da33d8af	HV: parse default pci mmcfg base The default PCI mmcfg base is stored in ACPI MCFG table, when CONFIG_ACPI_PARSE_ENABLED is set, acpi_fixup() function will parse and fix up the platform mmcfg base in ACRN boot stage; when it is not set, platform mmcfg base will be initialized to DEFAULT_PCI_MMCFG_BASE which generated by acrn-config tool; Please note we will not support platform which has multiple PCI segment groups. Tracked-On: #4157 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:24 +08:00
Conghui Chen	e61412981d	hv: support xsave in context switch xsave area: legacy region: 512 bytes xsave header: 64 bytes extended region: < 3k bytes So, pre-allocate 4k area for xsave. Use certain instruction to save or restore the area according to hardware xsave feature set. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Li Fei1	6ee076f7df	hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry does. ptirq_msix_remap only did the preparation. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-29 08:53:07 +08:00
Sainath Grandhi	422330d4ab	HV: reimplement PCI device discovery Major changes: 1. Correct handling of device multi-function capability We only check function zero for this feature. If it has it, we continue looking at all remaining functions, ignoring those with invalid vendors. The PCI spec says we are not to probe beyond function zero if it does not exist or indicates it is not a multi-function device. 2a. Walk ALL buses in the PCI space, however, Before walking the PCI hierarchy, post-processed ACPI DMAR info is parsed and a map is created between all device-scopes across all DRHDs and the corresponding IOMMU index. This map is used at the time of walking the PCI hierarchy. If a BDF that ACRN is currently working on, is found in the above-mentioned map, the BDF device is mapped to the corresponding DRHD in the map. If the BDF were a bridge type, realized with "Header Type" in config space, the BDF device along with all its downstream devices are mapped to the corresponding DRHD in the map. To avoid walking previously visited buses, we maintain a bitmap that stores which bus is walked when we handle Bridge type devices. Once ACPI information is included into ACRN about the PCI-Express Root Complexes / PCI Host Bridges, we can avoid the final loop which probes all remainder buses, and instead jump to the next Host Bridge bus. From prior patches, init_pdev returns the pdev structure it created to the caller. This allows us to complete initialization by updating its drhd_idx to the correct DRHD. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
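The multi-function rule in point 1 of the commit above can be sketched as follows. This is an illustration only, not ACRN's discovery code: the function name is hypothetical, and the `vendor_ids` array and `hdr_type_func0` argument stand in for real configuration-space reads. It assumes an absent function reads back vendor ID 0xFFFF and that bit 7 of the Header Type register marks a multi-function device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_INVALID 0xFFFFU /* value read back when no function responds */
#define PCI_HDR_MULTI_FUNC 0x80U   /* bit 7 of the Header Type register */
#define PCI_MAX_FUNCS      8U

/* Counts the functions of one device that discovery would initialize. */
static uint8_t count_functions_to_init(const uint16_t vendor_ids[PCI_MAX_FUNCS],
				       uint8_t hdr_type_func0)
{
	uint8_t count;
	uint8_t func;

	assert(vendor_ids != NULL);
	if (vendor_ids[0] == PCI_VENDOR_INVALID) {
		return 0U; /* function 0 is absent: do not probe further */
	}

	count = 1U;
	if ((hdr_type_func0 & PCI_HDR_MULTI_FUNC) != 0U) {
		/* Function 0 is multi-function: check functions 1..7, skipping holes. */
		for (func = 1U; func < PCI_MAX_FUNCS; func++) {
			if (vendor_ids[func] != PCI_VENDOR_INVALID) {
				count++;
			}
		}
	}
	return count;
}
```

For a device whose functions 0 and 2 respond, the sketch returns 2 when function 0 sets the multi-function bit and 1 when it does not.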
Alexander Merritt	ea131eea41	HV: add DRHD index to pci_pdev We add new member pci_pdev.drhd_idx associating the DRHD (IOMMU) with this pdev, and a method to convert a pbdf of a device to this index by searching the pdev list. Partial patch: drhd_index initialization handled in subsequent patch. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	0b7bcd6408	HV: extra methods for extracting header fields Add some encapsulation of utilities which read PCI header space using wrapper functions. Also contain verification of PCI vendor to its own function, rather than having hard-coded integrals exposed among other code. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Mingqiang Chi	bd0dbd274d	hv:add dump_guest_mem add shell command to support dumping guest memory e.g. dump_guest_mem vm_id, gva, length Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-26 10:58:19 +08:00
Li Fei1	c049c5c965	hv: vpci: reshuffle pci_bar structure The current code declares the pci_bar structure following the PCI BAR spec. However, we could not tell whether the value in virtual BAR configuration space is a valid base address based on the current pci_bar structure. We need to add more fields which are duplicated instances of the vBAR information. Besides these fields which will be added, bar_base_mapped is another duplicated instance of the vBAR information. This patch reshuffles the pci_bar structure so that it is declared to suit the software implementation rather than the PCI BAR spec. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-15 13:54:21 +08:00
Sainath Grandhi	22a1bd6948	hv: Fix the definition of struct representing interrupt hw frame In 64-bit mode, processor pushes SS and RSP onto stack unconditionally. Also when dumping the exception info, it makes more sense to dump the RSP at the point of interrupt, rather than the RSP after pushing context (including GPRs) Tracked-On: #4102 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 16:06:35 +08:00
Binbin Wu	fa3888c12a	hv: ept: disable execute right on large pages Issue description: ----------------- Machine Check Error on Page Size Change Instruction fetch may cause machine check error if page size and memory type was changed without invalidation on some processors[1][2]. Malicious guest kernel could trigger this issue. This issue applies to both primary page table and extended page tables (EPT), however the primary page table is controlled by hypervisor only. This patch mitigates the situation in EPT. Mitigation details: ------------------ Implement non-execute huge pages in EPT. This patch series clears the execute permission (bit 2) in the EPT entries for large pages. When EPT violation is triggered by guest instruction fetch, hypervisor converts the large page to smaller 4 KB pages and restore the execute permission, and then re-execute the guest instruction. The current patch turns on the mitigation by default. The follow-up patches will conditionally turn on/off the feature per processor model. [1] Refer to erratum KBL002 in "7th Generation Intel Processor Family and 8th Generation Intel Processor Family for U Quad Core Platforms Specification Update" https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf [2] Refer to erratum SKL002 in "6th Generation Intel Processor Family Specification Update" https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 08:00:36 +08:00
Victor Sun	3411f00b5b	HV: fix misra violation on platform clos array MISRA C requires specified bounds for arrays declaration, previous declaration of platform_clos_array in board.h does not meet the requirement. Tracked-On: #3987 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	9e92f3cdf5	HV: move dmar info definition to board.c The DMAR info is board specific so move the structure definition to board.c. As a configuration file, the whole board.c could be generated by acrn-config tool for each board. Please note we only provide DMAR info MACROs for nuc7i7dnb board. For other boards, ACPI_PARSE_ENABLED must be set to y in Kconfig to let hypervisor parse DMAR info, or use acrn-config tool to generate DMAR info MACROs if user won't enable ACPI parse code for FuSa consideration. The patch also moves the function of get_dmar_info() to vtd.c, so dmar_info.c could be removed. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	589be88cf6	HV: link CONFIG_MAX_IOMMU_NUM and MAX_DRHDS to DRHD_COUNT The value of CONFIG_MAX_IOMMU and MAX_DRHDS are identical to DRHD_COUNT which defined in platform ACPI table, so remove CONFIG_MAX_IOMMU_NUM from Kconfig and link these three MACROs together. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Conghui Chen	75f512ce8c	hv: rename vuart operations fifo_reset -> reset_fifo vuart_fifo_init -> init_fifo vuart_setup -> setup_vuart vuart_init -> init_vuart vuart_deinit -> deinit_vuart vuart_lock_init -> init_vuart_lock vuart_lock -> obtain_vuart_lock vuart_unlock -> release_vuart_lock vuart_deinit_connect -> vuart_deinit_connection Tracked-On: #4017 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 09:01:01 +08:00
Li Fei1	620a1c5215	hv: mmu: rename e820 to hv_e820 Now the e820 structure stores the ACRN HV memory layout, not the physical memory layout. Rename e820 to hv_e820 to show this explicitly. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Li, Fei1	9d26dab6d6	hv: mmio: add a lock to protect mmio_node access After adding PCI BAR remap support, mmio_node may unregister when there's others access it. This patch add a lock to protect mmio_node access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	21cb120bcc	hv: vpci: add a global PCI lock for each VM Concurrent access on PCI device may happened if UOS try to access PCI configuration space on different vCPUs through IO port. This patch just adds a global PCI lock for each VM to prevent the concurrent access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	f711d3a639	hv: vpci: define PCI CONFIG_ADDRESS Register as its physical layout Refine PCI CONFIG_ADDRESS Register definition as its physical layout. In this case, we could read/write PCI CONFIG_ADDRESS Register atomically. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
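Treating CONFIG_ADDRESS as one 32-bit value in its physical layout, as the commit above describes, can be sketched with plain shifts and masks. The helper names are hypothetical and this is not the ACRN definition (which may use a union or bitfields). The bit positions are those of the standard PCI configuration mechanism: bit 31 enable, bits 23:16 bus, 15:11 device, 10:8 function, 7:2 register.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CFG_ENABLE 0x80000000U /* bit 31: enable configuration access */

/* Packs bus/device/function/register into one CONFIG_ADDRESS value. */
static uint32_t pci_cfg_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t reg)
{
	assert((dev < 32U) && (func < 8U));
	return PCI_CFG_ENABLE |
	       ((uint32_t)bus << 16U) |
	       ((uint32_t)dev << 11U) |
	       ((uint32_t)func << 8U) |
	       ((uint32_t)reg & 0xFCU); /* bits 1:0 are always zero */
}

/* Extracts the bus number back out of a CONFIG_ADDRESS value. */
static uint8_t pci_cfg_address_bus(uint32_t addr)
{
	return (uint8_t)((addr >> 16U) & 0xFFU);
}
```

Because the whole register is a single `uint32_t`, a guest write to it can be stored or loaded in one access, which is the atomicity the commit message refers to. For example, bus 1, device 2, function 3, register 0x10 packs to 0x80011310.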
Li, Fei1	2c158d5ad4	hv: io: add unregister_mmio_emulation_handler API Since guest could re-program PCI device MSI-X table BAR, we should add mmio emulation handler unregister. However, after add unregister_mmio_emulation_handler API, emul_mmio_regions is no longer accurate. Just replace it with max_emul_mmio_regions which records the max index of the emul_mmio_node. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-29 14:49:55 +08:00
Li, Fei1	dc1e2adaec	hv: vpci: add PCI BAR re-program address check In theory, guest could re-program PCI BAR address to any address. However, ACRN hypervisor only support [0, top_address_space) EPT memory mapping. So we need to check whether the PCI BAR re-program address is within this scope. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-29 14:49:55 +08:00
Shuo A Liu	5f8e7a6cb7	hv: sched: add kick_thread to support notification kick means to notify one thread_object. If the target thread object is running, send a IPI to notify it; if the target thread object is runnable, make reschedule on it. Also add kick_vcpu API in vcpu layer to notify vcpu. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Conghui Chen	810305be98	hv: sched: disable interrupt when grab schedule spinlock After moving softirq to following interrupt path, softirq handler might break in the schedule spinlock context and try to grab the lock again, then deadlock. Disable interrupt with schedule spinlock context. For the IRQ disable/restore operations: CPU_INT_ALL_DISABLE(&rflag) CPU_INT_ALL_RESTORE(rflag) each takes 50~60 cycles. renaming: get_schedule_lock -> obtain_schedule_lock Tracked-On: #3813 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	f04c491259	hv: sched: decouple scheduler from schedule framework This patch decouples some scheduling logic and abstracts it into a scheduler. Then we have a scheduler and a schedule framework. From a modularization perspective, the schedule framework provides some APIs for other layers to use, and also interacts with the scheduler through scheduler interfaces. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	cad195c018	hv: sched: add pcpu_id in sched_control To get pcpu_id from sched_control quickly and easier. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-10-25 13:00:21 +08:00
Yonghua Huang	2e62ad9574	hv[v2]: remove registration of default port IO and MMIO handlers - The default behaviors of PIO & MMIO handlers are the same for all VMs, so there is no need to expose dedicated APIs to register default handlers for SOS and prelaunched VM. Tracked-On: #3904 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2019-10-24 13:21:19 +08:00
Mingqiang Chi	d81872ba18	hv:Change the function parameter for init_ept_mem_ops Currently the parameter of init_ept_mem_ops is 'struct acrn_vm vm' for this api,change it to 'struct memory_ops mem_ops' and 'vm_id' to avoid the reversed dependency, page.c is hardware layer and vm structure is its upper-layer stuff. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:48:30 +08:00
Shuo A Liu	0f70a5ca3a	hv: sched: decouple idle stuff from schedule module Let init thread end with run_idle_thread(), then idle thread take over and start to do scheduling. Change enter_guest_mode() to init_guest_mode() as run_idle_thread() is removed out of it. Also add run_thread() in schedule module to run thread_object's thread loop directly. rename: switch_to_idle -> run_idle_thread Tracked-On: #3813 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	27163df9b1	hv: sched: add sleep/wake for thread object sleep one thread_object means to prevent it from being scheduled. wake one thread_object is an opposite operation of sleep. This patch also add notify_mode in thread_object to indicate how to deliver the request. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	9b8c6e6a90	hv: sched: add status for thread_object Now, we have three valid status for thread_object: THREAD_STS_RUNNING, THREAD_STS_RUNNABLE, THREAD_STS_BLOCKED. This patch also provide several helpers to check the thread's status and a status set wrapper function. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	fafd5cf063	hv: sched: move schedule initialization to each pcpu init schedule infrastructure is per pcpu, so move its initialization to each pcpu's initialization. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	dadcdcefa0	hv: sched: support vcpu context switch on one pcpu To support cpu sharing, multiple vcpu can run on same pcpu. We need do necessary vcpu context switch. This patch add below actions in context switch. 1) fxsave/fxrstor; 2) save/restore MSRs: MSR_IA32_STAR, MSR_IA32_LSTAR, MSR_IA32_FMASK, MSR_IA32_KERNEL_GS_BASE; 3) switch vmcs. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	7e66c0d4fa	hv: sched: use get_running_vcpu to replace per_cpu vcpu with cpu sharing With cpu sharing enabled, per_cpu vcpu cannot work properly as we might has multiple vcpus running on one pcpu. Add a schedule API sched_get_current to get current thread_object on specific pcpu, also add a vcpu API get_running_vcpu to get corresponding vcpu of the thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	891e46453d	hv: sched: move pcpu_id from acrn_vcpu to thread_object With cpu sharing enabled, we will map acrn_vcpu to thread_object in scheduling. From a modularization perspective, we'd better hide the pcpu_id in acrn_vcpu and move it to thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Jian Jun Chen	1d194ede61	hv: support reference time enlightenment Two time related synthetic MSRs are implemented in this patch. Both of them are partition wide MSR. - HV_X64_MSR_TIME_REF_COUNT is read only and it is used to return the partition's reference counter value in 100ns units. - HV_X64_MSR_REFERENCE_TSC is used to set/get the reference TSC page, a sequence number, an offset and a multiplier are defined in this page by hypervisor and guest OS can use them to calculate the normalized reference time since partition creation, in 100ns units. Tracked-On: #3831 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
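How a guest might turn the reference TSC page into a time value can be sketched as below. This is an illustration under stated assumptions, not ACRN or guest code: the struct is a simplified stand-in for the page, with hypothetical field names. The formula is the one the Hyper-V TLFS gives for the reference TSC page, time = ((tsc * scale) >> 64) + offset, in 100ns units. `unsigned __int128` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the reference TSC page contents. */
struct ref_tsc_page_sketch {
	uint32_t sequence; /* sequence number; its handling is omitted in this sketch */
	uint64_t scale;    /* 64.64 fixed-point multiplier */
	int64_t offset;    /* signed offset in 100ns units */
};

/* Converts a raw TSC reading into reference time in 100ns units. */
static uint64_t reference_time_100ns(const struct ref_tsc_page_sketch *page, uint64_t tsc)
{
	unsigned __int128 product;

	assert(page != NULL);
	/* Keep the upper 64 bits of the 128-bit product. */
	product = (unsigned __int128)tsc * page->scale;
	return (uint64_t)(product >> 64) + (uint64_t)page->offset;
}
```

With `scale` set to 2^63 (a multiplier of one half) and `offset` 5, a TSC reading of 1000 yields 505. A real guest would also check the sequence number before and after reading the page to detect concurrent updates.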
wenwumax	048155d3d6	hv: support minimum set of TLFS This patch implements the minimum set of TLFS functionality. It includes 6 vCPUID leaves and 3 vMSRs. - 0x40000001 Hypervisor Vendor-Neutral Interface Identification - 0x40000002 Hypervisor System Identity - 0x40000003 Hypervisor Feature Identification - 0x40000004 Implementation Recommendations - 0x40000005 Hypervisor Implementation Limits - 0x40000006 Implementation Hardware Features - HV_X64_MSR_GUEST_OS_ID Reporting the guest OS identity - HV_X64_MSR_HYPERCALL Establishing the hypercall interface - HV_X64_MSR_VP_INDEX Retrieve the vCPU ID from hypervisor Tracked-On: #3832 Signed-off-by: wenwumax <wenwux.ma@intel.com> Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
Mingqiang Chi	292d1a15f9	hv:Wrap some APIs related with guest pm -- change some APIs to static -- combine two APIs to init_guest_pm Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-21 10:13:02 +08:00
Shuo A Liu	de157ab96c	hv: sched: remove runqueue from current schedule logic Currently we are using a 1:1 mapping logic for pcpu:vcpu. So don't need a runqueue for it. Removing it as preparation work to abstract scheduler framework. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 10:25:53 +08:00
Shuo A Liu	837e4d8788	hv: sched: rename schedule related structs and vars prepare_switch_out -> switch_out prepare_switch_in -> switch_in prepare_switch -> do_switch run_thread_t -> thread_entry_t sched_object -> thread_object sched_object.thread -> thread_object.thread_entry sched_obj -> thread_obj sched_context -> sched_control sched_ctx -> sched_ctl Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 10:25:53 +08:00
Binbin Wu	d19592a33e	hv: vmsr: disable prmrr related msrs in vm PRMRR related MSRs need to be configured by platform BIOS / bootloader. These settings are not allowed to be changed by guest. VMs currently have no requirement to access these MSRs even when vSGX is enabled. So, this patch disables PRMRR related MSRs in VM. Tracked-On: #3739 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-10-15 15:13:11 +08:00
Mingqiang Chi	de0a5a48d6	hv:remove some unnecessary includes --remove unnecessary includes --remove unnecessary forward-declaration for 'struct vhm_request' Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-15 14:40:39 +08:00
Li, Fei1	0dac373d93	hv: vpci: remove pci_msi_cap in pci_pdev The MSI Message Address and Message Data have no valid data after Power-ON. So there's no need to initialize them by reading the data from physical PCI configuration space. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-14 15:09:03 +08:00
Shiqing Gao	c8bcab9006	hv: pci: update function "bdf_is_equal" - update the function argument type to union Declaring argument as pointer is not necessary since it only does the comparison. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-09-25 13:45:39 +08:00
Shiqing Gao	658fff27b4	hv: pci: update "union pci_bdf" - add one more field in "union pci_bdf" - remove following interfaces: * pci_bus * pci_slot * pci_func * pci_devfn Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-09-25 13:45:39 +08:00
Shuo A Liu	9a23ec6b5a	hv: remove unused pcpu assignment functions As we introduced vcpu_affinity[] to assign vcpus to different pcpus, the old policy and functions are not needed. Remove them. Tracked-On: #3663 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	1c526e6d16	hv: use vcpu_affinity[] in vm_config to support vcpu assignment Add this vcpu_affinity[] for each VM to indicate the assignment policy. With it, pcpu_bitmap is not needed, so remove it from vm_config. Instead, vcpu_affinity is a must for each VM. This patch also add some sanitize check of vcpu_affinity[]. Here are some rules: 1) only one bit can be set for each vcpu_affinity of vcpu. 2) two vcpus in same VM cannot be set with same vcpu_affinity. 3) vcpu_affinity cannot be set to the pcpu which used by pre-launched VM. v4: config SDC with CONFIG_MAX_KATA_VM_NUM v5: config SDC with CONFIG_MAX_PCPU_NUM Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
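The three sanity rules listed in the commit above can be sketched as one validation pass. This is a minimal illustration, not the ACRN implementation: the function name and parameters are hypothetical, each affinity entry is assumed to be a 64-bit pCPU bitmap, and `prelaunched_pcpus` is assumed to be a bitmap of pCPUs already taken by pre-launched VMs, supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when every entry obeys the three rules from the commit message. */
static bool vcpu_affinity_is_sane(const uint64_t *affinity, uint16_t vcpu_num,
				  uint64_t prelaunched_pcpus)
{
	uint64_t used = 0U; /* pCPUs already claimed by earlier vCPUs of this VM */
	uint16_t i;

	assert(affinity != NULL);
	for (i = 0U; i < vcpu_num; i++) {
		uint64_t mask = affinity[i];

		/* Rule 1: exactly one bit set (non-zero and a power of two). */
		if ((mask == 0U) || ((mask & (mask - 1U)) != 0U)) {
			return false;
		}
		/* Rule 2: no two vCPUs of the same VM share a pCPU. */
		if ((mask & used) != 0U) {
			return false;
		}
		/* Rule 3: the pCPU must not belong to a pre-launched VM. */
		if ((mask & prelaunched_pcpus) != 0U) {
			return false;
		}
		used |= mask;
	}
	return true;
}
```

For example, the array `{0x1, 0x2}` passes when pCPU 2 (`0x4`) is reserved, while `{0x3}`, `{0x2, 0x2}`, and `{0x4}` each violate one rule.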
Shuo A Liu	ca2540fe8c	hv: return pre-defined vcpu_num from HV to upper layer There is plan that define each VM configuration statically in HV and let DM just do VM creating and destroying. So DM need get vcpu_num information when VM creating. This patch return the vcpu_num via the API param. And also initial the VMs' cpu_num for existing scenarios. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	d588703976	hv: Add a helper to account bitmap weight Sometimes we need know the number of 1 in one bitmap. This patch provide a inline function bitmap_weight for it. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
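A bit-counting helper of the kind described in the commit above can be sketched as follows. The name `bitmap_weight_sketch` and the loop are illustrative assumptions. The actual ACRN inline function may be written differently, for instance with a compiler builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the bits set to 1 in a 64-bit bitmap. */
static inline uint16_t bitmap_weight_sketch(uint64_t bits)
{
	uint16_t weight = 0U;

	/* Kernighan's method: each iteration clears the lowest set bit. */
	while (bits != 0U) {
		bits &= (bits - 1U);
		weight++;
	}
	assert(weight <= 64U); /* a 64-bit bitmap cannot hold more than 64 ones */
	return weight;
}
```

For instance, `bitmap_weight_sketch(0xF0)` returns 4, which is how a caller could count the pCPUs selected in a bitmap.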
Shuo A Liu	f4ce9cc4a2	hv: make hypercall HC_CREATE_VCPU empty Now, we create vcpus while VM being created in hypervisor. The create vcpu hypercall will not be used any more. For compatibility, keep the hypercall HC_CREATE_VCPU do nothing. v4: Don't remove HC_CREATE_VCPU hypercall, let it do nothing. Tracked-On: #3663 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
Mingqiang Chi	489937f7b8	hv:check pcpu numbers during init_pcpu_pre it will panic if phys_cpu_num > CONFIG_MAX_PCPU_NUM during init_pcpu_pre,after that no need to check it again. Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-24 09:02:05 +08:00
Li, Fei1	535a83e24b	hv: vpci: refine vPCI BAR initialization Initialize vBAR configure space when doing vPCI BAR initialization. At this time, we access the physical device as we needs, no need to cache physical PCI device BAR information beforehand. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-09-23 11:16:48 +08:00
Qi Yadong	3ebeecf060	hv: save/restore TSC in host's suspend/resume path TSC would be reset to 0 when enter suspend state on some platform. This will fail the secure timer checking in secure world because secure world leverage the TSC as source of secure timer which should be increased monotonously. This patch save/restore TSC in host suspend/resume path to guarantee the mono increasing TSC. Note: There should no timer setup before TSC resumed. Tracked-On: #3697 Signed-off-by: Qi Yadong <yadong.qi@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-19 13:50:50 +08:00
Manisha	489a0e645d	SEP/SOCWATCH change variable names To follow ACRN coding guideline: identifier cannot be reused. in profiling_info_wrapper struct, modified member names: pmu_sample -> p_sample sep_state -> s_state sw_msr_op_info -> sw_msr_info vm_switch_trace -> vm_trace Tracked-On: #3598 Acked-by: min.yeol.lim@intel.com Signed-off-by: Manisha <manisha.chinthapally@intel.com>	2019-09-16 15:54:34 +08:00
Mingqiang Chi	60adef33d3	hv:move down structures run_context and ext_context Now the structures(run_context & ext_context) are defined in vcpu.h,and they are used in the lower-layer modules(wakeup.S), this patch move down the structures from vcpu.h to cpu.h to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Mingqiang Chi	4f98cb03a7	hv:move down the structure intr_source Now the structures(union source & struct intr_source) are defined in ptdev.h,they are used in vtd.c and assign.c, vtd is the hardware layer and ptdev is the upper-layer module from the modularization perspective, this patch move down these structures to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Shuo A Liu	4742d1c747	hv: ptdev: move softirq_dev_entry_list from vm structure to per_cpu region Using per_cpu list to record ptdev interrupts is more reasonable than recording them per-vm. It makes dispatching such interrupts more easier as we now do it in softirq which happens following interrupt context of each pcpu. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 09:36:52 +08:00
Shuo A Liu	2cc45534d6	hv: move pcpu offline request and vm shutdown request from schedule From a modularization perspective, it's not suitable to put pcpu and vm related request operations in schedule. So move them to the pcpu and vm modules respectively. Also change need_offline return value to bool. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-16 09:36:52 +08:00
Yin Fengwei	6b6aa80600	hv: pm: fix coding style issue This patch fix the coding style issue introduced by previous two patches. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-09-11 17:30:24 +08:00
Yin Fengwei	f039d75998	hv: pm: enhancement platform S5 entering operation Now, we have the assumption that SOS controls whether the platform should enter S5 or not. So when SOS tries to enter S5, we just forward the S5 request to the native port, which makes sure platform S5 is totally aligned with SOS S5. With higher severity guests introduced, this assumption is not true any more. We need to extend the platform S5 process to handle higher severity guests: - For DM launched RTVM, we need to make sure these guests are off before putting the whole platform into S5. - For pre-launched VM, there are two cases: * if the OS running in it supports S5, we wait for the guests to be off. * if the OS running in it doesn't support S5, we expect it will invoke one hypercall to notify HV to shut it down. NOTE: this case is not supported yet. Will add it in the future. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-09-11 17:30:24 +08:00
Li, Fei1	6ebc22210b	hv: vPCI: cache PCI BAR physical base address PCI BAR physical base address will never changed. Cache it to avoid calculating it every time when we access it. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2019-09-11 13:17:42 +08:00
Mingqiang Chi	c691c5bd3c	hv:add volatile keyword for some variables pcpu_active_bitmap was read continuously in wait_pcpus_offline(), acrn_vcpu->running was read continuously in pause_vcpu(), add volatile keyword to ensure that such accesses are not optimised away by the compiler. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-10 11:26:35 +08:00
Mingqiang Chi	cd40980d5f	hv:change function parameter for invept change the input parameter from vcpu to eptp in order to let this api more generic, no need to care normal world or secure world. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-09-05 16:32:30 +08:00
Binbin Wu	cd1ae7a89e	hv: cat: isolate hypervisor from rtvm Currently, the clos id of the cpu cores in vmx root mode is the same as non-root mode. For RTVM, if hypervisor share the same clos id with non-root mode, the cacheline may be polluted due to the hypervisor code execution when vmexit. The patch adds hv_clos in vm_configurations.c Hypervisor initializes clos setting according to hv_clos during physical cpu cores initialization. For RTVM, MSR auto load/store areas are used to switch different settings for VMX root/non-root mode for RTVM. Tracked-On: #2462 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:59:13 +08:00
Mingqiang Chi	38ca8db19f	hv:tiny cleanup -- remove some unnecessary includes -- fix a typo -- remove unnecessary void before launch_vms Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-05 09:58:47 +08:00
Yan, Like	3f84acda09	hv: add "invariant TSC" cap detection ACRN HV is designed/implemented with "invariant TSC" capability, which wasn't checked at boot time. This commit adds the "invariant TSC" detection; ACRN fails to boot if the "invariant TSC" capability is absent. Tracked-On: #3636 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:58:16 +08:00
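On x86, invariant TSC is reported in CPUID leaf 0x80000007, EDX bit 8. A minimal sketch of the bit test follows. The function name is hypothetical, and the EDX value is passed in as a parameter so the check stays portable; the commit message does not show how the hypervisor itself performs the detection.

```c
#include <stdbool.h>
#include <stdint.h>

/* EDX bit 8 of CPUID leaf 0x80000007 signals invariant TSC. */
#define CPUID_EDX_INVARIANT_TSC (1U << 8)

/*
 * Hypothetical helper: takes the EDX value already read from
 * CPUID leaf 0x80000007 and reports whether invariant TSC is present.
 */
static inline bool has_invariant_tsc(uint32_t cpuid_80000007_edx)
{
    return (cpuid_80000007_edx & CPUID_EDX_INVARIANT_TSC) != 0U;
}
```

A boot-time caller would read the leaf with the CPUID instruction and stop booting when this returns false, which matches the behaviour the commit describes.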
dongshen	295701cc55	hv: remove mptable code for pre-launched VMs Now that ACPI is enabled for pre-launched VMs, we can remove all mptable code. Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
dongshen	b447ce3d86	hv: add ACPI support for pre-launched VMs Statically define the per vm RSDP/XSDT/MADT ACPI template tables in vacpi.c, RSDP/XSDT tables are copied to guest physical memory after checksum is calculated. For MADT table, first fix up process id/lapic id in its lapic subtable, then the MADT table's checksum is calculated before it is copied to guest physical memory. Add 8-bit checksum function in util.h Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
dongshen	96b422ce9d	hv: create 8-bit sum function Move 8-bit sum code to a separate new function calculate_sum8() in util.h and replace the old code with a call to calculate_sum8() Minor code cleanup in found_rsdp() to make it more readable. Both break and continue statements are used in a single for loop, changed to only use break statement to make the logic simpler. Fixed some coding style issues reported by checkpatch.pl for file hypervisor/boot/acpi_base.c Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
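The 8-bit sum and the ACPI checksum built on it can be sketched as below. ACPI tables are valid when all their bytes, checksum byte included, sum to zero modulo 256. The name `calculate_sum8` comes from the commit message; its signature here and the companion `checksum8_for` helper are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum all bytes of a buffer modulo 256 (signature assumed). */
static inline uint8_t calculate_sum8(const void *buf, size_t length)
{
    const uint8_t *bytes = (const uint8_t *)buf;
    uint8_t sum = 0U;

    for (size_t i = 0U; i < length; i++) {
        sum = (uint8_t)(sum + bytes[i]);
    }
    return sum;
}

/*
 * Hypothetical companion: the byte that makes the total sum zero.
 * The table's checksum field must be 0 while this is computed.
 */
static inline uint8_t checksum8_for(const void *buf, size_t length)
{
    return (uint8_t)(0U - calculate_sum8(buf, length));
}
```

After storing the result of `checksum8_for` in the checksum field, `calculate_sum8` over the whole table returns 0, which is the condition a guest's ACPI parser verifies.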
Binbin Wu	5c81659713	hv: ept: flush cache for modified ept entries EPT tables are shared by MMU and IOMMU. Some IOMMUs don't support page-walk coherency, so the cpu cache of EPT entries should be flushed to memory after modifications, so that the modifications are visible to the IOMMUs. This patch adds a new interface to flush the cache of modified EPT entries. There are different implementations for EPT/PPT entries: - For PPT, there is no need to flush the cpu cache after update. - For EPT, need to call iommu_flush_cache to make the modifications visible to IOMMUs. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Binbin Wu	2abd8b34ef	hv: vtd: export iommu_flush_cache VT-d shares the EPT tables as the second level translation tables. For the IOMMUs that don't support page-walk coherency, cpu cache should be flushed for the IOMMU EPT entries that are modified. For the current implementation, EPT tables for translating from GPA to HPA for EPT/IOMMU are not modified after VM is created, so cpu cache invalidation is done once per VM before starting execution of VM. However, this may be changed, runtime EPT modification is possible. When the cpu cache of EPT entries is invalidated on modification, there is no need to invalidate the cpu cache globally per VM. This patch exports iommu_flush_cache for EPT entry cache invalidation operations. - IOMMUs share the same copy of EPT table, cpu cache should be flushed if any of the active IOMMUs doesn't support page-walk coherency. - In the context of ACRN, GPA to HPA mapping relationship is not changed after VM created, skip flushing iotlb to avoid potential performance penalty. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Mingqiang Chi	2310d99ebf	hv: cleanup vmcs.h -- move 'RFLAGS_AC' to cpu.h -- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST' -- move 'get_vcpu_mode' to vcpu.h -- remove deadcode 'vmx_eoi_exit()' Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-22 14:13:15 +08:00
Mingqiang Chi	bd09f471a6	hv:move some APIs related host reset to pm.c move some data structures and APIs related host reset from vm_reset.c to pm.c, these are not related with guest. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-22 14:09:18 +08:00
Yin Fengwei	6beb34c3cb	vm_load: update init gdt preparation Now, we use the native gdt saved in boot context for guest and assume it could be put at the same address in the guest. But that may not be true after the pre-launched VM is introduced. The gdt for guest could be overwritten by guest images. This patch makes 32bit protected mode boot not use the saved boot context. Instead, we use predefined vcpu_regs values for protected mode guests to initialize the guest bsp registers and copy the pre-defined gdt table to a safe place in guest memory to avoid the gdt table being overwritten by guest images. Tracked-On: #3532 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-20 09:22:20 +08:00
Yonghua Huang	700a37856f	hv: remove 'flags' field in struct vm_io_range Currently, 'flags' is defined and set but never be used in the flow of handling i/o request after then. Tracked-On: #861 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-19 10:19:54 +08:00

1428 Commits