acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-07-11 14:24:11 +00:00

Author	SHA1	Message	Date
Liang Yi	c46e3c71ac	hv/mod_irq: decouple irq number reservation from ioapic This is done by adding irq_rsvd_bitmap as an auxiliary bitmap besides irq_alloc_bitmap. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	038e0cae92	hv/mod_irq: split IRQ handling into common and arch specific parts The common IRQ handling routine calls arch specific functions pre_irq_arch() and post_irq_arch() before and after calling the registered action function respectively. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	ac3e0a1718	hv/mod_irq: split irq initialization into common and arch specific parts The common part initializes the global irq_desc data structure while the arch specific part initializes the HW and its own irq data. This is one of the preparation steps for splitting IRQ handling into common and architecture specific parts. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Liang Yi	f3cae9e258	hv/mod_irq: hide arch specific data in irq_desc Arch specific IRQ data is now an opaque pointer in irq_desc. This is a preparation step for splitting IRQ handling into common and architecture specific parts. Tracked-On: #5825 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-24 11:38:14 +08:00
Li Fei1	9000381f34	hv: pgtable: move pgtable definition to pgtable.h This patch moves pgtable definition to pgtable.h and include the proper header file for page module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	0278a3f46e	hv: pgtable: move the EPT page table related APIs to ept.c Move the EPT page table related APIs to ept.c. The page module only provides APIs to allocate/free pages for page table pages. The pagetable module only provides APIs to add/modify/delete/lookup page table entries. The page pool and the page table related APIs for EPT should be defined in the EPT module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	5c71ca456a	hv: pgtable: move the MMU page table related APIs to mmu.c Move the MMU page table related APIs to mmu.c. The page module only provides APIs to allocate/free pages for page table pages. The pagetable module only provides APIs to add/modify/delete/lookup page table entries. The page pool and the page table related APIs for MMU should be defined in the MMU module. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	15d68675e9	hv: pgtable: separate common APIs for MMU/EPT We would move the MMU page table related APIs to mmu.c and move the EPT related APIs to EPT.c. The page table module only provides APIs to add/modify/delete/lookup page table entry. This patch separates common APIs and adds separate APIs of page table module for MMU/EPT. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-03-11 13:48:52 +08:00
Li Fei1	80bd3ac02a	hv: trusty: move post_uos_sworld_memory into vm.c post_uos_sworld_memory is used for post-launched VMs which support trusty. It's more VM related, so move its definition into vm.c. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 13:48:52 +08:00
Yonghua Huang	1a011bd91b	hv: disable guest MONITOR-WAIT support when SW SRAM is configured Per-core software SRAM L2 cache may be flushed by the 'mwait' extension instruction, which a guest VM may execute to enter core deep sleep. Such flushing is not expected when software SRAM is enabled for RTVM. The hypervisor disables MONITOR-WAIT support on both the hypervisor and VM sides to protect the above software SRAM from being flushed. This patch disables ACRN guest MONITOR-WAIT support if software SRAM is configured. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Yonghua Huang	ae43b2a847	hv: disable host MONITOR-WAIT support when SW SRAM is enabled Per-core software SRAM L2 cache may be flushed by the 'mwait' extension instruction, which a guest VM may execute to enter core deep sleep. Such flushing is not expected when software SRAM is enabled for RTVM. The hypervisor disables MONITOR-WAIT support on both the hypervisor and VM sides to protect the above software SRAM from being flushed. This patch disables hypervisor (host) MONITOR-WAIT support and refines the software SRAM initialization flow. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Yonghua Huang	ea44bb6c4d	hv: wrap function to check software SRAM support Below boolean function are defined in this patch: - is_software_sram_enabled() to check if SW SRAM feature is enabled or not. - set global variable 'is_sw_sram_initialized' to file static. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-11 09:42:44 +08:00
Li Fei1	768e483cd2	hv: pgtable: rename 'struct memory_ops' to 'struct pgtable' The fields and APIs in old 'struct memory_ops' are used to add/modify/delete page table (page or entry). So rename 'struct memory_ops' to 'struct pgtable'. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-10 11:42:13 +08:00
Li Fei1	ef98fa69ce	hv: pgtable: remove get_default_access_right API Use default_access_right field to replace get_default_access_right API. Tracked-On: #5830 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-03-10 11:42:13 +08:00
Li Fei1	7c6a52037a	refine ept_flush_leaf_page Refine the logic how to skip the pSRAM region when flushing cache. Tracked-On: #5330 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-03-03 14:44:25 +08:00
Li Fei1	1db32f4d03	hv: ept: build 4KB page mapping in EPT for code pages of rtvm RTVM is enforced to use 4KB pages to mitigate CVE-2018-12207 and performance jitter, which may be introduced by splitting large pages into 4KB pages on demand. This works fine on previous hardware platforms where the size of the address space for the RTVM is relatively small. However, this is a problem when the platform supports 64-bit high MMIO space, which could be very large and therefore consume a large number of EPT page table pages. This patch optimizes it by using large pages for purely data pages, such as MMIO spaces, even for the RTVM. Signed-off-by: Li Fei1 <fei1.li@intel.com> Tracked-On: #5788	2021-03-03 13:46:49 +08:00
Li Fei1	01b54241c6	hv: ept: only tweak execution right for large pages To mitigate the page size change MCE vulnerability (CVE-2018-12207), ACRN would clear the execution permission in the EPT paging-structure entries for large pages and then intercept an EPT execution-permission violation caused by an attempt to execute an instruction in the guest. However, the current code would clear the execution permission in the EPT paging-structure entries for small pages too when clearing the execution permission for large pages. This would trigger extra EPT violation VM exits. This patch fixes this issue. Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Tracked-On: #5788	2021-03-03 13:46:49 +08:00
Li Fei1	97a9c5151b	hv: kconfig: remove some unused ram size kconfig The SOS_RAM_SIZE/UOS_RAM_SIZE Kconfig options are only used to calculate how many pages we should reserve for the VM EPT mapping. Now we reserve pages for each VM EPT pagetable mapping by PLATFORM_RAM_SIZE, not the VM RAM size. This simplifies the reserve logic: there is no need to take care of various corner cases. We can assume we reserve enough pages, because a VM cannot use resources beyond the platform hardware resources. So remove these two unused VM ram size kconfig options. Signed-off-by: Li Fei1 <fei1.li@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Li Fei1	0579e2ee24	hv: page: add free_page Add free_page to free page when unmap pagetable. Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Li Fei1	8d9f12f3b7	hv: page: use dynamic page allocation for pagetable mapping For FuSa's case, we remove all dynamic memory allocation use in ACRN HV. Instead, we use static memory allocation or embedded data structures. For pagetable pages, we prefer to use an index (hva for MMU, gpa for EPT) to get a page from a special page pool. The special page pool should be big enough for each possible index. This is not a big problem when we don't support 64-bit MMIO. Without 64-bit MMIO support, we could use the index to search addresses not larger than DRAM_SIZE + 4G. However, if ACRN plans to support 64-bit MMIO in SOS, we can not use static memory allocation any more. This is because there's a very huge hole between the top DRAM address and the bottom 64-bit MMIO address. We can not reserve that many pages for pagetable mapping as the CPU physical address bits may be very large. This patch will use dynamic page allocation for pagetable mapping. We also need to reserve a big enough page pool at first. For HV MMU, we don't use 4K granularity page table mapping, so we need to reserve PML4, PDPT and PD pages according to the maximum physical address space (PPT va and pa are identical mapping); for each VM EPT, we reserve PML4, PDPT and PD pages according to the maximum physical address space too (the EPT address space can't go beyond the physical address space), and we reserve PT pages by real use cases of DRAM, low MMIO and high MMIO. Signed-off-by: Li Fei1 <fei1.li@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Li Fei1	5621fabbcb	hv: memory: remove get_sworld_memory_base API memory_ops structure will be changed to store page table related fields. However, secure world memory base address is not one of them, it's VM related. So save sworld_memory_base_hva in vm_arch structure directly. Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Tracked-On: #5788	2021-03-01 13:10:04 +08:00
Victor Sun	26abc82f3c	HV: panic on 0 address when do e820_alloc_memory Current memory allocation algorithm is to find the available address from the highest possible address below max_address. If the function returns 0, means all memory is used up and we have to put the resource at address 0, this is dangerous for a running hypervisor. Also returns 0 would make code logic very complicated, since memcpy_s() doesn't support address 0 copy. Tracked-On: #5626 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-02-26 16:38:32 +08:00
Victor Sun	2e72bb97e7	HV: refine acpi rsdp initialize interface In previous code, the rsdp initialization is done in get_rsdp() api implicitly. The function is called multiple times in following acpi table parsing functions and the condition (rsdp == NULL) need to be added in each parsing function. This is not needed since the panic would occur if rsdp is NULL when do acpi initialization. Tracked-On: #5626 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-02-26 16:38:32 +08:00
Yonghua Huang	fdfd28b140	hv: unmap software region of pre-RTVM from Service VM EPT Accessing to software SRAM region is not allowed when software SRAM is pass-thru to prelaunch RTVM. This patch removes software SRAM region from service VM EPT if it is enabled for prelaunch RTVM. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-02-25 09:35:31 +08:00
Tao Yuhong	50d8525618	HV: deny HV owned PCI bar access from SOS This patch denies Service VM the access permission to device resources owned by hypervisor. HV may own these devices: (1) debug uart pci device for debug version (2) type 1 pci device if have pre-launched VMs. Current implementation exposes the mmio/pio resource of HV owned devices to SOS, should remove them from SOS. Tracked-On: #5615 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>	2021-02-03 14:01:23 +08:00
Tao Yuhong	6e7ce4a73f	HV: deny pre-launched VM ptdev bar access from SOS This patch denies Service VM the access permission to device resources owned by pre-launched VMs. Rationale: * Pre-launched VMs in ACRN are independent of service VM, and should be immune to attacks from service VM. However, current implementation exposes the bar resource of passthru devices to service VM for some reason. This makes it possible for service VM to crash or attack pre-launched VMs. * It is same for hypervisor owned devices. NOTE: * The MMIO spaces pre-allocated to VFs are still presented to Service VM. The SR-IOV capable devices assigned to pre-launched VMs doesn't have the SR-IOV capability. So the MMIO address spaces pre-allocated by BIOS for VFs are not decoded by hardware and couldn't be enabled by guest. SOS may live with seeing the address space or not. We will revisit later. Tracked-On: #5615 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 14:01:23 +08:00
Shuo A Liu	d4aaf99d86	hv: keylocker: Support keylocker backup MSRs for Guest VM The logical processor scoped IWKey can be copied to or from a platform-scope storage copy called IWKeyBackup. Copying IWKey to IWKeyBackup is called ‘backing up IWKey’ and copying from IWKeyBackup to IWKey is called ‘restoring IWKey’. IWKeyBackup and the path between it and IWKey are protected against software and simple hardware attacks. This means that IWKeyBackup can be used to distribute an IWKey within the logical processors in a platform in a protected manner. Linux keylocker implementation uses this feature, so they are introduced by this patch. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	38cd5b481d	hv: keylocker: host keylocker iwkey context switch Different vCPU may have different IWKeys. Hypervisor need do the iwkey context switch. This patch introduce a load_iwkey() function to do that. Switches the host iwkey when the switch_in vCPU satisfies: 1) keylocker feature enabled 2) Different from the current loaded one. Two opportunities to do the load_iwkey(): 1) Guest enables CR4.KL bit. 2) vCPU thread context switch. load_iwkey() costs ~600 cycles when do the load IWKey action. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	c11c07e0fe	hv: keylocker: Support Key Locker feature for guest VM KeyLocker is a new security feature available in new Intel CPUs that protects data-encryption keys for the Advanced Encryption Standard (AES) algorithm. These keys are more valuable than what they guard. If stolen once, the key can be repeatedly used even on another system and even after the vulnerability is closed. It also introduces a CPU-internal wrapping key (IWKey), which is a key-encryption key to wrap AES keys into handles. While the IWKey is inaccessible to software, randomizing the value during boot-time helps make its value unpredictable. Keylocker usage: - New “ENCODEKEY” instructions take original key input and return a HANDLE encrypted by an internal wrap key (IWKey, init by “LOADIWKEY” instruction) - Software can then delete the original key from memory - Early in boot/software, less likely to have vulnerability that allows stealing original key - Later encrypt/decrypt can use the HANDLE through new AES KeyLocker instructions - Note: * Software can use original key without knowing it (use HANDLE) * HANDLE cannot be used on other systems or after warm/cold reset * IWKey cannot be read from CPU after it's loaded (this is the nature of this feature) and only 1 copy of IWKey inside CPU. The virtualization implementation of Key Locker on ACRN is: - Each vCPU has a 'struct iwkey' to store its IWKey in struct acrn_vcpu_arch. - At initialization, every vCPU is created with a random IWKey. - Hypervisor traps the execution of LOADIWKEY (by 'LOADIWKEY exiting' VM-execution control) of vCPU to capture and save the IWKey if guest set a new IWKey. Don't support randomization (emulate CPUID to disable) of the LOADIWKEY as hypervisor cannot capture and save the random IWKey. From keylocker spec: "Note that a VMM may wish to enumerate no support for HW random IWKeys to the guest (i.e. enumerate CPUID.19H:ECX[1] as 0) as such IWKeys cannot be easily context switched.
A guest ENCODEKEY will return the type of IWKey used (IWKey.KeySource) and thus will notice if a VMM virtualized a HW random IWKey with a SW specified IWKey." - In context_switch_in() of each vCPU, hypervisor loads that vCPU's IWKey into pCPU by LOADIWKEY instruction. - There is an assumption that ACRN hypervisor will never use the KeyLocker feature itself. This patch implements the vCPU's IWKey management and the next patch implements host context save/restore IWKey logic. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	4483e93bd1	hv: keylocker: Enable the tertiary VM-execution controls In order for a VMM to capture the IWKey values of guests, processors that support Key Locker also support a new "LOADIWKEY exiting" VM-execution control in bit 0 of the tertiary processor-based VM-execution controls. This patch enables the tertiary VM-execution controls. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	e9247dbca0	hv: keylocker: Simulate CPUID of keylocker caps for guest VM KeyLocker is a new security feature available in new Intel CPUs that protects data-encryption keys for the Advanced Encryption Standard (AES) algorithm. This patch emulates Keylocker CPUID leaf 19H to support Keylocker feature for guest VM. To make the hypervisor being able to manage the IWKey correctly, this patch doesn't expose hardware random IWKey capability (CPUID.0x19.ECX[1]) to guest VM. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-02-03 13:54:45 +08:00
Shuo A Liu	15c967ad34	hv: keylocker: Add CR4 bit CR4_KL as CR4_TRAP_AND_PASSTHRU_BITS Bit19 (CR4_KL) of CR4 is CPU KeyLocker feature enable bit. Hypervisor traps the bit's writing to track the keylocker feature on/off of guest. While the bit is set by guest, - set cr4_kl_enabled to indicate the vcpu's keylocker feature enabled status - load vcpu's IWKey in host (will add in later patch) While the bit is clear by guest, - clear cr4_kl_enabled This patch trap and passthru the CR4_KL bit to guest for operation. Tracked-On: #5695 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-02-03 13:54:45 +08:00
Li Fei1	94a980c923	hv: hypercall: prevent sos can touch hv/pre-launched VM resource Current implementation, SOS may allocate the memory region belonging to hypervisor/pre-launched VM to a post-launched VM. Because it only verifies the start address rather than the entire memory region. This patch verifies the validity of the entire memory region before allocating to a post-launched VM so that the specified memory can only be allocated to a post-launched VM if the entire memory region is mapped in SOS’s EPT. Tracked-On: #5555 Signed-off-by: Li Fei1 <fei1.li@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>	2021-02-02 16:55:40 +08:00
Yonghua Huang	8bec63a6ea	hv: remove the hardcoding of Software SRAM GPA base Currently, we hardcode the GPA base of Software SRAM to an address that is derived from TGL platform, as this GPA is identical with HPA for Pre-launch VM, This hardcoded address may not work on other platforms if the HPA bases of Software SRAM are different. Now, Offline tool configures above GPA based on the detection of Software SRAM on specific platform. This patch removes the hardcoding GPA of Software SRAM, and also renames MACRO 'SOFTWARE_SRAM_BASE_GPA' to 'PRE_RTVM_SW_SRAM_BASE_GPA' to avoid confusing, as it is for Prelaunch VM only. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-30 13:41:02 +08:00
Yonghua Huang	c9ca23d268	hv: refine RTCM initialization code - RTCM is initialized in hypervisor only if RTCM binaries are detected. - Remove address space of RTCM binary from Software SRAM region. - Refine parse_rtct() function, validity of ACPI RTCT table shall be checked by caller. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-28 11:29:25 +08:00
Yonghua Huang	a6e666dbe7	hv: remove hardcoding of SW SRAM HPA base Physical address to SW SRAM region maybe different on different platforms, this hardcoded address may result in address mismatch for SW SRAM operations. This patch removes above hardcoded address and uses the physical address parsed from native RTCT. Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-28 11:29:25 +08:00
Yonghua Huang	a6420e8cfa	hv: cleanup legacy terminologies in RTCM module This patch updates below terminologies according to the latest TCC Spec: PTCT -> RTCT PTCM -> RTCM pSRAM -> Software SRAM Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-01-28 11:29:25 +08:00
Yonghua Huang	806f479108	hv: rename RTCM source files 'ptcm' and 'ptct' are legacy name according to the latest TCC spec, hence rename below files to avoid confusing: ptcm.c -> rtcm.c ptcm.h -> rtcm.h ptct.h -> rtct.h Tracked-On: #5649 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2021-01-28 11:29:25 +08:00
Liang Yi	e8a76868c9	hv: modularization: remove global variable efiloader_sig. Simplify multiboot API by removing the global variable efiloader_sig. Replaced by constant at the use site. Tracked-On: #5661 Signed-off-by: Yi Liang <yi.liang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	67926cee81	hv: modularization: remove include/boot.h. Remove include/boot.h since it contains only assembly variables that should only be accessed in arch/x86/init.c. Tracked-On: #5661 Signed-off-by: Yi Liang <yi.liang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	1de396363f	hv: modularization: avoid dependency of multiboot on zeropage.h. Split off definition of "struct efi_info" into a separate header file lib/efi.h. Tracked-On: #5661 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	681688fbe4	hv: modularization: change of multiboot API. The init_multiboot_info() and sanitize_multiboot_info() APIs now require parameters instead of implicitly relying on global boot variables. Tracked-On: #5661 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	66599e0aa7	hv: modularization: multiboot Calling sanitize_multiboot() from init.c instead of cpu.c. Tracked-On: #5661 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	c23e557a18	hv: modularization: make parse_hv_cmdline() an internal function. This way, we avoid exposing acrn_mbi as a global variable. Tracked-On: #5661 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Liang Yi	8f9ec59a53	hv: modularization: cleanup boot.h Move multiboot specific declarations from boot.h to multiboot.h. Tracked-On: #5661 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-01-27 15:59:47 +08:00
Jie Deng	5c5d272358	hv: remove bitmap_clear_lock of split-lock after completing emulation When "signal_event" is called, "wait_event" will actually not block. So it is ok to remove this line. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-13 15:32:27 +08:00
Yin Fengwei	ef411d4ac3	hv: ptirq: Shouldn't change sid if intx irq mapping was added Now, we use a hash table to maintain intx irq mapping by using the key generated from sid. So once the entry is added, we can not update the source id any more. Otherwise, we can't locate the entry with the key generated from the new source id. For a source id change, remove_remapping/add_remapping is used instead of updating the source id directly if the entry was added already. Tracked-On: #5640 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-01-12 15:23:44 +08:00
Jie Deng	8aebf5526f	hv: move split-lock logic into dedicated file This patch moves the split-lock logic into a dedicated file to reduce LOC. This may make the logic clearer. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Jie Deng	27d5711b62	hv: add a cache register for VMX_PROC_VM_EXEC_CONTROLS This patch adds a cache register for VMX_PROC_VM_EXEC_CONTROLS to avoid the frequent VMCS access. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Jie Deng	f291997811	hv: split-lock: using MTF instead of TF(#DB) The TF is visible to guest which may be modified by the guest, so it is not a safe method to emulate the split-lock. While MTF is specifically designed for single-stepping in x86/Intel hardware virtualization VT-x technology which is invisible to the guest. Use MTF to single step the VCPU during the emulation of split lock. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Jie Deng	6852438e3a	hv: Support concurrent split-lock emulation on SMP. For a SMP guest, split-lock check may happen on multiple vCPUs simultaneously. In this case, one vCPU at most can be allowed running in the split-lock emulation window. And if the vCPU is doing the emulation, it should never be blocked in the hypervisor, it should go back to the guest to execute the lock instruction immediately and trap back to the hypervisor with #DB to complete the split-lock emulation. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-01-08 17:37:20 +08:00
Li Fei1	0b18389d95	hv: vcpuid: expose mce feature to guest Windows64 seems only support processor which has MCE (Machine Check Error) feature. Tracked-On: #5638 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2021-01-08 17:22:34 +08:00
Jie Deng	b14c32a110	hv: Retain RIP only for fault exception. We have trapped the #DB for split-lock emulation. Only fault exception need RIP being retained. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-31 11:12:33 +08:00
Jie Deng	977e862192	hv: Add split-lock emulation for xchg xchg may also cause the #AC for split-lock check. This patch adds this emulation. 1. Kick other vcpus of the guest to stop execution if the guest has more than one vcpu. 2. Emulate the xchg instruction. 3. Notify other vcpus (if any) to restart execution. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-31 11:12:33 +08:00
Jie Deng	47e193a7bb	hv: Add split-lock emulation for LOCK prefix instruction This patch adds the split-lock emulation. If a #AC is caused by an instruction with a LOCK prefix then emulate it, otherwise, inject it back as it used to be. 1. Kick other vcpus of the guest to stop execution and set the TF flag to have #DB if the guest has more than one vcpu. 2. Skip over the LOCK prefix and resume the current vcpu back to guest for execution. 3. Notify other vcpus to restart execution at the end of handling the #DB since we have completed the LOCK prefix instruction emulation. Tracked-On: #5605 Signed-off-by: Jie Deng <jie.deng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-31 11:12:33 +08:00
Yonghua Huang	643bbcfe34	hv: check the availability of guest CR4 features Check hardware support for all features in CR4, and hide bits from guest by vcpuid if they're not supported for guests OS. Tracked-On: #5586 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-12-18 11:21:22 +08:00
Yonghua Huang	442fc30117	hv: refine virtualization flow for cr0 and cr4 - The current code to virtualize CR0/CR4 is not well designed, and hard to read. This patch reshuffle the logic to make it clear and classify those bits into PASSTHRU, TRAP_AND_PASSTHRU, TRAP_AND_EMULATE & reserved bits. Tracked-On: #5586 Signed-off-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-12-18 11:21:22 +08:00
Yonghua Huang	08c42f91c9	hv: rename hypercall for hv-emulated device management Coding style cleanup, use add/remove instead of create/destroy. Tracked-On: #5586 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-12-07 16:25:17 +08:00
Shiqing Gao	6f10bd00bf	hv: coding style clean-up related to Boolean While following two styles are both correct, the 2nd one is simpler. bool is_level_triggered; 1. if (is_level_triggered == true) {...} 2. if (is_level_triggered) {...} This patch cleans up the style in hypervisor. Tracked-On: #861 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2020-11-28 14:51:32 +08:00
Junming Liu	1cd932e568	hv: refine code style refine code style Tracked-On: #4020 Signed-off-by: Junming Liu <junming.liu@intel.com>	2020-11-26 12:56:28 +08:00
Junming Liu	56eb859ea4	hv: vmexit: refine xsetbv_vmexit_handler API From SDM Vol.2C - XSETBV instruction description, If CR4.OSXSAVE[bit 18] = 0, execute "XSETBV" instruction will generate #UD exception. From SDM Vol.3C 25.1.1,#UD exception has priority over VM exits, So if vCPU execute "XSETBV" instruction when CR4.OSXSAVE[bit 18] = 0, VM exits won't happen. While hv inject #GP if vCPU execute "XSETBV" instruction when CR4.OSXSAVE[bit 18] = 0. It's a wrong behavior, this patch will fix the bug. Tracked-On: #4020 Signed-off-by: Junming Liu <junming.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-26 12:56:28 +08:00
Peter Fang	68dc8d9f8f	hv: pm: avoid duplicate shutdowns on RTVM It is possible for more than one vCPUs to trigger shutdown on an RTVM. We need to avoid entering VM_READY_TO_POWEROFF state again after the RTVM has been paused or shut down. Also, make sure an RTVM enters VM_READY_TO_POWEROFF state before it can be paused. v1 -> v2: - rename to poweroff_if_rt_vm for better clarity Tracked-On: #5411 Signed-off-by: Peter Fang <peter.fang@intel.com>	2020-11-11 14:05:39 +08:00
dongshen	ca5683f78d	hv: add support for shutdown for pre-launched VMs Currently, ACRN only support shutdown when triple fault happens, because ACRN doesn't present/emulate a virtual HW, i.e. port IO, to support shutdown. This patch emulate a virtual shutdown component, and the vACPI method for guest OS to use. Pre-launched VM uses ACPI reduced HW mode, intercept the virtual sleep control/status registers for pre-launched VMs shutdown Tracked-On: #5411 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-11-04 10:33:31 +08:00
dongshen	8f79ceefbd	hv: fix out-of-date comments related to pre-launched VMs rebooting Like post-launched VMs, for pre-launched VMs, the ACPI reset register is also fixed at 0xcf9 and the reset value is 0xE, so pre-launched VMs now also use ACPI reset register for rebooting. Tracked-On: #5411 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-11-04 10:33:31 +08:00
Peter Fang	70b1218952	hv: pm: support shutting down multiple VMs when pCPUs are shared More than one VM may request shutdown on the same pCPU before shutdown_vm_from_idle() is called in the idle thread when pCPUs are shared among VMs. Use a per-pCPU bitmap to store all the VMIDs requesting shutdown. v1 -> v2: - use vm_lock to avoid a race on shutdown Tracked-On: #5411 Signed-off-by: Peter Fang <peter.fang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-04 10:33:31 +08:00
Li Fei1	c6f9404f55	hv: psram: add kconfig to enable psram Add two Kconfig pSRAM configs: one for whether to enable the pSRAM on the platform or not; another for, if the pSRAM is enabled on the platform, whether to enable the pSRAM in the pre-launched RTVM. If we enable the pSRAM on the platform, we should remove the pSRAM EPT mapping from the SOS to prevent it from flushing the pSRAM cache. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-11-02 15:56:30 +08:00
Qian Wang	99ee76781f	hv: pSRAM: add pSRAM support for pre-launched RTVM 1.Modified the virtual e820 table for pre-launched VM. We added a segment for pSRAM, and thus lowmem RAM is split into two parts. Logics are added to deal with the split. 2.Added EPT mapping of pSRAM segment for pre-launched RTVM if it uses pSRAM. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 15:56:30 +08:00
Qian Wang	a557105e71	hv: ept: set EPT cache attribute to WB for pSRAM pSRAM memory should be cacheable. However, it's not a RAM or a normal MMIO, so we can't use an existing API to do the EPT mapping and set the EPT cache attribute to WB for it. Now we assume that SOS must assign the PSRAM area as a whole and as a separate memory region whose base address is PSRAM_BASE_HPA. If the hpa of the EPT mapping region is equal to PSRAM_BASE_HPA, we think this EPT mapping is for pSRAM, we change the EPT mapping cache attribute to WB. And fix a minor bug when SOS trap out to emulate wbinvd when pSRAM is enabled. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 15:56:30 +08:00
Qian Wang	ca2aee225c	hv: skip pSRAM for guest WBINVD emulation Use ept_flush_leaf_page to emulate guest WBINVD when PTCM is enabled and skip the pSRAM in ept_flush_leaf_page. TODO: do we need to emulate WBINVD in HV side. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	f3067f5385	hv: mmu: rename hv_access_memory_region_update to ppt_clear_user_bit Rename hv_access_memory_region_update to ppt_clear_user_bit to verb + object style. Tracked-On: #5330 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	35abee60d6	hv: pSRAM: temporarily remove NX bit of PTCM binary Temporarily remove NX bit of PTCM binary in pagetable during pSRAM initialization: 1.added a function ppt_set_nx_bit to temporarily remove/restore the NX bit of a given area in pagetable. 2.Temporarily remove NX bit of PTCM binary during pSRAM initialization to make PTCM codes executable. 3. TODO: We may use SMP call to flush TLB and do pSRAM initialization on APs. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	5fa816f921	hv: pSRAM: add PTCT parsing code The added parse_ptct function will parse native ACPI PTCT table to acquire information like pSRAM location/size/level and PTCM location, and save them. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-11-02 10:29:43 +08:00
Li Fei1	80121b8347	hv: pSRAM: add pSRAM initialization codes 1.We added a function init_psram to initialize pSRAM as well as some definitions. Both AP and BSP shall call init_psram to make sure pSRAM is initialized, which is required by PTCM. BSP: To parse PTCT and find the entry of PTCM command function, then call PTCM ABI. AP: Wait until BSP has done the parsing work, then call the PTCM ABI. Synchronization of AP and BSP is ensured, both inside and outside PTCM. 2. Added calls of init_psram in init_pcpu_post to initialize pSRAM in HV booting phase Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-11-02 10:29:43 +08:00
Qian Wang	77269c15c5	hv: vcr: remove wbinvd for CR0.CD emulation According 11.5.1 Cache Control Registers and Bits, Intel SDM Vol 3, change CR0.CD will not flush cache to insure memory coherency. So it's not needed to call wbinvd to flush cache in ACRN Hypervisor. That's what the guest should do. Tracked-On: #5330 Signed-off-by: Qian Wang <qian1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-11-02 10:29:43 +08:00
Tao Yuhong	996e8f680c	HV: pci-vuart support create vdev hcall Add create method for vmcs9900 vdev in hypercalls. The destroy method of ivshmem is also suitable for other emulated vdev, move it into hcall_destroy_vdev() for all emulated vdevs Tracked-On: #5394 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>	2020-10-30 20:41:34 +08:00
Tao Yuhong	4120bd391a	HV: decouple legacy vuart interface from acrn_vuart layer support pci-vuart type, and refine: 1.Rename init_vuart() to init_legacy_vuarts(), only init PIO type. 2.Rename deinit_vuart() to deinit_legacy_vuarts(), only deinit PIO type. 3.Move io handler code out of setup_vuart(), into init_legacy_vuarts() 4.add init_pci_vuart(), deinit_pci_vuart, for one pci vuart vdev. and some change from requirement: 1.Increase MAX_VUART_NUM_PER_VM to 8. Tracked-On: #5394 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>	2020-10-30 20:41:34 +08:00
Yang, Yu-chu	8c78590da7	acrn-config: refactor pci_dev_c.py and insert vuart device information - Refactor pci_dev_c.py to insert devices information per VMs - Add function to get unused vbdf from bus:dev.func 00:00.0 to 00:1F.7 Add pci devices variables to vm_configurations.c - To pass the pci vuart information from tool, add pci_dev_num and pci_devs initialization by tool - Change CONFIG_SOS_VM in hypervisor/include/arch/x86/vm_config.h to compromise vm_configurations.c Tracked-On: #5426 Signed-off-by: Yang, Yu-chu <yu-chu.yang@intel.com>	2020-10-30 20:24:28 +08:00
Zide Chen	a776ccca94	hv: don't need to save boot context - Since de-privilege boot is removed, we no longer need to save boot context in boot time. - cpu_primary_start_64 is not an entry for ACRN hypervisor any more, and can be removed. Tracked-On: #5197 Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-10-29 10:05:05 +08:00
Yonghua Huang	3ea1ae1e11	hv: refine msi interrupt injection functions 1. refine the prototype of 'inject_msi_lapic_pt()' 2. rename below function: - rename 'vlapic_intr_msi()' to 'vlapic_inject_msi()' - rename 'inject_msi_lapic_pt()' to 'inject_msi_for_lapic_pt()' - rename 'inject_msi_lapic_virt()' to 'inject_msi_for_non_lapic_pt()' Tracked-On: #5407 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li Fei <fei1.li@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-10-26 08:44:13 +08:00
Yonghua Huang	012927d0bd	hv: move function 'inject_msi_lapic_pt()' to vlapic.c This function can be used by other modules instead of hypercall handling only, hence move it to vlapic.c Tracked-On: #5407 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-10-26 08:44:13 +08:00
Zide Chen	9f2b35507a	hv: remove UEFI_OS_LOADER_NAME from Kconfig Since UEFI boot is no longer supported. Tracked-On: #5197 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-10-21 15:09:26 +08:00
Zide Chen	bebffb29fc	hv: remove de-privilege boot mode support and remove vboot wrappers Now ACRN supports direct boot mode, which could be SBL/ABL, or GRUB boot. Thus the vboot wrapper layer can be removed and the direct boot functions don't need to be wrapped in direct_boot.c: - remove call to init_vboot(), and call e820_alloc_memory() directly at the time when the trampoline buffer is actually needed. - Similarly, call CPU_IRQ_ENABLE() instead of the wrapper init_vboot_irq(). - remove get_ap_trampoline_buf(), since the existing function get_trampoline_start16_paddr() returns the exact same value. - merge init_general_vm_boot_info() into init_vm_boot_info(). - remove vm_sw_loader pointer, and call direct_boot_sw_loader() directly. - move get_rsdp_ptr() from vboot_wrapper.c to multiboot.c, and remove the wrapper over two boot modes. Tracked-On: #5197 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-10-21 15:09:26 +08:00
Shuang Zheng	13d39fda85	hv: update hybrid_rt with 2 post-launched VMs in Kconfig update the help message of config SCENARIO to set 2 standard post-launched VMs for default hybrid_rt scenario in Kconfig. Tracked-On: #5390 Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-10-14 14:00:45 +08:00
Victor Sun	c63899fc81	HV: correct hpa calculation for pre-launched VM The commit of `da81a0041d` "HV: add e820 ACPI entry for pre-launched VM" introduced a issue that the base_hpa and remaining_hpa_size are also calculated on the entry of 32bit PCI hole which from 0x80000000 to 0xffffffff, which is incorrect; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-09-15 09:45:10 +08:00
Li Fei1	a2fd8c5a9d	pci: mcfg: limit device bus numbers which could access by ECAM Per PCI Firmware Specification Revision 3.0, 4.1.2. MCFG Table Description: Memory Mapped Enhanced Configuration Space Base Address Allocation Structure assign the Start Bus Number and the End Bus Number which could decoded by the Host Bridge. We should not access the PCI device which bus number outside of the range of [Start Bus Number, End Bus Number). For ACRN, we should: 1. Don't detect PCI device which bus number outside the range of [Start Bus Number, End Bus Number) of MCFG ACPI Table. 2. Only trap the ECAM MMIO size: [MMCFG_BASE_ADDRESS, MMCFG_BASE_ADDRESS + (End Bus Number - Start Bus Number + 1) * 0x100000) for SOS. Tracked-On: #5233 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-09 09:31:56 +08:00
Victor Sun	2c0bc146ce	HV: remove deprecated vacpi build method The old method of build pre-launched VM vacpi by HV source code is deprecated, so remove related source code; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	34547e1e19	HV: add acpi module support for pre-launched VM Previously we use a pre-defined structure as vACPI table for pre-launched VM, the structure is initialized by HV code. Now change the method to use a pre-loaded multiboot module instead. The module file will be generated by acrn-config tool and loaded to GPA 0x7ff00000, a hardcoded RSDP table at GPA 0x000f2400 will point to the XSDT table which at GPA 0x7ff00080; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	da81a0041d	HV: add e820 ACPI entry for pre-launched VM Previously the ACPI table was stored in F segment which might not be big enough for a customized ACPI table, hence reserve 1MB space in pre-launched VM e820 table to store the ACPI related data: 0x7ff00000 ~ 0x7ffeffff : ACPI Reclaim memory 0x7fff0000 ~ 0x7fffffff : ACPI NVS memory Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	0461ac209f	HV: set CONFIG_HV_RAM_START as min addr when RELOC enabled Previously the min load_addr for HV image is hard coded to 0x10000000 when CONFIG_RELOC is enabled, now use CONFIG_HV_RAM_START as its prefer minimum address like setting of CONFIG_PHYSICAL_START do in Linux kernel. With this patch, we can offload the CONFIG_HV_RAM_START algorithm to acrn-config or manually set it in scenario XML on some special boards. Tracked-On: #5275 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-09-07 15:03:53 +08:00
Nishioka, Toshiki	77fb21e98c	hv: add vgpio device model support When HV pass through the P2SB MMIO device to pre-launched VM, vgpio device model traps MMIO access to the GPIO registers within P2SB so that it can expose virtual IOAPIC pins to the VM in accordance with the programmed mappings between gsi and vgsi. Tracked-On: #5246 Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:52:02 +08:00
Nishioka, Toshiki	ba99984f69	hv: add INTx mapping for pre-launched VMs Add the capability of forwarding specified physical IOAPIC interrupt lines to pre-launched VMs as virtual IOAPIC interrupts. This is for the sake of the certain MMIO pass-thru devices on EHL CRB which can support only INTx interrupts. Tracked-On: #5245 Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:52:02 +08:00
Shuo A Liu	d6b9682581	hv: debug: Convert PCI UART parameter from a BDF string to a hex value BDF string can be parsed by the configuration tool. A 16bit WORD value with format (B:8, D:5, F:3) can be passed from configuration to the hypervisor directly to save some BDF string parse code. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 15:13:53 +08:00
Yonghua Huang	c03623f3fb	hv[v2]: Remove deprecated term in vPIC submodule This patch cleanup below deprecated terms: 'master' -> 'primary' 'slave' -> 'secondary' v2 update: Refine comments. Tracked-On: #5249 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-09-01 09:30:08 +08:00
Stanley Chang	d55813e80b	hv: passthru DHRD-ignored device When trying to passthru a DHRD-ignored PCI device, iommu_attach_device shall report success. Otherwise, the assign_vdev_pt_iommu_domain will result in HV panic. Same for iommu_detach_device case. Tracked-On: #5240 Signed-off-by: Stanley Chang <stanley.chang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 09:29:25 +08:00
Yuan Liu	8a34cf03ca	hv: add new hypercalls to create and destroy an emulated device in hypervisor Add HC_CREATE_VDEV and HC_DESTROY_VDEV two hypercalls that are used to create and destroy an emulated device(PCI device or legacy device) in hypervisor v3: 1) change HC_CREATE_DEVICE and HC_DESTROY_DEVICE to HC_CREATE_VDEV and HC_DESTROY_VDEV 2) refine code style v4: 1) remove unnecessary parameter 2) add VM state check for HC_CREATE_VDEV and HC_DESTROY hypercalls Tracked-On: #4853 Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-28 16:53:12 +08:00
Wei Liu	29ac258134	acrn-config: code refactoring for CAT/MBA 1.Modify clos_mask and mba_delay as a member of the union type. 2.Move HV_SUPPORTED_MAX_CLOS ,MAX_CACHE_CLOS_NUM_ENTRIES and MAX_MBA_CLOS_NUM_ENTRIES to misc_cfg.h file. Tracked-On: #5229 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
dongshen	a425730f64	acrn-config: rename MAX_PLATFORM_CLOS_NUM to HV_SUPPORTED_MAX_CLOS HV_SUPPORTED_MAX_CLOS: This value represents the maximum CLOS that is allowed by ACRN hypervisor. This value is set to be least common Max CLOS (CPUID.(EAX=0x10,ECX=ResID):EDX[15:0]) among all supported RDT resources in the platform. In other words, it is min(maximum CLOS of L2, L3 and MBA). This is done in order to have consistent CLOS allocations between all the RDT resources. Tracked-On: #5229 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
Yin Fengwei	d0e06c4f80	hv: debug: Enable MMIO UART support New board, EHL CRB, does not have legacy port IO UART. Even the PCI UART are not work due to BIOS's bug workaround(the BARs on LPSS PCI are reset after BIOS hand over control to OS). For ACRN console usage, expose the debug UART via ACPI PnP device (access by MMIO) and add support in hypervisor debug code. Another special thing is that register width of UART of EHL CRB is 1byte. Introduce reg_width for each struct console_uart. Tracked-On: #4937 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2020-08-27 13:31:17 +08:00
Mingqiang Chi	53b11d1048	refine hypercall -- use an array to fast locate the hypercall handler to replace switch case. -- uniform hypercall handler as below: int32_t (*handler)(sos_vm, target_vm, param1, param2) Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2020-08-26 14:55:24 +08:00
Geoffroy Van Cutsem	f8883f43e9	hv: enhance help text for the scenario option in Kconfig Enhance the help text that accompanies the CONFIG_SCENARIO symbol in Kconfig Tracked-On: #5203 Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2020-08-26 08:51:50 +08:00
Yuan Liu	d6f563c4eb	hv: implement ivshmem memory regions initialization The ivshmem memory regions use the memory of the hypervisor and they are continuous and page aligned. this patch is used to initialize each memory region hpa. v2: 1) if CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED is not defined, the entire code of ivshmem will not be compiled. 2) change ivshmem shared memory unit from byte to page to avoid misconfiguration. 3) add ivshmem configuration and vm configuration references v3: 1) change CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED to CONFIG_IVSHMEM_ENABLED 2) remove the ivshmem configuration sample, offline tool provides default ivshmem configuration. 3) refine code style. v4: 1) make ivshmem_base 2M aligned. Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Junming Liu	3631a85c3c	hv:cpu-caps:refine is_apl_platform func and clean up duplicated code Fix the bug for "is_apl_platform" func. "monitor_cap_buggy" is identical to "is_apl_platform", so remove it. On apl platform: 1) ACRN doesn't use monitor/mwait instructions 2) ACRN disable GPU IOMMU Tracked-On:#3675 Signed-off-by: Junming Liu <junming.liu@intel.com>	2020-08-14 10:08:50 +08:00
liujunming	538e7cf74d	hv:cpu-caps:refine processor family and model info v3 -> v4: Refine commit message and code style 1. SDM Vol. 2A 3-211 states DisplayFamily = Extended_Family_ID + Family_ID when Family_ID == 0FH. So it should be family += ((eax >> 20U) & 0xffU) when Family_ID == 0FH. 2. IF (Family_ID = 06H or Family_ID = 0FH) THEN DisplayModel = (Extended_Model_ID << 4) + Model_ID; While the previous code used this logic: IF (DisplayFamily = 06H or DisplayFamily = 0FH) Fix the bug about calculation of display family and display model according to SDM definition. 3. use variable name to distinguish Family ID/Display Family/Model ID/Display Model, then the code is more clear to avoid some mistake Tracked-On:#3675 Signed-off-by: liujunming <junming.liu@intel.com> Reviewed-by: Wu Xiangyang <xiangyang.wu@linux.intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-14 10:08:50 +08:00
Victor Sun	b5dfe369da	HV: move vm configuration check to pre-build time This patch will move the VM configuration check to pre-build stage, a test program will do the check for pre-defined VM configuration data before making hypervisor binary. If test failed, the make process will be aborted. So once the hypervisor binary is built successfully or start to run, it means the VM configuration has been sanitized. The patch did not add any new VM configuration check function, it just port the original sanitize_vm_config() function from cpu.c to static_checks.c with below change: 1. remove runtime rdt detection for clos check; 2. replace pr_err() from logmsg.h with printf() from stdio.h; 3. replace runtime call get_pcpu_nums() in ALL_CPUS_MASK macro with static defined MAX_PCPU_NUM; 4. remove cpu_affinity check since pre-launched VM might share pcpu with SOS VM; The BOARD/SCENARIO parameter check and configuration folder check is also moved to prebuild Makefile. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-12 10:21:17 +08:00
Victor Sun	8245145317	HV: remove sanitize_vm_config function Remove function of sanitize_vm_config() since the processing of sanitizing will be moved to pre-build process. When hypervisor has booted, we assume all VM configurations is sanitized; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-12 10:21:17 +08:00
Mingqiang Chi	a67a85c70d	hv:refine vm & vcpu lock -- move vm_state_lock to other place in vm structure to avoid the memory waste because of the page-aligned. -- remove the memset from create_vm -- explicitly set max_emul_mmio_regions and vcpuid_entry_nr to 0 inside create_vm to avoid use without initialization. -- rename max_emul_mmio_regions to nr_emul_mmio_regions v1->v2: add deinit_emul_io in shutdown_vm Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-05 13:39:28 +08:00
Victor Sun	b9ad04d24d	HV: add cpu affinity info for SOS VM Previously the CPU affinity of SOS VM is initialized at runtime during sanitize_vm_config() stage, following the policy that all physical CPUs except those occupied by Pre-launched VMs belong to SOS_VM. Now change the process that SOS CPU affinity should be initialized at build time and has the assumption that its validity is guaranteed before runtime. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	3acafd140f	HV: Make: simplify acpi info header file check Previously we have complicated check mechanism on platform_acpi_info.h which is supposed to be generated by acrn-config tool, but given the reality that all configurations should be generated by acrn-config before build acrn hypervisor, this check is not needed anymore. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	07ad37f436	HV: Make: remove sdc scenario build support The SDC scenario configurations will not be validated so remove it from build makefile; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	62c87856ce	HV: remove deprecated old layout configuration source The old layout configuration source which located in: hypervisor/arch/x86/configs/ is abandoned, remove it; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	a57a4fd7fb	HV: Make: enable build for new configs layout The make command is same as old configs layout: under acrn-hypervisor folder: make hypervisor BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] under hypervisor folder: make BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] if BOARD/SCENARIO parameter is not specified, the default will be: BOARD=nuc7i7dnb SCENARIO=industry Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	e792fa3d3c	HV: nuc7i7dnb example of new VM configurations layout There are 3 kinds of configurations in ACRN hypervisor source code: hypervisor overall setting, per-board setting and scenario specific per-VM setting. Currently Kconfig acts as hypervisor overall setting and its source is located at "hypervisor/arch/x86/configs/$(BOARD).config"; Per-board configs are located at "hypervisor/arch/x86/configs/$(BOARD)" folder; scenario specific per-VM configs are located at "hypervisor/scenarios/$(SCENARIO)" folder. This layout brings issues that board configs and VM configs are coupled tightly. The board specific Kconfig file and misc_cfg.h are shared by all scenarios, and scenario specific pci_dev.c is shared by all boards. So the user has no way to build hypervisor binary for different scenario on different board with one source code repo. The patch will setup a new VM configurations layout as below: misc/vm_configs ├── boards --> folder of supported boards │ ├── <board_1> --> scenario-irrelevant board configs │ │ ├── board.c --> C file of board configs │ │ ├── board_info.h --> H file of board info │ │ ├── pci_devices.h --> pBDF of PCI devices │ │ └── platform_acpi_info.h --> native ACPI info │ ├── <board_2> │ ├── <board_3> │ └── <board...> └── scenarios --> folder of supported scenarios ├── <scenario_1> --> scenario specific VM configs │ ├── <board_1> --> board specific VM configs for <scenario_1> │ │ ├── <board_1>.config --> Kconfig for specific scenario on specific board │ │ ├── misc_cfg.h --> H file of board specific VM configs │ │ ├── pci_dev.c --> board specific VM pci devices list │ │ └── vbar_base.h --> vBAR base info of VM PT pci devices │ ├── <board_2> │ ├── <board_3> │ ├── <board...> │ ├── vm_configurations.c --> C file of scenario specific VM configs │ └── vm_configurations.h --> H file of scenario specific VM configs ├── <scenario_2> ├── <scenario_3> └── <scenario...> The new layout would decouple board configs and VM configs completely: The boards folder stores kinds of supported boards info, each board folder stores scenario-irrelevant board configs only, which could be totally got from a physical platform and works for all scenarios; The scenarios folder stores VM configs of kinds of working scenario. In each scenario folder, besides the generic scenario specific VM configs, the board specific VM configs would be put in a embedded board folder. In new layout, all configs files will be removed out of hypervisor folder and moved to a separate folder. This would make hypervisor LoC calculation more precise with below formula: typical LoC = LoC(hypervisor) + LoC(one vm_configs) which LoC(one vm_configs) = LoC(misc/vm_configs/boards/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.c) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.h) Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Shuo A Liu	112f02851c	hv: Disable XSAVE-managed CET state of guest VM To hide CET feature from guest VM completely, the MSR IA32_MSR_XSS also need to be intercepted because it comprises CET_U and CET_S feature bits of xsave/xstors operations. Mask these two bits in IA32_MSR_XSS writing. With IA32_MSR_XSS interception, member 'xss' of 'struct ext_context' can be removed because it is duplicated with the MSR store array 'vcpu->arch.guest_msrs[]'. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Shuo A Liu	ac598b0856	hv: Hide CET feature from guest VM Return-oriented programming (ROP), and similarly CALL/JMP-oriented programming (COP/JOP), have been the prevalent attack methodologies for stealth exploit writers targeting vulnerabilities in programs. CET (Control-flow Enforcement Technology) provides the following capabilities to defend against ROP/COP/JOP style control-flow subversion attacks: * Shadow stack: Return address protection to defend against ROP. * Indirect branch tracking: Free branch protection to defend against COP/JOP The full support of CET for Linux kernel has not been merged yet. As the first stage, hide CET from guest VM. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Li Fei1	5e605e0daf	hv: vmcall: check vm id in dispatch_sos_hypercall Check whether vm_id is valid in dispatch_sos_hypercall Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	1859727abc	hv: vapci: add tpm2 support for pre-launched vm On WHL platform, we need to pass through TPM to Secure pre-launched VM. In order to do this, we need to add TPM2 ACPI Table and add TPM DSDT ACPI table to include the _CRS. Now we only support the TPM 2.0 device (TPM 1.2 device is not support). Besides, the TPM must use Start Method 7 (Uses the Command Response Buffer Interface) to notify the TPM 2.0 device that a command is available for processing. Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	7971f34344	hv: vapci: refine acpi table header initialization Using ACPI_TABLE_HEADER MACRO to initial the ACPI Table Header. Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	acc69007e2	hv: mmio_dev: add mmio device pass through support Add mmio device pass through support for pre-launched VM. When we pass through a MMIO device to pre-launched VM, we would remove its resource from the SOS. Now these resources only include the MMIO regions. Tracked-On: #5053 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	baf77a79ad	hv: mmio_dev: add hypercall to support mmio device pass through Add two hypercalls to support MMIO device pass through for post-launched VM. And when we support MMIO pass through for pre-launched VM, we could re-use the code in mmio_dev.c Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Conghui Chen	821c65b40c	hv: fix possible SSE region mismatch issue During context switch in hypervisor, xsave/xrstor are used to save/restore the XSAVE area according to the XCR0 and XSS. The legacy region in XSAVE area includes FPU and SSE, we should make sure the legacy region is saved during context switch. FPU in XCR0 is always enabled according to SDM. For SSE, we enable it in XCR0 during context switch. Tracked-On: #5062 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 14:19:21 +08:00
Conghui Chen	53d4a7169b	hv: remove kick_thread from scheduler module kick_thread function is only used by kick_vcpu to kick vcpu out of non-root mode, the implementation in it is sending IPI to target CPU if target obj is running and target PCPU is not current one; while for runnable obj, it will just make reschedule request. So the kick_thread is not actually belong to scheduler module, we can drop it and just do the cpu notification in kick_vcpu. Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Conghui Chen	b6422f8985	hv: remove 'running' from vcpu structure vcpu->running is duplicated with THREAD_STS_RUNNING status of thread object. Introduce an API sleep_thread_sync(), which can utilize the inner status of thread object, to do the sync sleep for zombie_vcpu(). Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Mingqiang Chi	aa89eb3541	hv:add per-vm lock for vm & vcpu state change -- replace global hypercall lock with per-vm lock -- add spinlock protection for vm & vcpu state change v1-->v2: change get_vm_lock/put_vm_lock parameter from vm_id to vm move lock obtain before vm state check move all lock from vmcall.c to hypercall.c Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-20 11:22:17 +08:00
Yin Fengwei	fcec5a94be	kconfig: extend the max msix table number to 64 There are some devices (like Samsung NVMe SSD SM981/PM981 which has 33 MSIX tables) which have more than 16 MSIX tables. Extend the default value to 64 to handle them. Tracked-On: #4994 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-07-10 19:39:11 +08:00
Li Fei1	80c7da8f1c	hv: vioapic: expose ioapic to guest unconditionally Some OSes assume the platform must have the IOAPIC. For example: Linux Kernel allocates IRQ force from GSI (0 if there's no PIC and IOAPIC) on x86. And it thinks IRQ 0 is an architecture special IRQ, not for device driver. As a result, the device driver may go wrong if the allocated IRQ is 0 for RTVM. This patch exposes vIOAPIC to RTVM with LAPIC passthru even though the RTVM can't use IOAPIC; it serves as a placeholder to fulfill the guest assumption. After vIOAPIC is exposed to guest unconditionally, the 'ready' field could be removed since we do vIOAPIC initialization for each guest. Tracked-On: #4691 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-10 19:33:46 +08:00
Mingqiang Chi	7751c7933d	hv:unify spin_lock initialization will follow this convention for spin lock initialization: -- for simple global variable locks, use this style: static spinlock_t xxx_spinlock = {.head = 0U, .tail = 0U,} -- for the locks inside a data structure, need to call spinlock_init to initialize. Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-07-02 09:40:29 +08:00
Shuo A Liu	025df6d44c	hv: use SELF IPI Register for self IPI in X2APIC mode According to SDM 10.12.11, we can know this register is dedicated to the purpose of sending self-IPIs with the intent of enabling a highly optimized path for sending self-IPIs. Also sending the IPI via the Self Interrupt Register ensures that interrupt is delivered to the processor core. Specifically completion of the WRMSR instruction to the SELF IPI register implies that the interrupt has been logged into the IRR. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-28 10:33:22 +08:00
Shuo A Liu	0397cb7174	hv: Fix the interrupts lost issue with PI support Currently, not all platforms support posted interrupt processing of both VT-x and VT-d. On EHL, VT-d doesn't support posted interrupt processing. So in such a scenario, is_pi_capable() in vcpu_handle_pi_notification() will bypass the PIR pending bits check, which might cause a self-NV-IPI to be lost. With commit "bf1ff8c98 (hv: Offload syncing PIR to vIRR to processor hardware)", the syncing of PIR to vIRR is postponed and it is handled by a self-NV-IPI in the following VMEnter. The process looks like, a) vcpu A accepts a virtual interrupt -> 1) ACRN_REQUEST_EVENT is set 2) corresponding bit in PIR is set 3) Posted Interrupt ON bit is set b) vcpu A does virtual interrupt injection on resume path due to the pending ACRN_REQUEST_EVENT -> 1) hypervisor disables host interrupt 2) ACRN_REQUEST_EVENT is cleared 3) a self-NV-IPI is sent via ICR of LAPIC. 4) IRR bit of the self-NV-IPI is set c) (VM-ENTRY) vcpu A returns into non-root mode 1) host interrupt enable(by HW) 2) posted interrupt processing clears the ON bit, sync PIR to vIRR 3) deliver the virtual interrupt if guest rflags.IF=1 d) (VM-EXIT) vcpu A traps due to an instruction execution (e.g. HLT) 1) host interrupt disable(by HW) 2) hypervisor enable host interrupt Above illustrates a normal process of the virtual interrupt injection with cpu PI support. However, a failing case is observed. The failing case is that the self-NV-IPI from b-3 is not accepted by the core until a timing between d-1 and d-2. b-4 happening between d-1 and d-2 is observed by debug trace. So the self-NV-IPI will be handled in root-mode which cannot do the syncing PIR to vIRR processing. Due to the bug described in the first paragraph, vcpu_handle_pi_notification() cannot succeed the virtual interrupt injection request. This patch fixes it by removing the wrong check in vcpu_handle_pi_notification() because vcpu_handle_pi_notification() only happens on platforms with cpu PI support. 
Here are some cost data for sending IPI via LAPIC ICR register. Normally, the cycles between ICR write and IRR got set is 140~260, which is not accurate due to the MSR read overhead. And from b-3 to c is about 560 cycles. So b-4 happens during this period. But in bad case, b-4 doesn't happen even c is triggered. The worst case I captured is that ICR write and IRR got set costs more than 1900 cycles. Now, the best GUESS of the huge cost of IPI via ICR is the APIC bus arbitration(refer to SDM 10.6.3, 10.7 and Figure 10-17). Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-28 10:33:22 +08:00
Li Fei1	da7c2ba3e9	hv: ept: wrap a function to do guest ept flush Wrap a function to do guest ept flush. This function doesn't do the real EPT flush. It just makes the EPT flush request, and the real flush is done just before vcpu vmenter. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-22 16:25:03 +08:00
Mingqiang Chi	1b84741a56	rename vm_lock/vlapic_state in VM structure rename: vlapic_state-->vlapic_mode vm_lock --> vlapic_mode_lock check_vm_vlapic_state --> check_vm_vlapic_mode Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	d808031a04	remove spin lock for micro code update remove spin lock for micro code update since the guest operating system will take lock Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	ac65898f35	cleanup spin lock in vtd.c move dm_unit->lock into dmar_issue_qi_request Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	67a7c355ec	cleanup spin lock in irq.c -- move exception_lock to dump.c -- optimize the lock usage in request_irq Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
yuhong.tao@intel.com	aff687896b	HV: Fix split-locked access detection is disabled by default The commit 'HV: Config Splitlock Detection to be disable' allows using CONFIG_ENFORCE_TURNOFF_AC to turn off splitlock #AC. If CONFIG_ENFORCE_TURNOFF_AC is not set, splitlock #AC should be turned on. Tracked-On: #4962 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>	2020-06-19 09:22:58 +08:00
Conghui Chen	2a4c59db74	hv: add check for BASIC VMX INFORMATION Check bit 48 in IA32_VMX_BASIC MSR, if it is 1, return error, as we only support Intel 64 architecture. SDM: Appendix A.1 BASIC VMX INFORMATION Bit 48 indicates the width of the physical addresses that may be used for the VMXON region, each VMCS, and data structures referenced by pointers in a VMCS (I/O bitmaps, virtual-APIC page, MSR areas for VMX transitions). If the bit is 0, these addresses are limited to the processor’s physical-address width. If the bit is 1, these addresses are limited to 32 bits. This bit is always 0 for processors that support Intel 64 architecture. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	906284eec8	hv: remove unnecessary debug symbols remove unnecessary debug symbols. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	f4292752b0	hv: remove check for OSXSAVE in host We always assume the physical platform has XSAVE, and we always enable XSAVE at the beginning, so there is no need to check OSXSAVE in host. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	53f74f18ac	hv: remove repeated assignment remove repeated assignment for vmcs_pa. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Victor Sun	ca9e98cc74	HV: add board and scenario info in log As build variants for different boards and different scenarios grow, users might make mistakes on HV binary distributions. Checking board/scenario info from the log would be the fastest way to know whether the binary matches. It would also be of benefit to developers for confirming the correct binary they are debugging. Tracked-On: #4946 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-18 13:05:42 +08:00
Binbin Wu	da1788c9a3	hv: vtd: add an API to reserve continuous irtes dmar_reserve_irte is added to reserve N continuous IRTEs. N could be 1, 2, 4, 8, 16, or 32. The reserved IRTEs will not be freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	7bfcc673a6	hv: ptirq: associate an irte with ptirq_remapping_info entry For a ptirq_remapping_info entry, when building the IRTE: - If the caller provides a valid IRTE, use the IRTE - If the caller doesn't provide a valid IRTE, allocate an IRTE when the entry doesn't have a valid IRTE; in this case, the IRTE will be freed when the entry is freed. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	2fe4280cfa	hv: vtd: add two parameters for dmar_assign_irte idx_in: - If the caller of dmar_assign_irte passes a valid IRTE index, it will be reused; - If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index, the function will allocate a new IRTE. idx_out: This parameter returns the actual index of the IRTE used. The caller needs to check whether the return value is valid or not. Also this patch adds an internal function alloc_irte. The function takes count as an input parameter to allocate continuous IRTEs. The count can only be 1, 2, 4, 8, 16 or 32. This is prepared for multiple MSI vector support. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	9c52b353fa	hv: kconfig: add a range for MAX_IR_ENTRIES Script only append 'U' for the config of int with a range. Add a range to MAX_IR_ENTRIES. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-06-16 08:52:56 +08:00
Li Fei1	65e4a16e6a	hv: mmu: release 1GB cpu side support constraint Some platforms still don't support 1GB large pages on the CPU side, such as the lakefield, TNT and EHL platforms, which have a silicon bug that prevents the CPU from supporting 1GB large pages. This patch tries to release this constraint to support more hardware platforms. Note this patch doesn't release the constraint on the IOMMU side. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Li Fei1	6e57553015	Revert "hv: Let trampoline execution use 1GB pages" This patch tries to release the hardware platform 1GB large page support constraint on the CPU side. There is a silicon bug on the lakefield, TNT and EHL platforms which prevents the CPU from supporting 1GB large pages. As a result, the assumption that a platform which ACRN supports must support 1GB large pages on both the CPU side and the VTD side is not true any more. This reverts commit `f01aad7e` to let trampoline execution use 2MB pages. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Binbin Wu	c907a820df	hv: config: add msix emulation support Add the information needed to enable MSI-x emulation. Only enable MSI-x emulation for the devices in the msix_emul_devs array. Currently, only EHL has the need to enable MSI-x emulation for TSN devices. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-10 14:32:15 +08:00
Victor Sun	80262f0602	HV: rename append_seed_arg to fill_seed_arg Previously append_seed_arg() just do fill in seed arg to dest cmd buffer, so rename the api name to fill_seed_arg(). Since fill_seed_arg() will be called in SOS VM path only, the param of bool vm_is_sos is not needed and will be replaced by dest buffer size. The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero, so memset() is not needed in fill_seed_arg(), buffer pointer check and strncpy_s() are not needed also. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	e254be150a	HV: rewrite memcpy_s to be iso c11 compliant Per C11 standard (ISO/IEC 9899:2011): K.3.7.1.1 1. Copying shall not take place between objects that overlap; 2. If there is a runtime-constraint violation, the memcpy_s function stores zeros in the first s1max characters of the object; 3. The memcpy_s function returns zero if there was no runtime-constraint violation. Otherwise, a nonzero value is returned. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	c74b1941a0	HV: split sanitize_multiboot_info api Previously sanitize_multiboot_info() was called after init_debug_pre() because the debug message can only print after uart is initialized. On the other hand, multiboot cmdline need to be parsed before init_debug_pre() because the cmdline could override uart settings and make sure debug message printed successfully. This cause multiboot info was parsed in two stages. The patch revise the multiboot parse logic that split sanitize_multiboot_info() api and use init_acrn_multiboot_info() api for the early stage. The most of multiboot info will be initialized during this stage and no debug message need to be printed. After uart is initialized, the sanitize_multiboot_info() would do sanitize multiboot info and print needed debug messages. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Binbin Wu	65ec6f3f3b	hv: vtd: fix potential dead loop if qi request timeout Fix potential dead loop if qi request timeout. Tracked-On: #4680 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-06-05 05:31:16 +08:00
Wei Liu	fbb1dfa264	HV/Kconfig: update efi bootloader image file path for Kconfig Update efi bootloader image file path for Yocto rootfs in Kconfig. Tracked-On: #4868 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Reviewed-by: Victor Sun <victor.sun@intel.com>	2020-06-03 22:02:58 +08:00
Li Fei1	ae4fa40adc	hv: vpci: hv: vpci: refine pci device assignment logic Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config. So for UOS, we always emulate Host Bridge and PCI Bridge for it and assign PCI device to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI Bridge to it directly, otherwise, we will emulate them the same as for UOS. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	b8f151a55f	hv: pci: check whether a PCI device is host bridge or not by class According PCI Code and ID Assignment Specification Revision 1.11, a PCI device whose Base Class is 06h and Sub-Class is 00h is a Host bridge. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	0bd2daf1c5	hv: pci: remove host bridge BDF definition We should check whether a PCI device is host bridge or not by Base Class (06h) and Sub-Class (00h). Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Vijay Dhanraj	d03df0c7e2	HV: Fix MP Init sequence hang by adding a delay As per the BWG, a delay should be provided between the INIT IPI and Startup IPI. Without the delay, hangs are observed on certain platforms during the MP Init sequence. So set a delay of 10us between the assert INIT IPI and Startup IPI. Also, as per SDM section 10.7, the de-assert INIT IPI is only used for Pentium and P6 processors. This is not applicable for Pentium4 and Xeon processors, so remove this sequence. Tracked-On: #4835 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 13:34:59 +08:00
Minggui Cao	8c090c71ca	HV: fix bug to clear guest flags after it not used shutdown_vm uses guest flags when handling the physical CPUs whose LAPIC is pass-through. If the flags are cleared first, the related vCPUs and pCPUs cannot be switched to the correct state. So move the clear action after the flags are used. Tracked-On: #4848 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2020-05-27 11:35:47 +08:00
Binbin Wu	454bb14348	hv: vtd: remove some unnecessary check 1. context_entry couldn't be NULL in iommu_attach_device since bus number is checked before the call. 2. root_entry couldn't be NULL in iommu_detach_device since bus number is checked before the call. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	e9901b3edd	hv: vtd: add a function to check valid of dmar unit Add a function dmar_unit_valid to check if the input dmar unit is valid or not. A valid dmar_unit should not be NULL, and its ignore flag should not be set. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	3009d9399f	hv: vtd: cleanup snoop control related code Snoop control will not be turned on by hypervisor, delete snoop control related code. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	a94c3ef763	hv: vtd: init DMAR/IR table address when register Initialize root_table_addr/ir_table_addr of the dmar unit when registering the dmar unit. So there is no need to check if they are initialized or not later. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Shuo A Liu	9a15ea82ee	hv: pause all other vCPUs in same VM when do wbinvd emulation Invalidating cache by scanning and flushing the whole guest memory is inefficient, which might cause long execution time for WBINVD emulation. A long execution in hypervisor might cause a vCPU stuck phenomenon that impacts Windows Guest booting. This patch introduces a workaround method that pauses all other vCPUs in the same VM when doing wbinvd emulation. Tracked-On: #4703 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 15:21:29 +08:00
Mingqiang Chi	f994b5ffaf	hv:cleanup vcpu state -- remove VCPU_PAUSED and resume_vcpu -- remove vcpu->prev_state in vcpu structure -- rename pause_vcpu to zombie_vcpu Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-05-21 15:08:49 +08:00
Li Fei1	53af096726	hv: ptirq: refine find_ptirq_entry by hashing Refine find_ptirq_entry by hashing instead of walk each of the PTIRQ entries one by one. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-20 16:04:16 +08:00
Yonghua Huang	3391bffb27	hv:fix rtvm hang with maxcpus=0/1 in bootargs RTVM (with lapic PT) hangs at boot when maxcpus is assigned a value less than the CPU number configured in hypervisor. In this case, vlapic_state (per VM) is left in TRANSITION state after BSP boot, which blocks interrupts from being injected to this UOS. Tracked-On: #4803 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-15 10:09:13 +08:00
Li Fei1	27a66acd0e	hv: ptdev: refine look up MSI ptirq entry There's no need to look up MSI ptirq entry by virtual SID any more since the MSI ptirq entry would be removed before the device is assigned to a VM. Now the logic of MSI interrupt remap could simplify as: 1. Add the MSI interrupt remap first; 2. If step is already done, just do the remap part. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>	2020-05-13 14:31:01 +08:00
Zide Chen	1bc5c7ac5b	hv/acrn-config/efi-stuf: assign hvlog and ramoops buffer address < 256MB If HV relocation is enabled, either ACRN efi-stub or GRUB relocates hypervisor image above HPA 256MB, thus we put hvlog and ramoops buffer under 256MB to avoid conflict with hypervisor owned address. This patch hardcodes these addresses: 0xa00000 - 0xdfffff: 4MiB for ramoops buffer 0xe00000 - 0xffffff: 2MiB for hvlog buffer However, user can customize them to other addresses as long as it's under 256MB, available in host e820, and SOS bootarg "nokaslr" is not specified. If HV relocation is disabled, need to make sure that these buffer addresses are not between HV_RAM_START and HV_RAM_START + HV_RAM_SIZE. Tracked-On: #4760 Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-05-13 08:36:54 +08:00
Zide Chen	0a956c34c7	hv: add a new field cpu_affinity in struct acrn_vm For post-launched VMs, the configured CPU affinity could be different from the actual running CPU affinity. This new field acrn_vm->cpu_affinity recognizes this difference so that it's possible that CREATE_VM hypercall won't overwrite the configured CPU affinity. Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity. This is read-only in run time, never overwritten by acrn-dm. Remove vm_config->vcpu_num, which means the number of vCPUs of the configured CPU affinity. This is not to be confused with the actual running vCPU number: vm->hw.created_vcpus. Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less confusion. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 11:04:31 +08:00
Sainath Grandhi	bf1ff8c98f	hv: Offload syncing PIR to vIRR to processor hardware ACRN syncs PIR to vIRR in software in cases where the Posted Interrupt notification happens while the pCPU is in root mode. The sync can be achieved by processor hardware by sending a posted interrupt notification vector. This patch sends a self-IPI, if there are interrupts pending in PIR, which is serviced by the logical processor at the next VMEnter. Tracked-On: #4777 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-05-08 10:01:07 +08:00
Yan, Like	869ccb7ba8	HV: RDT: add CDP support in ACRN CDP is an extension of CAT. It enables isolation and separate prioritization of code and data fetches to the L2 or L3 cache in a software configurable manner, depending on hardware support. This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED". CDP will be enabled if the capability available and "CDP_ENABLED" is selected. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	277c668b04	HV: RDT: clean up RDT code This commit makes some RDT code cleanup, mainly including: - remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build; - rename platform_clos_num to valid_clos_num, which is set as the minimal clos_mas of all enabled RDT resources; - init the platform_clos_array in the res_cap_info[] definition; - remove the unnecessary return values and return value check. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	f774ee1fba	HV: RDT: merge struct rdt_cache and rdt_membw in to a union A RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw would be used at a time. They should be a union. This commit merges struct rdt_cache and struct rdt_membw into a union res. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
yuhong.tao@intel.com	b0ae9cfa2b	HV: Config Splitlock Detection to be disable #AC should normally be enabled for splitlock detection; however, community developers may want to run ACRN on a buggy system. In this case, CONFIG_ENFORCE_TURNOFF_AC can be used to turn off the #AC, to let the guest run without #AC. Tracked-On: #4765 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-07 09:35:22 +08:00
Li Fei1	0c6b3e57d6	hv: ptdev: minor refine about ptirq_build_physical_msi The virtual MSI information could be included in the ptirq_remapping_info structure, so there's no need to pass another input parameter for this purpose. So we could remove the ptirq_msi_info input. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
Li Fei1	73335b7276	hv: ptirq: rename ptirq_lookup_entry_by_sid to find_ptirq_entry We look up PTIRQ entries only by SID, so _by_sid could be removed. Also refine the function name to verb-obj style. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
Yin Fengwei	68269a559f	gpa2hva: add INVALID_HPA return value check For the return value of local_gpa2hpa, either INVALID_HPA or NULL means an EPT walking failure. Current code only takes care of the NULL return and treats INVALID_HPA as a correct case. In some cases (if the guest page table is filled with invalid memory addresses), it could crash ACRN from the guest. Add the INVALID_HPA return check as well. Also add @pre assumptions for some gpa2hpa usages. Tracked-On: #4730 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-05-06 11:29:30 +08:00
Wei Liu	22aecf83e0	HV: modify CONFIG_HV_RAM_START for NUC7i7DNB When boot ACRN hypervisor from grub multiboot, HV will be loaded at CONFIG_HV_RAM_START since relocation is not supported in grub multiboot1. The CONFIG_HV_RAM_SIZE in industry scenario will take ~330MB(0x14000000), unfortunately the efi memmap on NUC7i7DNB is truncated at 0x6dba2000 although it is still usable from 0x6dba2000. So from grub point of view, it could not find a continuous memory from 0x6000000 to load industry scenario. Per efi memmap, there is a big memory area available from 0x40400000, so put CONFIG_HV_RAM_START to 0x41000000 is much safe for NUC7i7DNB. Tracked-On: #4641 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-05-06 11:25:57 +08:00
Minggui Cao	691a0e2e56	HV: add a specific stack space used in CPU booting The original stack used in CPU booting is: ld_bss_end + 4KB; which could be out of the RAM size limit defined in link_ram file. So add a specific stack space in link_ram file, and used in CPU booting. Tracked-On: #4738 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-29 13:56:40 +08:00
Li Fei1	4c733708bf	hv: lapic: minor refine about init_lapic According to SDM Vol 3, Chap 10.4.7.2 Local APIC State After It Has Been Software Disabled, The mask bits for all the LVT entries are set when the local APIC has been software disabled. So there's no need to mask all the LVT entries one by one. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-26 10:48:49 +08:00
Li Fei1	067b439e69	hv: irq: minor refine about structure idt_64_descriptor The 'value' field in structure idt_64_descriptor is not used anywhere, so we could remove it. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-26 10:48:49 +08:00
Zide Chen	9150284ca7	hv: replace vcpu_affinity array with cpu_affinity_bitmap Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping. While the new cpu_affinity_bitmap doesn't explicitly specify this mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and so on. This makes it possible for post-launched VM to run vCPUs on a subset of these pCPUs only, and not all of them. acrn-dm may launch post-launched VMs with the current approach: indicate VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked in cpu_affinity_bitmap. Also acrn-dm can choose to launch the VM on a subset of PCPUs that is defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the subset of PCPUs in the CREATE_VM hypercall. Additionally, with this change, a guest's vcpu_num can be easily calculated from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-23 09:38:54 +08:00
Li Fei1	113f2f1e35	hv: vacpi: add ioapic madt table Add IOAPIC MADT table support so that guest could detect IOAPIC exist. Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Li Fei1	1dccbdbaa2	hv: vapic: add mcfg table support Add MCFG table support to allow guest access PCIe external CFG space by ECAM Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Li Fei1	4eb3f5a0c7	hv: vacpi: add fadt table support Add FADT table support to support guest S5 setting. According to ACPI 6.3 Spec, OSPM must ignore the DSDT and FACS fields if they're zero. However, Linux kernel seems not to abide by the protocol; it will still check DSDT. So add an empty DSDT to satisfy it. Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Victor Sun	e5561a5c71	HV: remove sdc2 scenario support Remove sdc2 scenario since the VM launch requirement under this scenario could be satisfied by industry scenario now; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Victor Sun	a90890e9c7	HV: support up to 7 post launched VMs for industry scenario In industry scenario, hypervisor will support 1 post-launched RT VM and 1 post-launched kata VM and up to 5 post-launched standard VMs; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Victor Sun	09212cf4b6	HV: Kconfig: enable CPU sharing by default The patch enables CPU sharing feature by default, the default scheduler is set to SCHED_BVT; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Sainath Grandhi	60c4ec0c59	hv: Wake up vCPU for interrupts from vPIC Wake up vCPUs that are blocked upon interrupts from vPIC. Tracked-On: #4664 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-20 09:49:41 +08:00
Victor Sun	dfb947fe91	HV: fix wrong gpa start of hpa2 in ve820.c The current logic always puts hpa2 above GPA 4G, which is incorrect. Need to set gpa start of hpa2 right after hpa1 when hpa1 size is less than 2G; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 14:08:54 +08:00
Victor Sun	fe6407155f	HV: set default MCFG base for generic board On most board the MCFG base is set to 0xe0000000, so modify this value in platform_acpi_info.h for generic boards; The description of ACPI_PARSE_ENABLED is modified also to match its usage. Tracked-On: #4157 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:49:58 +08:00
Victor Sun	3fd5fc9b51	Kconfig: remove MAX_KATA_VM_NUM CONFIG_MAX_KATA_VM_NUM is a scenario specific configuration, so it is better to put the MACRO in scenario folder directly, to instead the Kconfig item in Kconfig file which should work for all scenarios; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	dba0591f72	Kconfig: change scenario variable type to string Basically an ACRN scenario is a configuration name for a specific usage. Given a scenario name, ACRN will load the corresponding VM configurations to build the hypervisor. But customers might have their own scenario names, so changing the scenario type from choice to string is friendly to them since no Kconfig source file change will be needed. With this change, CONFIG_$(SCENARIO) will no longer exist in the kconfig file and will be replaced by CONFIG_SCENARIO, so the Makefile needs to be changed accordingly; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	55b50f408f	HV: init vm uuid and severity in macro Currently the vm uuid and severity are initialized separately in the vm_config struct, so the developer needs to take care of both items carefully, otherwise the hypervisor would have trouble with the configurations. Given that the vm loader_order/uuid and severity are bound tightly, the patch merges these three settings into one macro so that the developer has a simple interface to configure in the vm_config struct. Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 13:45:18 +08:00
yuhong.tao@intel.com	7c80acee95	HV: emulate MSR_TEST_CTL If the CPU has MSR_TEST_CTL, show an emulated one to the VCPU Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	dd3fa8ed75	HV: enable #AC for Splitlock Access If the CPU supports raising #AC for Splitlock Access, then enable this feature on each CPU. Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	ea1bce0cbf	HV: enumerate capability of #AC for Splitlock Access When the destination of an atomic memory operation spans 2 cache lines, it is called a Splitlock Access. The LOCK# bus signal is asserted for a splitlock access, which may lead to long latency. #AC for Splitlock Access is a CPU feature that raises an alignment check exception #AC(0) instead of asserting LOCK#, which is helpful for detecting Splitlock Access. This feature is enumerated by MSR(0xcf) IA32_CORE_CAPABILITIES[bit5] Add helper function: bool has_core_cap(uint32_t bitmask) Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
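The capability check described in commit ea1bce0cbf can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: only the MSR index (0xcf), bit 5, and the `has_core_cap(uint32_t bitmask)` signature come from the commit message. The macro names, the `cached_core_caps` variable, and `set_cached_core_caps()` are hypothetical stand-ins for a value that real hypervisor code would obtain with RDMSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR index and bit position as given in the commit message. */
#define MSR_IA32_CORE_CAPABILITIES  0xcfU
#define CORE_CAP_SPLIT_LOCK         (1U << 5U)

/*
 * Hypothetical cache of the low 32 bits of IA32_CORE_CAPABILITIES.
 * Real code would fill this once at boot by reading the MSR.
 */
static uint32_t cached_core_caps;

/* Hypothetical setter standing in for the boot-time MSR read. */
void set_cached_core_caps(uint64_t msr_value)
{
	cached_core_caps = (uint32_t)msr_value;
}

/* Returns true if any bit in bitmask is set in the cached capabilities. */
bool has_core_cap(uint32_t bitmask)
{
	return (cached_core_caps & bitmask) != 0U;
}
```

A caller would then gate the feature on `has_core_cap(CORE_CAP_SPLIT_LOCK)` before enabling #AC for Splitlock Access on each CPU.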
Mingqiang Chi	f90100e382	hv: add pre-condition for vcpu APIs remove unnecessary state check and add pre-condition for vcpu APIs. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Jason Chen CJ	0584981c03	hv:add pre-condition for vm APIs check the vm state in hypercall api, add pre-condition for vm api. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Mingqiang Chi	fe929d0a10	hv: move out pause_vm from shutdown_vm now it will call pause_vm in shutdown_vm, move it out from shutdown_vm to reduce coupling. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Yonghua Huang	84eaf94ae6	hv: wrap a function to initialize pCPU for second phase This patch wraps a common function to initialize the physical CPU for the second phase to reduce redundant code. Tracked-On: #861 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-04-16 14:02:29 +08:00
Xiaoguang Wu	d4f789f47e	hv: iommu: remove snoop related code ACRN disables Snoop Control in VT-d DMAR engines to simplify the implementation. Also, since the snoop behavior of PCIE transactions can be controlled by guest drivers, some devices may take advantage of the NO_SNOOP_ATTRIBUTE of PCIE transactions for better performance when snoop is not needed. Whether ACRN enables or disables Snoop Control, the DMA operations of passthrough devices behave correctly from the guests' point of view. This patch cleans up all the snoop related code. Tracked-On: #4509 Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-16 08:40:17 +08:00
Xiaoguang Wu	b4f1e5aa85	hv: iommu: disable snoop bit in EPT-PTE/SL-PTE Because the i915 iommu doesn't support snoop, it can't access memory when the SNOOP bit of the Secondary Level page PTE (SL-PTE) is set, which causes many undefined issues such as an invisible cursor in WaaG. The current hv design uses EPT as the Secondary Level Page for iommu, and this patch removes the code that sets the SNOOP bit in both EPT-PTE and SL-PTE to avoid errors. According to SDM 28.2.2, the SNOOP bit (11th bit) is ignored by EPT, so this does not affect CPU address translation. Tracked-On: #4509 Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-16 08:40:17 +08:00
Conghui Chen	84ad340898	hv: fix for waag 2 core reboot issue Waag will send NMIs to all its cores during reboot. But currently, NMI cannot be injected to vcpu which is in HLT state. To fix the problem, need to wakeup target vcpu, and inject NMI through interrupt-window. Tracked-On: #4620 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:42:00 +08:00
Binbin Wu	597f7658fc	hv: guest: fix bug in get_vcpu_paging_mode Align the implementation to SDM Vol.3 4.1.1. Also this patch fixed a bug that doesn't check paging status first in some cpu mode. Tracked-On: #4628 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:40:02 +08:00
Zide Chen	6040d8f6a2	hv: fix SOS vapic_id assignment issue Currently vlapic_build_id() uses vcpu_id to retrieve the lapic_id per_cpu variable: vlapic_id = per_cpu(lapic_id, vcpu->vcpu_id); SOS vcpu_id may not equal pcpu_id, and in that case it runs into problems. For example, if any pre-launched VMs are launched on PCPUs whose IDs are smaller than any PCPU IDs that are used by SOS. This patch fixes the issue and simplifies the code to create or get vapic_id by: - assign vapic_id in create_vlapic(), which now takes pcpu_id as input argument, and save it in the new field: vlapic->vapic_id, which will never be changed. - simplify vlapic_get_apicid() by returning the saved vapic_id directly. - remove vlapic_build_id(). - vlapic_init() is only called once, merge it into vlapic_create(). Tracked-On: #4268 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:34:15 +08:00
dongshen	00ad3863a1	hv: maintain a per-pCPU array of vCPUs and handle posted interrupt IRQs Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]), one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property and use the containing VM's vm_id as the index to the vCPU array: In create_vcpu(), we simply do: per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu; In offline_vcpu(): per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL; so basically we use the containing VM's vm_id as the index to the vCPU array, as well as the index of posted interrupt IRQ/vector pair that are assigned to this vCPU: 0: first vCPU and first posted interrupt IRQs/vector pair (POSTED_INTR_IRQ/POSTED_INTR_VECTOR) ... CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair ((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U) In the posted interrupt handler, it will do the following: Translate the IRQ into a zero based index of where the vCPU is located in the vCPU list for current pCPU. Once the vCPU is found, we wake up the waiting thread and record this request as ACRN_REQUEST_EVENT Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-04-15 13:47:22 +08:00
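The indexing scheme in commit 00ad3863a1 can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the hypervisor code: a plain 2-D array stands in for the `per_cpu(vcpu_array, pcpu_id)` accessor, the struct is reduced to two fields, and `MAX_PCPU_NUM`, the numeric values, and the function names `register_vcpu`, `unregister_vcpu`, and `vcpu_from_pi_irq` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PCPU_NUM       4U      /* assumption for this sketch */
#define CONFIG_MAX_VM_NUM  8U      /* assumption for this sketch */
#define POSTED_INTR_IRQ    0x100U  /* hypothetical first posted-interrupt IRQ */

/* Reduced stand-in for struct acrn_vcpu: only what the lookup needs. */
struct acrn_vcpu {
	uint16_t vm_id;
	uint16_t pcpu_id;
};

/* Plain array modelling the per-pCPU vcpu_array[CONFIG_MAX_VM_NUM]. */
static struct acrn_vcpu *vcpu_array[MAX_PCPU_NUM][CONFIG_MAX_VM_NUM];

/* A VM has at most one vCPU per pCPU, so vm_id is a unique slot index. */
void register_vcpu(struct acrn_vcpu *vcpu)
{
	vcpu_array[vcpu->pcpu_id][vcpu->vm_id] = vcpu;
}

void unregister_vcpu(const struct acrn_vcpu *vcpu)
{
	vcpu_array[vcpu->pcpu_id][vcpu->vm_id] = NULL;
}

/* Translate a posted-interrupt IRQ into a zero-based slot on this pCPU. */
struct acrn_vcpu *vcpu_from_pi_irq(uint16_t pcpu_id, uint32_t irq)
{
	uint32_t idx = irq - POSTED_INTR_IRQ; /* wraps to a large value if irq is too small */

	if ((pcpu_id >= MAX_PCPU_NUM) || (idx >= CONFIG_MAX_VM_NUM)) {
		return NULL;
	}
	return vcpu_array[pcpu_id][idx];
}
```

Because the slot index equals the VM id, the handler needs no search: the IRQ offset from the first reserved posted-interrupt IRQ identifies the vCPU directly.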
dongshen	14fa9c563c	hv: define posted interrupt IRQs/vectors This is a preparation patch for adding support for VT-d PI related vCPU scheduling. ACRN does not support vCPU migration, one vCPU always runs on the same pCPU, so PI's ndst is never changed after startup. VCPUs of a VM won’t share same pCPU. So the maximum possible number of VCPUs that can run on a pCPU is CONFIG_MAX_VM_NUM. Allocate unique Activation Notification Vectors (ANV) for each vCPU that belongs to the same pCPU, the ANVs need only be unique within each pCPU, not across all vCPUs. This reduces # of pre-allocated ANVs for posted interrupts to CONFIG_MAX_VM_NUM, and enables ACRN to avoid switching between active and wake-up vector values in the posted interrupt descriptor on vCPU scheduling state changes. A total of CONFIG_MAX_VM_NUM consecutive IRQs/vectors are reserved for posted interrupts use. The code first initializes vcpu->arch.pid.control.bits.nv dynamically (will be added in subsequent patch), the other code shall use vcpu->arch.pid.control.bits.nv instead of the hard-coded notification vectors. Rename some functions: apicv_post_intr --> apicv_trigger_pi_anv posted_intr_notification --> handle_pi_notification setup_posted_intr_notification --> setup_pi_notification Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	c2d350c5cc	hv: enable VT-d PI for ptdev if intr_src->pid_addr is non-zero Fill in posted interrupt fields (vector, pda, etc) and set mode to 1 to enable VT-d PI (posted mode) for this ptdev. If intr_src->pi_vcpu is 0, fall back to use the remapped mode. Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	f7be985a23	hv: check if the IRQ is intended for a single destination vCPU Given the vcpumask, check if the IRQ is single destination and return the destination vCPU if so, the address of associated PI descriptor for this vCPU can then be passed to dmar_assign_irte() to set up the posted interrupt IRTE for this device. For fixed mode interrupt delivery, all vCPUs listed in vcpumask should service the interrupt requested. But VT-d PI cannot support multicast/broadcast IRQs, it only supports single CPU destination. So the number of vCPUs shall be 1 in order to handle IRQ in posted mode for this device. Add pid_paddr to struct intr_source. If platform_caps.pi is true and the IRQ is single-destination, pass the physical address of the destination vCPU's PID to ptirq_build_physical_msi and dmar_assign_irte Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	6496da7c56	hv: add function to check if using posted interrupt is possible for vm Add platform_caps.c to maintain platform related information Set platform_caps.pi to true if all iommus are posted interrupt capable, false otherwise If lapic passthru is not configured and platform_caps.pi is true, the vm may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is single-destination Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
Sainath Grandhi	47f883db30	hv: Hypervisor access to PCI devices with 64-bit MMIO BARs PCI devices with 64-bit MMIO BARs that require large MMIO space can be assigned a physical address range at the very high end of the platform-supported physical address space. This patch uses the board info for the 64-bit MMIO window as programmed by BIOS and constructs 1G page tables for it. As ACRN uses identity mapping from Linear to Physical address space, physical addresses up to 48 bits or 256TB can be supported. Tracked-On: #4586 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-13 16:52:18 +08:00
Sainath Grandhi	1c21f747be	hv: Add HI_MMIO_START and HI_MMIO_END macros to board files Add 64-bit MMIO window related MACROs to the supported board files in the hypervisor source code. Tracked-On: #4586 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-13 16:52:18 +08:00
Jian Jun Chen	159c9ec759	hv: add lock for ept add/modify/del EPT table can be changed concurrently by more than one vcpus. This patch add a lock to protect the add/modify/delete operations from different vcpus concurrently. Tracked-On: #4253 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2020-04-13 11:38:55 +08:00
Li Fei1	74edf2e54b	hv: vmcs: remove vmcs field check for a vcpu The VMCS field is an embedded array for a vCPU. So there's no need to check for NULL before use. Tracked-On: #3813 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-09 09:40:26 +08:00
Li Fei1	366214e567	hv: virq: refine pending event inject sequence Inject pending exception prior pending interrupt to complete the previous instruction. Tracked-On: #1842 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-09 09:40:00 +08:00
Li Fei1	572f755037	hv: vm: refine the devices unregistration sequence of vm shutdown Conceptually, the devices unregistration sequence of the shutdown process should be opposite to create. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-08 10:13:37 +08:00
Sainath Grandhi	5958d6f65f	hv: Fix issues with the patch to reserve EPT 4K pages after boot This patch fixes couple of minor issues with patch `8ffe6fc6` Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-03 11:06:14 +08:00
Yan, Like	70fa6dce53	hv: config: enable RDT for apl-up2 by default Tracked-On: #4566 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-04-02 13:55:35 +08:00
Yan, Like	2997c4b570	HV: CAT: support cache allocation for each vcpu This commit allows hypervisor to allocate cache to vcpu by assigning different clos to vcpus of a same VM. For example, we could allocate different cache to housekeeping core and real-time core of an RTVM in order to isolate the interference of housekeeping core via cache hierarchy. Tracked-On: #4566 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Chen, Zide <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-02 13:55:35 +08:00
Binbin Wu	fcd9a1ca73	hv: vtd: use local var instead of global var In dmar_issue_qi_request, currently use a global var qi_status, which could cause potential issue when concurrent call to dmar_issue_qi_request for different DMAR units. Use local var instead. Tracked-On: #4535 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-02 11:31:40 +08:00
Sainath Grandhi	8ffe6fc67a	hv: Reserve space for VMs' EPT 4k pages after boot As ACRN prepares to support servers with large amounts of memory current logic to allocate space for 4K pages of EPT at compile time will increase the size of .bss section of ACRN binary. Bootloaders could run into a situation where they cannot find enough contiguous space to load ACRN binary under 4GB, which is typically heavily fragmented with E820 types Reserved, ACPI data, 32-bit PCI hole etc. This patch does the following 1) Works only for "direct" mode of vboot 2) reserves space for 4K pages of EPT, after boot by parsing platform E820 table, for all types of VMs. Size comparison: w/o patch Size of DRAM Size of .bss 48 GB 0xe1bbc98 (~226 MB) 128 GB 0x222abc98 (~548 MB) w/ patch Size of DRAM Size of .bss 48 GB 0x1991c98 (~26 MB) 128 GB 0x1a81c98 (~28 MB) Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 21:13:37 +08:00
Qian Wang	c20228d36f	HV: simplified the logic of dmar_wait_completion hv: vtd: simplified the logic of dmar_wait_completion Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	60704a5d9c	HV: renamed some static functions related to dmar hv: vtd: renamed some static functions from dmar_verb to verb_dmar Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	c2bcf9fade	HV: simplified the logic of iommu_read/write64 hv: vtd: simplified the logic of iommu_read/write64 Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	a1e081073f	HV: Corrected return type of two static functions hv: vtd: corrected the return type of get_qi_queue and get_ir_table to void * Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	b55f414a9d	HV: Removed unused member variable of iommu_domain and related code hv: vtd: removed is_host (always false) and is_tt_ept (always true) member variables of struct iommu_domain and related codes since the values are always determined. Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Li Fei1	ea2616fbbf	hv: vlapic: minor fix about dereference vcpu from vlapic Since vcpu is removed from vlapic, we can no longer dereference it directly. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 15:59:52 +08:00
Li Fei1	2b7168da9e	hv: vmtrr: remove vcpu structure pointer from vmtrr We could use container_of to get vcpu structure pointer from vmtrr. So vcpu structure pointer is no need in vmtrr structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	a7768fdb6a	hv: vlapic: remove vcpu/vm structure pointer from vlapic We could use container_of to get vcpu/vm structure pointer from vlapic. So vcpu/vm structure pointer is no need in vlapic structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	7f342bf62f	hv: list: rename list_entry to container_of This function casts a member of a structure out to the containing structure. So rename to container_of is more readable. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
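The `container_of` pattern referred to in commits 7f342bf62f, a7768fdb6a, and 2b7168da9e can be sketched as below. This is the conventional offsetof-based form, shown as an assumption about the general technique rather than a copy of the ACRN macro. The reduced `struct vcpu` and `struct vlapic` layouts and the helper name `vlapic_to_vcpu` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Conventional definition: step back from a member to its containing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced, hypothetical layouts: the vlapic is embedded in the vcpu. */
struct vlapic {
	uint32_t vapic_id;
};

struct vcpu {
	uint16_t vcpu_id;
	struct vlapic vlapic; /* embedded, so no back-pointer is required */
};

/* Recover the owning vcpu from a pointer to its embedded vlapic. */
struct vcpu *vlapic_to_vcpu(struct vlapic *vlapic)
{
	return container_of(vlapic, struct vcpu, vlapic);
}
```

This only works when the member is embedded in the outer structure, which is why the back-pointers in vlapic and vmtrr become redundant.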
dongshen	1328dcb205	hv: extend union dmar_ir_entry to support VT-d posted interrupts Extend union dmar_ir_entry to support VT-d posted interrupts. Rename some fields of union dmar_ir_entry: entry --> value sw_bits --> avail Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	016c1a5073	hv: pass pointer to functions Pass intr_src and dmar_ir_entry irte as pointers to dmar_assign_irte(), which fixes the "Attempt to change parameter passed by value" MISRA C violation. A few coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	0f3c876a91	hv: extend struct pi_desc to support VT-d posted interrupts For CPU side posted interrupts, it only uses bit 0 (ON) of the PI's 64-bit control , other bits are don't care. This is not the case for VT-d posted interrupts, define more bit fields for the PI's 64-bit control. Use bitmap functions to manipulate the bit fields atomically. Some MISRA-C violation and coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	8f732f2809	hv: move pi_desc related code from vlapic.h/vlapic.c to vmx.h/vmx.c/vcpu.h The posted interrupt descriptor is more of a vmx/vmcs concept than a vlapic concept. struct acrn_vcpu_arch stores the vmx/vmcs info, so put struct pi_desc in struct acrn_vcpu_arch. Remove the function apicv_get_pir_desc_paddr() A few coding style/typo fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	b384d04ad1	hv: rename vlapic_pir_desc to pi_desc Rename struct vlapic_pir_desc to pi_desc Rename struct member and local variable pir_desc to pid pir=posted interrupt request, pi=posted interrupt pid=posted interrupt descriptor pir is part of pi descriptor, so it is better to use pi instead of pir struct pi_desc will be moved to vmx.h in subsequent commit. Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
Zide Chen	eef3b51eda	hv: move error message logging into gpa copy APIs In this way, the code looks simpler and line of code is reduced. Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-30 13:19:01 +08:00
Li Fei1	4512ef7ec9	hv: cpuid: remove cpuid() The cpuid() can be replaced with cpuid_subleaf, which is clearer. Having both APIs makes reading difficult. Tracked-On: #4526 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-25 13:26:58 +08:00
Sainath Grandhi	6b517c58f1	hv: Server platforms can have more than 8 IO-APICs To support server platforms with more than 8 IO-APICs Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	fe5a108c7b	hv: vioapic init for SOS VM on platforms with multiple IO-APICs For SOS VM, when the target platform has multiple IO-APICs, there should be equal number of virtual IO-APICs. This patch adds support for emulating multiple vIOAPICs per VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	f67ac09141	hv: Handle holes in GSI i.e. Global System Interrupt for multiple IO-APICs MADT is used to specify the GSI base for each IO-APIC and the number of interrupt pins per IO-APIC is programmed into Max. Redir. Entry register of that IO-APIC. On platforms with multiple IO-APICs, there can be holes in the GSI space. For example, on a platform with 2 IO-APICs, the following configuration has a hole (from 24 to 31) in the GSI space. IO-APIC 1: GSI base - 0, number of pins - 24 IO-APIC 2: GSI base - 32, number of pins - 8 This patch also adjusts the size for variables used to represent the total number of IO-APICs on the system from uint16_t to uint8_t as the ACPI MADT uses only 8-bits to indicate the unique IO-APIC IDs. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
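The GSI hole in the example from commit f67ac09141 can be checked with a short sketch. The two table entries mirror the configuration quoted in the commit message; the struct, table, and function names (`ioapic_info`, `ioapics`, `is_gsi_valid`) are hypothetical and not taken from the ACRN source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-IO-APIC record: GSI base from MADT, pin count from the IO-APIC. */
struct ioapic_info {
	uint32_t gsi_base;
	uint32_t nr_pins;
};

/* The 2-IO-APIC configuration from the commit message (hole at GSI 24..31). */
static const struct ioapic_info ioapics[] = {
	{ .gsi_base = 0U,  .nr_pins = 24U },
	{ .gsi_base = 32U, .nr_pins = 8U  },
};

/* A GSI is valid only if it falls inside the pin range of some IO-APIC. */
bool is_gsi_valid(uint32_t gsi)
{
	size_t i;

	for (i = 0U; i < (sizeof(ioapics) / sizeof(ioapics[0])); i++) {
		if ((gsi >= ioapics[i].gsi_base) &&
		    ((gsi - ioapics[i].gsi_base) < ioapics[i].nr_pins)) {
			return true;
		}
	}
	return false;
}
```

With this table, GSIs 0 to 23 and 32 to 39 are valid, while 24 to 31 fall in the hole and must not be treated as a contiguous continuation of the first IO-APIC.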
Sainath Grandhi	85217e362f	hv: Introduce Global System Interrupt (GSI) into INTx Remapping As ACRN prepares to support platforms with multiple IO-APICs, GSI is a better way to represent physical and virtual INTx interrupt source. 1) This patch replaces usage of "pin" with "gsi" wherever applicable across the modules. 2) PIC pin to gsi is trickier and needs to consider the usage of "Interrupt Source Override" structure in ACPI for the corresponding VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	dd6c80c305	hv: Move error checking for hypercall parameters out of assign module Moving checks on validity of IOAPIC interrupt remapping hypercall parameters to hypercall module Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	06b59e0bc1	hv: Use ptirq_lookup_entry_by_sid to lookup virtual source id in IOAPIC irq entries Reverts `538ba08c`: hv:Add vpin to ptdev entry mapping for vpic/vioapic ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin and an array of size per VM to store ptirq entries against the vPIC pin. This is done to speed up "ptirq entry" lookup at runtime for Level triggered interrupts in API ptirq_intx_ack used on EOI. This patch switches the lookup API for INTx interrupts to the API, ptirq_lookup_entry_by_sid This could add delay to processing EOI for Level triggered interrupts. Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around 10 such arrays per VM. Removes the need of "pic_pin" except for the APIs facing the hypercalls hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Victor Sun	52f26cba8a	hv: a few fixes for multiboot2 boot - need to specify the load_addr in the multiboot2 address tag. GRUB needs it to correctly calculate the ACRN binary's load size if load_end_addr is a non-zero value. - multiboot2 can be enabled if hypervisor relocation is disabled. - print the name of the boot loader. This might be helpful if the boot loader, e.g. GRUB, includes its version in the name string. Tracked-On: #4441 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2020-03-24 08:44:20 +08:00
Li Fei1	e5c7a96513	hv: vpci: sos could access low severity guest pci cfg space There're some cases the SOS (higher severity guest) needs to access the post-launched VM (lower severity guest) PCI CFG space: 1. The SR-IOV PF needs to reset the VF 2. Some pass through device still need DM to handle some quirk. In the case a device is assigned to a UOS and is not in a zombie state, the SOS is able to access, if and only if the SOS has higher severity than the UOS. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-20 10:08:43 +08:00
Mingqiang Chi	14692ef60c	hv:Rename two VM states Rename: VM_STARTED --> VM_RUNNING VM_POWERING_OFF --> VM_READY_TO_POWEROFF Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-13 10:34:29 +08:00
Victor Sun	a8c2ba03fc	HV: add pci_devices.h for nuc6cayh and apl-up2 As pci_devices.h is included by <page.h>, need to prepare pci_devices.h for nuc6cayh and apl-up2 board. Also the #error info in generic/pci_devices.h should be removed, otherwise the build will be failed in sdc/sdc2/industry scenarios. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	a68f655a11	HV: update ept address range for pre-launched VM For a pre-launched VM, a region from PTDEV_HI_MMIO_START is used to store 64bit vBARs of PT devices whose addresses are higher than 4G. The region should be located after all user memory space and be covered by the guest EPT address range. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	e74553492a	HV: move create_sos_vm_e820 to ve820.c ve820.c is a common file in arch/x86/guest/ now, so move function of create_sos_vm_e820() to this file to make code structure clear; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	a7b61d2511	HV: remove board specific ve820 Remove useless per board ve820.c as arch/x86/guest/ve820.c is common for all boards now; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	d7eac3fe6a	HV: decouple prelaunch VM ve820 from board configs hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched VM specific e820 entries according to the customer's memory configuration. It should be a scenario-based configuration, but we had to put it in the per-board folder because of different board memory settings. This raises concerns for customers about configuration organization. Currently the file provides the same e820 layout for all pre-launched VMs, but they should have different e820 layouts when their memory is configured differently. Although we have the acrn-config tool to generate ve820.c automatically, it is not friendly to modify the hardcoded ve820 layout manually, so the patch changes the entry initialization method by calculating each entry item in C code. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	4c0965d89e	HV: correct ept page array usage Currently ept_pages_info[] is initialized with its first element only, which forces the VM with id 0 to use SOS EPT pages. This is incorrect for the logical partition and hybrid scenarios. Considering SOS_RAM_SIZE and UOS_RAM_SIZE are configured separately, we should use different ept pages accordingly. So, the PRE_VM_NUM/SOS_VM_NUM and MAX_POST_VM_NUM macros are introduced to resolve this issue. The macros would be generated by the acrn-config tool when users configure ACRN for their specific scenario. One more thing: when UOS_RAM_SIZE is less than 2GB, the EPT address range should be (4G + PLATFORM_HI_MMIO_SIZE). Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Mingqiang Chi	790614e952	hv:rename several variables and api for ioapic rename: ioapic_get_gsi_irq_addr --> gsi_to_ioapic_base ioapic_addr -->ioapic_base Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-03-11 13:26:15 +08:00
Li Fei1	e5ae37eb69	hv: mmu: minor fix about add_pte In Commit `127c73c3`, we removed the strict check for adding page table mapping. However, we just replaced the ASSERT with pr_fatal in add_pte. This is not enough. We still need to advance the virtual address by 4K if the page table mapping already exists, and check whether the virtual address is beyond the virtual address region for this mapping. Otherwise, the complaint will repeat up to 512 times. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-09 10:03:01 +08:00
Sainath Grandhi	460e7ee5b1	hv: Variable/macro renaming for intr handling of PT devices using IO-APIC/PIC 1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can be IOAPIC or PIC 2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt controller 3. Changes the type of src member of source_id.intx_id from uint32_t to enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC Tracked-On: #4447 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-03-06 11:29:02 +08:00
Victor Sun	b6684f5b61	HV: sanitize config file for whl-ipc-i5 - remove limit of CONFIG_HV_RAM_SIZE which is for scenario of 2 VMs only, the default size from Kconfig could build scenario which up to 5 VMs; - rename whl-ipc-i5_acpi_info.h to platform_acpi_info.h, since the former one should be generated by acrn-config tool; - add SOS related macros in misc.h, otherwise build scenarios which has SOS VM would be failed; Tracked-On: #4463 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-03-06 08:34:12 +08:00
Zide Chen	67cb1029d9	hv: update the hypervisor 64-bit entry address for efi-stub - remove .data and .text directives. We want to place all the boot data and text in the .entry section since the boot code is different from others in terms of relocation fixup. With this change, the page tables are in entry section now and it's aligned at 4KB. - regardless CONFIG_MULTIBOOT2 is set or not, the 64-bit entry offset is fixed at 0x1200: 0x00 -- 0x10: Multiboot1 header 0x10 -- 0x88: Multiboot2 header if CONFIG_MULTIBOOT2 is set 0x1000: start of entry section: cpu_primary_start_32 0x1200: cpu_primary_start_64 (thanks to the '.org 0x200' directive) GDT tables initial page tables etc. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Zide Chen	49ffe168af	hv: fixup relocation delta for symbols belong to entry section This is to enable relocation for code32. - RIP relative addressing is available in x86-64 only so we manually add relocation delta to the target symbols to fixup code32. - both code32 and code64 need to load GDT hence both need to fixup GDT pointer. This patch declares separate GDT pointer cpu_primary64_gdt_ptr for code64 to avoid double fixup. - manually fixup cpu_primary64_gdt_ptr in code64, but not rely on relocate() to do that. Otherwise it's very confusing that symbols from same file could be fixed up externally by relocate() or self-relocated. - to make it clear, define a new symbol ld_entry_end representing the end of the boot code that needs manually fixup, and use this symbol in relocate() to filter out all symbols belong to the entry sections. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Chen, Zide	2aa8c9e5d4	hv: add multiboot2 tags to load relocatable raw binary GRUB multiboot2 doesn't support relocation for ELF, which means it can't load acrn.32.out to other address other than the one specified in ELF header. Thus we need to use the raw binary file acrn.bin, and add address/entry address/relocatable tags to instruct multiboot2 loader how to load the raw binary. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Chen, Zide	97fc0efe20	hv: remove unused cpu_primary_save_32() In direct boot mode, boot_context[] which is saved from cpu_primary_save_32() is no longer used since commit `6beb34c3cb` ("vm_load: update init gdt preparation"). Thus, the call to it and the function itself can be removed. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Li Fei1	7c82efb938	hv: pci: add some pre-assumption and safety check for PCIe ECAM Add some pre-assumption and safety check for PCIe ECAM: 1) ACRN only support platforms with PCIe ECAM to access PCIe device CFG space; 2) Must not use ECAM to access PCIe device CFG space before pci_switch_to_mmio_cfg_ops was called. (In release version, ACRN didn't support IO port Mechanism. ECAM is the only way to access the PCIe device CFG space). Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-05 15:42:53 +08:00
Zide Chen	93fa2bc0fc	hv: minor fixes in init_paging() - change variable name from hpa to hva because in this function we are dealing with hva, not hpa. - can get the address of ld_text_end by directly referring to this symbol, because relative addressing yields the correct hva, not the hva before relocation. Tracked-On: #4441 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-05 10:18:56 +08:00
Vijay Dhanraj	b6c0558b60	HV: Update existing board.c files for RDT MBA This patch updates board.c files for RDT MBA on existing platforms. Also, fixes setting RDT flag in WHL config file. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Vijay Dhanraj	92ee33b035	HV: Add MBA support in ACRN This patch adds RDT MBA support to detect, configure and set up MBA throttle registers based on VM configuration. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Yuan Liu	320ed6c238	hv: refine init_one_dev_config The init_one_dev_config is used to initialize an acrn_vm_pci_dev_config. SRIOV needs an explicit acrn_vm_pci_dev_config to create a VF vdev, so refine it to return acrn_vm_pci_dev_config. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Conghui Chen	595cefe3f2	hv: xsave: move assembler to individual function Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors assembler to individual functions. Tracked-On: #4436 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 17:55:06 +08:00
Yuan Liu	5e989f13c6	hv: check if there is enough room for all SRIOV VFs. Make the SRIOV-Capable device invisible from SOS if there is no room for all of its virtual functions. v2: fix an issue where, if a PF has been dropped, the subsequent PF will be dropped too even if there is room for its VFs. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Zide Chen	c751a8e88b	hv: refine confusing e820 table logging layout It puts the new line in the wrong place, and the logs are confusing. For example, for these entries: mmap[0] - type: 1, base: 0x00000, length: 0x9800 mmap[1] - type: 2, base: 0x98000, length: 0x8000 mmap[2] - type: 3, base: 0xc0000, length: 0x4000 Currently it prints them in this way: mmap table: 0 type: 0x1 Base: 0x0000000000000000 length: 0x0000000000098000 mmap table: 1 type: 0x2 Base: 0x0000000000098000 length: 0x0000000000008000 mmap table: 2 type: 0x3 Base: 0x00000000000c0000 length: 0x0000000000040000 With this fix, it looks like the following, and now it's of same style with how prepare_sos_vm_memmap() logs ve820 tables. mmap table: 0 type: 0x1 Base: 0x0000000000000000 length: 0x0000000000098000 mmap table: 1 type: 0x2 Base: 0x0000000000098000 length: 0x0000000000008000 mmap table: 2 type: 0x3 Base: 0x00000000000c0000 length: 0x0000000000040000 Tracked-On: #1842 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-28 09:34:17 +08:00
Conghui Chen	c246d1c9b8	hv: xsave: bugfix for init value The init value for XCR0 and XSS should be the same with spec: In SDM Vol1 13.3: XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is always 1. The other bits in XCR0 are all 0 coming out of RESET. The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET. The previous code try to fix the xsave area leak to other VMs during init phase, but bring the error to linux. Besides, it cannot avoid the possible leak in running phase. Need find a better solution. Tracked-On: #4430 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 09:19:29 +08:00
Vijay Dhanraj	cef3322da8	HV: Add WhiskeyLake board configuration files This patch adds offline tool generated WhiskeyLake board configurations files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	eaad91fd71	HV: Remove RDT code if CONFIG_RDT_ENABLED flag is not set This patch does the following, 1. Removes RDT code if CONFIG_RDT_ENABLED flag is not set. 2. Set the CONFIG_RDT_ENABLED flag only on platforms that support RDT so that build scripts will automatically reflect the config. Tracked-On: #3715 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	d0665fe220	HV: Generalize RDT infrastructure and fix RDT cache configuration. This patch creates a generic infrastructure for RDT resources instead of just L2 or L3 cache. This patch also fixes L3 CAT config overwrite by L2 in cases where both L2 and L3 CAT are supported. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	887e3813bc	HV: Add both HW and SW checks for RDT support There can be times when the user unknowingly enables the CONFIG_CAT_ENABLED SW flag, but the hardware might not support L3 or L2 CAT. In such a case software can end up writing to the CAT MSRs, which can cause undefined results. The patch fixes the issue by enabling CAT only when both HW as well as software via the CONFIG_CAT_ENABLED supports CAT. The patch also addresses a typo in the "clos2prq_msr" function name. It should be "clos2pqr_msr" instead. PQR stands for platform qos register. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	b8a021d658	HV: split L2 and L3 cache resource MSR Upcoming intel platforms can support both L2 and L3 but our current code only supports either L2 or L3 CAT. So split the MSRs so that we can support allocation for both L2 and L3. This patch does the following, 1. splits programming of L2 and L3 cache resource based on the resource ID. 2. Replace generic platform_clos_array struct with resource specific struct in all the existing board.c files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	2597429903	HV: Rename cat.c/.h files to rdt.c/.h As part of rdt cat refactoring, the goal is to combine all rdt specific features such as CAT under one module. So rename rdt resource specific files such as cat.c/.h to generic rdt.c/.h files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Yonghua Huang	b2c6cf7753	hv: refine retpoline speculation barriers Per Section 4.4 Speculation Barriers, in "Retpoline: A Branch Target Inject Mitigation" white paper, "LFENCE instruction limits the speculative execution that a processor implementation can perform around the LFENCE, possibly impacting processor performance,but also creating a tool with which to mitigate speculative-execution side-channel attacks." Tracked-On: #4424 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-02-26 09:24:54 +08:00
Victor Sun	da3d181f62	HV: init efi info with multiboot2 Initialize efi info of acrn mbi when boot from multiboot2 protocol, with this patch hypervisor could get host efi info and pass it to Linux zeropage, then make guest Linux possible to boot with efi environment; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	69da0243f5	HV: init module and rsdp info with multiboot2 Initialize module info and ACPI rsdp info of acrn mbi when booting from multiboot2 protocol; with this patch SOS VM could be loaded successfully with correct ACPI RSDP; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	d008b72fdd	HV: add multiboot2 header info Add multiboot2 header info in HV image so that bootloader could recognize it. Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	19ffaa50dc	HV: init and sanitize acrn multiboot info Initialize and sanitize an acrn specific multiboot info struct with the currently supported multiboot1 in the very early boot stage, which brings the following benefits: - don't need to do hpa2hva conversion every time when referring to boot_regs; - panic early if sanitizing multiboot info fails, so there is no need to check multiboot info pointer/flags and panic later in the boot process; - keep most code unchanged when introducing multiboot2 support in future; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	520a0222d3	HV: re-arch boot component header The patch re-arch boot component header files by: - moving multiboot.h from include/arch/x86/ to boot/include/ and keep this header for multiboot1 protocol data struct only; - moving multiboot related MACROs in cpu_primary.S to multiboot.h; - creating an independent boot.h to store acrn specific boot information for other files' reference; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	708cae7c88	HV: remove DBG_LEVEL_PARSE - It is meaningless to enable debug function in parse_hv_cmdline() because the function runs in a very early stage and uart has not been initialized at that time, so remove this debug level definition; - Rewrite parse_hv_cmdline() function to make it compliant with MISRA-C; - Decouple uart16550 stuff from Init.c module and let console.c handle it; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Conghui Chen	a7563cb9bd	hv: sched_bvt: add BVT scheduler BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU. It has the concept of virtual time; the vCPU with the earliest virtual time is dispatched first. Main concepts: tick timer: a periodic tick is used to measure the physical time in units of MCU (minimum charging unit). runqueue: threads in the runqueue are ordered by virtual time. weight: each thread receives a share of the pCPU in proportion to its weight. context switch allowance: the physical time by which the current thread is allowed to advance beyond the next runnable thread. warp: a thread with warp enabled will have a chance to subtract a value (Wi) from its virtual time to achieve higher priority. virtual time: AVT: actual virtual time, advance in proportional to weight. EVT: effective virtual time. EVT <- AVT - ( warp ? Wi : 0 ) SVT: scheduler virtual time, the minimum AVT in the runqueue. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Yonghua Huang	64b874ce4c	hv: rename BOOT_CPU_ID to BSP_CPU_ID 1. Rename BOOT_CPU_ID to BSP_CPU_ID 2. Replace hardcoded value with BSP_CPU_ID when ID of BSP is referenced. Tracked-On: #4420 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-02-25 09:08:14 +08:00
Li Fei1	e8479f84cd	hv: vPCI: remove passthrough PCI device unused code Now that we split passthrough PCI device from DM to HV, we could remove all the passthrough PCI device unused code. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	dafa3da693	vPCI: split passthrough PCI device from DM to HV In this case, we could handle all the passthrough PCI devices in ACRN hypervisor. But we still need DM to initialize BAR resources and Intx for passthrough PCI device for post-launched VM since this information should be filled into ACPI tables. So 1. we add a HC vm_assign_pcidev to pass the extra information to replace the old vm_assign_ptdev. 2. we also remove HC vm_set_ptdev_msix_info since it can now be set by the post-launched VM, same as SOS. 3. remove vm_map_ptdev_mmio call for PTDev in DM since ACRN hypervisor will handle these BAR accesses. 4. the most important thing is to trap PCI configuration space access for PTDev in HV for post-launched VM and bypass the virtual PCI device configuration space access to DM. This patch doesn't do the cleanup work. Will do it in the next patch. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	fe3182ea05	hv: vPCI: add assign/deassign PCI device HC APIs Add assign/deassign PCI device hypercall APIs to assign a PCI device from SOS to post-launched VM or deassign a PCI device from post-launched VM to SOS. This patch is prepared for splitting passthrough PCI device from DM to HV. The old assign/deassign ptdev APIs will be discarded. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Wei Liu	f3a4b2325f	hv: add P2SB device to whitelist for apl-mrb apl-mrb need to access P2SB device, so add 00:0d.0 P2SB device to whitelist for platform pci hidden device. Tracked-On: #3475 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-02-24 12:21:29 +08:00
Junming Liu	1303861d26	hv:enable gpu iommu except APL platforms To enable gvt-d,need to allow the GPU IOMMU. While gvt-d hasn't been enabled on APL yet, so let APL disable GPU IOMMU. v2 -> v3: * let APL platforms disable GPU IOMMU. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com>	2020-02-24 11:47:10 +08:00
Junming Liu	1f1eb7fdba	hv:disable iommu snoop control to enable gvt-d by an option If one of the enabled VT-d DMAR units doesn’t support snoop control, then bit 11 of leaf PTE of EPT is not set, since the field is treated as reserved(0) by VT-d hardware implementations not supporting snoop control. GPU IOMMU doesn’t support snoop control; this patch adds an option to disable iommu snoop control for gvt-d. v2 -> v3: * refine the macro name and description. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 11:47:10 +08:00
Shuo A Liu	53de3a727c	hv: reset vcpu events in reset_vcpu On UEFI UP2 board, APs might execute HLT before SOS kernel INIT them. After SOS kernel take over and will re-init the APs directly. The flows from HV perspective is like: HLT trap: wait_event(VCPU_EVENT_VIRTUAL_INTERRUPT) -> sleep_thread SOS kernel INIT, SIPI APs: pause_vcpu(ZOMBIE) -> sleep_thread -> reset_vcpu -> launch_vcpu -> wake_vcpu However, the last wake_vcpu will fail because the cpu event VCPU_EVENT_VIRTUAL_INTERRUPT had not got signaled. This patch will reset all vcpu events in reset_vcpu. If the thread was previously waiting for a event, its waiting status will be cleared and launch_vcpu will wake it to running. Tracked-On: #4402 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-23 16:27:57 +08:00
Zide Chen	cc6f094926	hv: CAT is supposed to be enabled in the system level In platforms that support CAT, when it is enabled by ACRN, i.e. IA32_resourceType_MASK_n registers are programmed with customized values, it has impacts to the whole system. The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be enabled in some guests, but not in others who don't have this flag, which is conceptually incorrect. This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[] defines a set of system-wide Class of Service (COS, or CLOS), and the per guest vm_configs[].clos associates the guest with particular CLOS. Tracked-On: #2462 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-17 08:51:59 +08:00
Zide Chen	f3249e77bd	hv: enable early pr_xxx() logs Currently panic() and pr_xxx() statements before init_primary_pcpu_post() won't be printed, which is inconvenient and misleading for debugging. This patch makes pr_xxx() APIs working before init_pcpu_pre(): - clear .bss in init.c, which makes sense to clear .bss at the very beginning of initialization code. Also this makes it possible to call init_logmsg() before init_pcpu_pre(). - move parse_hv_cmdline() and uart16550_init(true) to init.c. - refine ticks_to_us() to handle the case that it's called before calibrate_tsc(). As a side effect, it prints "0us" in early pr_xxx() calls. - call init_debug_pre() in init_primary_pcpu() and after this point, both printf() and pr_xxx() APIs are available. However, this patch doesn't address the issue that pr_xxx() could be called on PCPUs that set_current_pcpu_id() hasn't been called, which implies that the PCPU ID shown in early logs may not be accurate. Tracked-On: #2987 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-11 08:53:56 +08:00
Zide Chen	086e0f19d8	hv: fix pcpu_id mask issue in smp_call_function() INVALID_BIT_INDEX has 16 bits only, which removes all pcpu_id that is >= 16 from the destination mask. Tracked-On: #4354 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-01-17 09:20:53 +08:00
Yonghua Huang	fd4775d044	hv: rename VECTOR_XXX and XXX_IRQ Macros 1. Align the coding style for these MACROs 2. Align the values of fixed VECTORs Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Yonghua Huang	b90862921e	hv: rename the ACRN_DBG_XXX Refine this MACRO 'ACRN_DBG_XXX' to 'DBG_LEVEL_XXX' Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Shuo A Liu	b59e5a870a	hv: Disable HLT and PAUSE-loop exiting emulation in lapic passthrough In lapic passthrough mode, it should passthrough HLT/PAUSE execution too. This patch disable their emulation when switch to lapic passthrough mode. Tracked-On: #4329 Tested-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Shuo A Liu	db708fc3e8	hv: rename is_completion_polling to is_polling_ioreq is_polling_ioreq is more straightforward. Rename it. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Li Fei1	65ed6c3529	hv: vpci: trap PCIe ECAM access for SOS SOS will use PCIe ECAM access PCIe external configuration space. HV should trap this access for security(Now pre-launched VM doesn't want to support PCI ECAM; post-launched VM trap PCIe ECAM access in DM). Besides, update PCIe MMCONFIG region to be owned by hypervisor and expose and pass through platform hide PCI devices by BIOS to SOS. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	1e50ec8899	hv: pci: use ECAM to access PCIe Configuration Space Use Enhanced Configuration Access Mechanism (MMIO) instead of PCI-compatible Configuration Mechanism (IO port) to access PCIe Configuration Space PCI-compatible Configuration Mechanism (IO port) access is used for UART in debug version. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	65f3751ea3	hv: pci: add hide pci devices configuration for apl-up2 Other Platforms are not added for now. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
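The BVT scheduler commit (a7563cb9bd) in this listing defines effective virtual time as EVT <- AVT - ( warp ? Wi : 0 ) and SVT as the minimum AVT in the runqueue. A minimal sketch of those two definitions and of "earliest virtual time is dispatched first" might look like the following. The struct, field and function names are hypothetical and chosen for illustration only; they are not taken from the ACRN sources, and the sketch leaves out the tick timer, weights and context switch allowance.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state; not ACRN's actual data structure. */
struct bvt_thread {
    int64_t avt;   /* actual virtual time (AVT) */
    int64_t warp;  /* warp value Wi */
    bool warp_on;  /* whether warp is currently enabled */
};

/* EVT <- AVT - (warp ? Wi : 0), as stated in the commit message. */
static int64_t bvt_evt(const struct bvt_thread *t)
{
    return t->avt - (t->warp_on ? t->warp : 0);
}

/* SVT: the minimum AVT among runnable threads. Assumes n >= 1. */
static int64_t bvt_svt(const struct bvt_thread *rq, size_t n)
{
    int64_t svt = rq[0].avt;
    for (size_t i = 1; i < n; i++) {
        if (rq[i].avt < svt) {
            svt = rq[i].avt;
        }
    }
    return svt;
}

/* Return the index of the thread with the smallest EVT. Assumes n >= 1. */
static size_t bvt_pick_next(const struct bvt_thread *rq, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (bvt_evt(&rq[i]) < bvt_evt(&rq[best])) {
            best = i;
        }
    }
    return best;
}
```

With three threads whose AVTs are 100, 120 and 90, and warp (Wi = 50) enabled only on the second, the second thread has EVT 70 and is picked even though its AVT is the largest, which is the priority boost the commit message attributes to warp.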

2284 Commits