acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-07-11 06:15:05 +00:00

Author	SHA1	Message	Date
Shuang Zheng	13d39fda85	hv: update hybrid_rt with 2 post-launched VMs in Kconfig update the help message of config SCENARIO to set 2 standard post-launched VMs for default hybrid_rt scenario in Kconfig. Tracked-On: #5390 Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-10-14 14:00:45 +08:00
Yuan Liu	38e2903770	hv: move mem_regions to ivshmem.c This is a bug fix that avoids multiple declarations of mem_regions Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-22 09:28:24 +08:00
dongshen	ef9a961523	acrn-config/hv: create new file pt_intx_c.py to generate the pt_intx.c file Move struct pt_intx_config vm0_pt_intx[] definition to pt_intx.c so that vm_configurations.h/vm_configurations.c are consistent for different boards Tracked-On: #5229 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-09-16 10:37:09 +08:00
Victor Sun	c63899fc81	HV: correct hpa calculation for pre-launched VM The commit of `da81a0041d` "HV: add e820 ACPI entry for pre-launched VM" introduced an issue that the base_hpa and remaining_hpa_size are also calculated on the entry of the 32-bit PCI hole, which spans 0x80000000 to 0xffffffff, which is incorrect; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-09-15 09:45:10 +08:00
Victor Sun	8b86714af8	HV: fix uart hang issue caused by bdf overridden On a PCI type HV uart, the bdf value is in a union together with mmio_base_vaddr, then the value would be overridden by mmio_base_addr in uart16550_init(), so is_pci_dbg_uart() returns a wrong value and the uart hangs. Tracked-On: #5288 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-09 10:10:06 +08:00
Li Fei1	a2fd8c5a9d	pci: mcfg: limit device bus numbers which could access by ECAM Per PCI Firmware Specification Revision 3.0, 4.1.2. MCFG Table Description: Memory Mapped Enhanced Configuration Space Base Address Allocation Structure assign the Start Bus Number and the End Bus Number which could decoded by the Host Bridge. We should not access the PCI device which bus number outside of the range of [Start Bus Number, End Bus Number). For ACRN, we should: 1. Don't detect PCI device which bus number outside the range of [Start Bus Number, End Bus Number) of MCFG ACPI Table. 2. Only trap the ECAM MMIO size: [MMCFG_BASE_ADDRESS, MMCFG_BASE_ADDRESS + (End Bus Number - Start Bus Number + 1) * 0x100000) for SOS. Tracked-On: #5233 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-09 09:31:56 +08:00
Shuang Zheng	03036062cd	makefile: compile ACPI tables for pre-launched VMs to one binary compile ACPI tables for pre-launched VMs to one binary when pre-build hypervisor. Tracked-On: #5266 Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	2c0bc146ce	HV: remove deprecated vacpi build method The old method of build pre-launched VM vacpi by HV source code is deprecated, so remove related source code; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	34547e1e19	HV: add acpi module support for pre-launched VM Previously we use a pre-defined structure as vACPI table for pre-launched VM, the structure is initialized by HV code. Now change the method to use a pre-loaded multiboot module instead. The module file will be generated by acrn-config tool and loaded to GPA 0x7ff00000, a hardcoded RSDP table at GPA 0x000f2400 will point to the XSDT table which at GPA 0x7ff00080; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	4290a79951	HV: refine get multiboot module API change API of uint32_t get_mod_idx_by_tag(const struct multiboot_module mods, uint32_t mods_count, const char tag) to struct multiboot_module get_mod_by_tag(const struct acrn_multiboot_info mbi, const char *tag) to simplify caller interface; Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Victor Sun	da81a0041d	HV: add e820 ACPI entry for pre-launched VM Previously the ACPI table was stored in F segment which might not be big enough for a customized ACPI table, hence reserve 1MB space in pre-launched VM e820 table to store the ACPI related data: 0x7ff00000 ~ 0x7ffeffff : ACPI Reclaim memory 0x7fff0000 ~ 0x7fffffff : ACPI NVS memory Tracked-On: #5266 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-08 19:52:25 +08:00
Qian Wang	0267cc4ef1	HV: Fix SR-IOV problem on EHL hv: vpci: Add 0x45, which is the high-byte of device id of EHL, to the enumeration array in vhostbridge.c. This is to fix the problem that PCIe extended capabilities like SR-IOV cannot be used on EHL. Tracked-On: #5256 Signed-off-by: Qian Wang <qian1.wang@intel.com>	2020-09-08 08:44:56 +08:00
Victor Sun	0461ac209f	HV: set CONFIG_HV_RAM_START as min addr when RELOC enabled Previously the min load_addr for HV image is hard coded to 0x10000000 when CONFIG_RELOC is enabled, now use CONFIG_HV_RAM_START as its prefer minimum address like setting of CONFIG_PHYSICAL_START do in Linux kernel. With this patch, we can offload the CONFIG_HV_RAM_START algorithm to acrn-config or manually set it in scenario XML on some special boards. Tracked-On: #5275 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-09-07 15:03:53 +08:00
Nishioka, Toshiki	77fb21e98c	hv: add vgpio device model support When HV pass through the P2SB MMIO device to pre-launched VM, vgpio device model traps MMIO access to the GPIO registers within P2SB so that it can expose virtual IOAPIC pins to the VM in accordance with the programmed mappings between gsi and vgsi. Tracked-On: #5246 Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:52:02 +08:00
Nishioka, Toshiki	ba99984f69	hv: add INTx mapping for pre-launched VMs Add the capability of forwarding specified physical IOAPIC interrupt lines to pre-launched VMs as virtual IOAPIC interrupts. This is for the sake of the certain MMIO pass-thru devices on EHL CRB which can support only INTx interrupts. Tracked-On: #5245 Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:52:02 +08:00
Stanley Chang	871a662a6c	hv: support PIO access to platform hidden devices Kernel driver and ACPI ASL may access a platform hidden device thru PIO, e.g., Intel ICH LPC driver. If the access is originated in SOS or Pre-launched OS, vpci_pio_cfgdata_write/read should support it. This commit also reworks vpci_write_cfg/vpci_read_cfg to do the access check and eliminates the access from post-launched VM (that should be handled by DM). Tracked-On: #5257 Signed-off-by: Stanley Chang <stanley.chang@intel.com> Reviewed-by: Li Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-07 14:08:40 +08:00
Shuo A Liu	d6b9682581	hv: debug: Convert PCI UART parameter from a BDF string to a hex value BDF string can be parsed by the configuration tool. A 16bit WORD value with format (B:8, D:5, F:3) can be passed from configuration to the hypervisor directly to save some BDF string parse code. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 15:13:53 +08:00
dongshen	3880e6186e	hv: add pt_intx related members to struct acrn_vm_config On EHL platform, we need to expose GPIO chassis interrupt to pre-launched VM as INTx. Add related data structures so that they can be used in subsequent commits. Tracked-On: #5241 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-09-01 09:35:50 +08:00
dongshen	10d4773f1d	hv: add a new field pt_p2sb_bar to struct acrn_vm_config On EHL platform, we need to pass through P2SB bridge to pre-launched VM. Use pt_p2sb_bar to indicate whether to passthru p2sb bridge to pre-launched VM or not. Tracked-On: #5221 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-09-01 09:35:50 +08:00
Yonghua Huang	c03623f3fb	hv[v2]: Remove deprecated term in vPIC submodule This patch cleanup below deprecated terms: 'master' -> 'primary' 'slave' -> 'secondary' v2 update: Refine comments. Tracked-On: #5249 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-09-01 09:30:08 +08:00
Stanley Chang	d55813e80b	hv: passthru DRHD-ignored device When trying to passthru a DRHD-ignored PCI device, iommu_attach_device shall report success. Otherwise, the assign_vdev_pt_iommu_domain will result in HV panic. Same for iommu_detach_device case. Tracked-On: #5240 Signed-off-by: Stanley Chang <stanley.chang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 09:29:25 +08:00
Shuo A Liu	902ed60806	hv: Restrain several hypercalls which may impact target VM Some hypercalls to a target VM are only acceptable in some certain states, else it impacts target VM. Add some restrictive status checks to avoid that. Tracked-On: #5208 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 09:00:45 +08:00
Shuo A Liu	e587f029de	hv: Add severity check against SOS hypercalls Virtual interrupts injection and memory mapping operations can impact target VM. By design, these type of operations from lower severity VM to higher severity VM should be blocked by the hypervisor. While the hypercalls are the interface between SOS VM and the hypervisor, severity checks can be implemented at the beginning of hypercalls needed. Added severity checks in below hypercalls: * hcall_set_vm_memory_regions() * hcall_notify_ioreq_finish() * hcall_set_irqline() * hcall_inject_msi() * hcall_write_protect_page() Tracked-On: #5208 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-09-01 09:00:45 +08:00
Yuan Liu	1b711ed629	hv: ignore the initialization of vdevs whose vbdf is unassigned if device configuration vbdf is unassigned, then the corresponding vdev will not be initialized, instead, the vdev will be initialized by device model through hypercall. Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-28 16:53:12 +08:00
Yuan Liu	6d0f0ebd8a	hv: implement ivshmem device creation and destruction For ivshmem vdev creation, the vdev vBDF, vBARs, shared memory region name and size are set by device model. The shared memory name and size must be same as the corresponding device configuration which is configured by offline tool. v3: add a comment to the vbar_base member of the acrn_vm_pci_dev_config structure that vbar_base is power-on default value Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-28 16:53:12 +08:00
Yuan Liu	8a34cf03ca	hv: add new hypercalls to create and destroy an emulated device in hypervisor Add HC_CREATE_VDEV and HC_DESTROY_VDEV two hypercalls that are used to create and destroy an emulated device(PCI device or legacy device) in hypervisor v3: 1) change HC_CREATE_DEVICE and HC_DESTROY_DEVICE to HC_CREATE_VDEV and HC_DESTROY_VDEV 2) refine code style v4: 1) remove unnecessary parameter 2) add VM state check for HC_CREATE_VDEV and HC_DESTROY hypercalls Tracked-On: #4853 Reviewed-by: Wang, Yu1 <yu1.wang@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-28 16:53:12 +08:00
Wei Liu	29ac258134	acrn-config: code refactoring for CAT/MBA 1.Modify clos_mask and mba_delay as a member of the union type. 2.Move HV_SUPPORTED_MAX_CLOS ,MAX_CACHE_CLOS_NUM_ENTRIES and MAX_MBA_CLOS_NUM_ENTRIES to misc_cfg.h file. Tracked-On: #5229 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
dongshen	a425730f64	acrn-config: rename MAX_PLATFORM_CLOS_NUM to HV_SUPPORTED_MAX_CLOS HV_SUPPORTED_MAX_CLOS: This value represents the maximum CLOS that is allowed by ACRN hypervisor. This value is set to be least common Max CLOS (CPUID.(EAX=0x10,ECX=ResID):EDX[15:0]) among all supported RDT resources in the platform. In other words, it is min(maximum CLOS of L2, L3 and MBA). This is done in order to have consistent CLOS allocations between all the RDT resources. Tracked-On: #5229 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-08-28 16:44:06 +08:00
Yin Fengwei	d0e06c4f80	hv: debug: Enable MMIO UART support New board, EHL CRB, does not have legacy port IO UART. Even the PCI UART does not work due to a BIOS bug workaround (the BARs on LPSS PCI are reset after the BIOS hands over control to the OS). For ACRN console usage, expose the debug UART via ACPI PnP device (access by MMIO) and add support in hypervisor debug code. Another special thing is that register width of UART of EHL CRB is 1byte. Introduce reg_width for each struct console_uart. Tracked-On: #4937 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2020-08-27 13:31:17 +08:00
Mingqiang Chi	53b11d1048	refine hypercall -- use an array to fast locate the hypercall handler to replace switch case. -- uniform hypercall handler as below: int32_t (*handler)(sos_vm, target_vm, param1, param2) Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2020-08-26 14:55:24 +08:00
Geoffroy Van Cutsem	f8883f43e9	hv: enhance help text for the scenario option in Kconfig Enhance the help text that accompanies the CONFIG_SCENARIO symbol in Kconfig Tracked-On: #5203 Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2020-08-26 08:51:50 +08:00
Shuo A Liu	7602304692	hv: Fix thread status mess if wake_thread() happens in transition stage `2abbb99f6a` ("hv: make thread status more accurate") introduced a transition stage, marked as var be_blocking, between RUNNING->BLOCKED of thread status. wake_thread() does not work in this transition stage because it only checks thread->status. Need to check thread->be_blocking as well in wake_thread(). When wake_thread() happens in the transition stage, the previous sleep operation rolled back. Tracked-On: #5190 Fixes: `2abbb99f6a` ("hv: make thread status more accurate") Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2020-08-20 10:32:31 +08:00
Yuan Liu	f60896951b	hv: change log level for find_match_mmio_node Replace pr_fatal with pr_info to reduce printing logs Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	43683c7fc9	hv: implement ivshmem device-specific registers emulation Ivshmem device defines four registers including Interrupt Mask, Interrupt Status, IVPosition and Doorbell. The first two are useless and no emulation is required. The latter two are used for interrupts and will be implemented in the future. This patch also introduces a new priv_data member for structure pci_vdev, it can be used to find an ivshmem device through pci_vdev. v2: refine code style v3: 1) add @pre for ivshmem_mmio_handler function 2) refine code style v4: 1) set ivshmem registers default value when vBAR mapping 2) change find_ivshmem_device to set_ivshmem_device v5: 1) change set_ivshmem_device to find_and_set_ivshmem_device 2) add an ASSERT to check if the vdev->priv_data is set successfully v6: change find_and_set_ivshmem_device to create_ivshmem_device Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	b6661e48d8	hv: implement configuration space operations of ivshmem device Implement read_vdev_cfg/write_vdev_cfg operations for ivshmem device v2: read_vdev_cfg/write_vdev_cfg always return zero, the ivshmem device only emulated in HV. Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	0f5ccab68e	hv: code cleanup for vBAR writing This patch introduces vpci_update_one_vbar API to simplify vBAR mapping/unmapping when vBAR writing. v2: refine commit message v4: refine commit message Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	24fe34630d	hv: initialize BARs of ivshmem device ivshmem device supports two BARs, BAR 0 is used for inter-VM notification mechanism, BAR 2 is used to provide shared memory base address and size. v4: check if the return value of get_shm_region function is NULL v5: 1) change get_shm_region to find_shm_region 2) add print log when ivshmem device doesn't find memory region Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Shuang Zheng	c26ae8c420	hv: Inter-VM communication config for hybrid_rt on whl-ipc-i5 add an IVSHMEM region and the related configuration parameters in hybrid_rt scenario on whl-ipc-i5. The size of the shared memory is 2M, and it is used for the communication between VM0 and VM2. v6: rename shm name; remove unnecessary MACROs. v7: rename MACRO for shm name; add unassigned vbdf for post-launched VMs. Tracked-On: #4853 Signed-off-by: Shuang Zheng <shuang.zheng@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	92f9f5a4f3	hv: add ivshmem device Ivshmem device is used for shared memory based communication between pre-launched/post-launched VMs. this patch implements ivshmem device configuration space initialization and ivshmem device operation methods. v2: introduce init_one_pcibar interface to simplify BAR initialization operation of HV emulated PCI device. v3: 1) due to init_one_pcibar API is only used for pre-launched VM vdevs it can't be applied to all vdevs, so remove it. 2) move ivshmem BARs initialization to subsequent patch, this patch only introduce ivshmem configuration space initialization. Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Yuan Liu	d6f563c4eb	hv: implement ivshmem memory regions initialization The ivshmem memory regions use the memory of the hypervisor and they are continuous and page aligned. this patch is used to initialize each memory region hpa. v2: 1) if CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED is not defined, the entire code of ivshmem will not be compiled. 2) change ivshmem shared memory unit from byte to page to avoid misconfiguration. 3) add ivshmem configuration and vm configuration references v3: 1) change CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED to CONFIG_IVSHMEM_ENABLED 2) remove the ivshmem configuration sample, offline tool provides default ivshmem configuration. 3) refine code style. v4: 1) make ivshmem_base 2M aligned. Tracked-On: #4853 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-19 15:06:15 +08:00
Wei Liu	088cd62d8b	HV: sync hv reference code that generated by config tool Sync hv reference code that generated by acrn-config tool. Tracked-On: #5092 Signed-off-by: Wei Liu <weix.w.liu@intel.com>	2020-08-17 14:34:30 +08:00
Junming Liu	23d9c13c41	hv:cpuid:refine cpuid_subleaf interface There's a corner case: when getting the CPUID.01H:EDX value, a caller may have the following code snippet: uint32_t unused,edx; cpuid_subleaf(0x1U, 0x0U, &unused, &unused, &unused, &edx); while in cpuid_subleaf: eax = leaf; ecx = subleaf; eax and ecx point to the same location. When going deep into asm_cpuid, its input value will be 0x0U and 0x0U, but the expected input value is 0x1U and 0x0U. This case will return CPUID.00H:EDX, which is the wrong answer. Tracked-On: #4526 Signed-off-by: Junming Liu <junming.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-17 10:14:00 +08:00
Junming Liu	3631a85c3c	hv:cpu-caps:refine is_apl_platform func and clean up duplicated code Fix the bug for "is_apl_platform" func. "monitor_cap_buggy" is identical to "is_apl_platform", so remove it. On apl platform: 1) ACRN doesn't use monitor/mwait instructions 2) ACRN disable GPU IOMMU Tracked-On:#3675 Signed-off-by: Junming Liu <junming.liu@intel.com>	2020-08-14 10:08:50 +08:00
liujunming	538e7cf74d	hv:cpu-caps:refine processor family and model info v3 -> v4: Refine commit message and code style 1. SDM Vol. 2A 3-211 states DisplayFamily = Extended_Family_ID + Family_ID when Family_ID == 0FH. So it should be family += ((eax >> 20U) & 0xffU) when Family_ID == 0FH. 2. IF (Family_ID = 06H or Family_ID = 0FH) THEN DisplayModel = (Extended_Model_ID << 4) + Model_ID; While the previous code used this logic: IF (DisplayFamily = 06H or DisplayFamily = 0FH) Fix the bug about calculation of display family and display model according to SDM definition. 3. use variable name to distinguish Family ID/Display Family/Model ID/Display Model, then the code is more clear to avoid some mistake Tracked-On:#3675 Signed-off-by: liujunming <junming.liu@intel.com> Reviewed-by: Wu Xiangyang <xiangyang.wu@linux.intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-14 10:08:50 +08:00
Victor Sun	b5dfe369da	HV: move vm configuration check to pre-build time This patch will move the VM configuration check to pre-build stage, a test program will do the check for pre-defined VM configuration data before making hypervisor binary. If test failed, the make process will be aborted. So once the hypervisor binary is built successfully or start to run, it means the VM configuration has been sanitized. The patch did not add any new VM configuration check function, it just port the original sanitize_vm_config() function from cpu.c to static_checks.c with below change: 1. remove runtime rdt detection for clos check; 2. replace pr_err() from logmsg.h with printf() from stdio.h; 3. replace runtime call get_pcpu_nums() in ALL_CPUS_MASK macro with static defined MAX_PCPU_NUM; 4. remove cpu_affinity check since pre-launched VM might share pcpu with SOS VM; The BOARD/SCENARIO parameter check and configuration folder check is also moved to prebuild Makefile. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-12 10:21:17 +08:00
Victor Sun	8245145317	HV: remove sanitize_vm_config function Remove function of sanitize_vm_config() since the processing of sanitizing will be moved to pre-build process. When hypervisor has booted, we assume all VM configurations is sanitized; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-12 10:21:17 +08:00
Wei Liu	e2a5c56840	Makefile: modify the realpath to abspath function The realpath function returns an empty string when the directory or file does not exist, so use the abspath function instead of realpath. Tracked-On: #5146 Signed-off-by: Wei Liu <weix.w.liu@intel.com>	2020-08-08 17:07:48 +08:00
Mingqiang Chi	a67a85c70d	hv:refine vm & vcpu lock -- move vm_state_lock to other place in vm structure to avoid the memory waste because of the page-aligned. -- remove the memset from create_vm -- explicitly set max_emul_mmio_regions and vcpuid_entry_nr to 0 inside create_vm to avoid use without initialization. -- rename max_emul_mmio_regions to nr_emul_mmio_regions v1->v2: add deinit_emul_io in shutdown_vm Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-05 13:39:28 +08:00
Victor Sun	af9867f4cc	Makefile: fix issue on make menuconfig The BOARD/SCENARIO environment variable should be passed in one shell command; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	b9ad04d24d	HV: add cpu affinity info for SOS VM Previously the CPU affinity of SOS VM is initialized at runtime during sanitize_vm_config() stage, following the policy that all physical CPUs except those occupied by Pre-launched VMs belong to SOS_VM. Now change the process so that SOS CPU affinity is initialized at build time, with the assumption that its validity is guaranteed before runtime. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	3acafd140f	HV: Make: simplify acpi info header file check Previously we have complicated check mechanism on platform_acpi_info.h which is supposed to be generated by acrn-config tool, but given the reality that all configurations should be generated by acrn-config before build acrn hypervisor, this check is not needed anymore. Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	07ad37f436	HV: Make: remove sdc scenario build support The SDC scenario configurations will not be validated so remove it from build makefile; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-04 09:05:29 +08:00
Victor Sun	8f0b0472f6	HV: correct RO mask of MSI cap structure In MSI Capability Structure, bit 7 (64 bit address capable) of MSICTRL is RO; Tracked-On: #5125 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Li Fei <fei1.li@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-08-03 13:40:27 +08:00
Conghui Chen	d531e84d32	hv: fix deadloop in sleep_thread_sync As we only set BLOCKED status in context switch_out, which means, only running thread can be changed to BLOCKED, but runnable thread can not. This leads to the deadloop in sleep_thread_sync. To solve the problem, in sleep_thread, we set the status to BLOCKED directly when the original thread status is RUNNABLE. Tracked-On: #5115 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-07-30 16:54:19 +08:00
lirui34	61aa89da12	HV: fix hide all sriov in ecap When VM read pre-sriov header in ECAP of ptdev, only emulate the reading if SRIOV is hidden. Write to pre-sriov header is ignored so no need to fix writing. Tracked-On: #5085 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>	2020-07-27 11:10:36 +08:00
Victor Sun	62c87856ce	HV: remove deprecated old layout configuration source The old layout configuration source which located in: hypervisor/arch/x86/configs/ is abandoned, remove it; Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	a57a4fd7fb	HV: Make: enable build for new configs layout The make command is same as old configs layout: under acrn-hypervisor folder: make hypervisor BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] under hypervisor folder: make BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x] if BOARD/SCENARIO parameter is not specified, the default will be: BOARD=nuc7i7dnb SCENARIO=industry Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	e792fa3d3c	HV: nuc7i7dnb example of new VM configurations layout There are 3 kinds of configurations in ACRN hypervisor source code: hypervisor overall setting, per-board setting and scenario specific per-VM setting. Currently Kconfig acts as hypervisor overall setting and its source is located at "hypervisor/arch/x86/configs/$(BOARD).config"; Per-board configs are located at "hypervisor/arch/x86/configs/$(BOARD)" folder; scenario specific per-VM configs are located at "hypervisor/scenarios/$(SCENARIO)" folder. This layout brings issues that board configs and VM configs are coupled tightly. The board specific Kconfig file and misc_cfg.h are shared by all scenarios, and scenario specific pci_dev.c is shared by all boards. So the user has no way to build hypervisor binary for different scenario on different board with one source code repo. The patch will setup a new VM configurations layout as below: misc/vm_configs ├── boards --> folder of supported boards │ ├── <board_1> --> scenario-irrelevant board configs │ │ ├── board.c --> C file of board configs │ │ ├── board_info.h --> H file of board info │ │ ├── pci_devices.h --> pBDF of PCI devices │ │ └── platform_acpi_info.h --> native ACPI info │ ├── <board_2> │ ├── <board_3> │ └── <board...> └── scenarios --> folder of supported scenarios ├── <scenario_1> --> scenario specific VM configs │ ├── <board_1> --> board specific VM configs for <scenario_1> │ │ ├── <board_1>.config --> Kconfig for specific scenario on specific board │ │ ├── misc_cfg.h --> H file of board specific VM configs │ │ ├── pci_dev.c --> board specific VM pci devices list │ │ └── vbar_base.h --> vBAR base info of VM PT pci devices │ ├── <board_2> │ ├── <board_3> │ ├── <board...> │ ├── vm_configurations.c --> C file of scenario specific VM configs │ └── vm_configurations.h --> H file of scenario specific VM configs ├── <scenario_2> ├── <scenario_3> └── <scenario...> The new layout would decouple board configs and VM configs completely: The boards folder stores kinds of supported boards info, each board folder stores scenario-irrelevant board configs only, which could be totally got from a physical platform and works for all scenarios; The scenarios folder stores VM configs of kinds of working scenario. In each scenario folder, besides the generic scenario specific VM configs, the board specific VM configs would be put in an embedded board folder. In new layout, all configs files will be removed out of hypervisor folder and moved to a separate folder. This would make hypervisor LoC calculation more precise with below formula: typical LoC = LoC(hypervisor) + LoC(one vm_configs) which LoC(one vm_configs) = LoC(misc/vm_configs/boards/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/<board>) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.c) + LoC(misc/vm_configs/scenarios/<scenario>/vm_configurations.h) Tracked-On: #5077 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-24 16:16:06 +08:00
Victor Sun	8bcab8e294	HV: add VM uuid and type for pre-launched RTVM add VM UUID and CONFIG_XX_VM() api for pre-launched RTVM; Tracked-On: #5081 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-07-23 21:58:32 +08:00
Qian Wang	5d8f5023d0	HV: inject physical PCIEXBAR to SOS vhostbridge hv: vpci: inject physical PCIEXBAR to SOS vhostbridge in order to fully emulate a full host bridge following HW spec The vhostbridge we emulate currently is a "Celeron N3350/ Pentium N4200/Atom E3900 Series Host Bridge", which is of Apollo Lake SoC, but the emulation is incomplete, and we need to implement a full vhostbridge following HW spec. This is a step-by-step process, and in this patch we fix the simulation of PCIEXBAR register (0x60) and thus solve bug #6464. -------#6464: SOS cannot make use of ECAM--------------- Generally, SOS will check the MMIO Base Addr in ACPI MCFG table to confirm it is a reserved memory area. There will be 3 methods to check: 1. Via E820 table 2. Via EFI runtime service 3. To check with the value in PCIEXBAR(0x60) of hostbridge For SOS, method 2 is not feasible since no EFI runtime service is available for SOS. And on newer platform like EHL/TGL, its BIOS somehow doesn't reserve it in native E820, thus SOS will try to use method 3 to verify, so we should inject physical ECAM to vhostbridge, otherwise all 3 methods will fail, and SOS will not make use of ECAM, which results in SOS being unable to use PCIe Extended Capabilities like SR-IOV. ------------------------------------------------------- TODO: 1. In the future, we may add one or more virtual hostbridges for CPUs that are incompatible in layout with the current one, according to HW specs 2. Besides PCIEXBAR(0x60), there are also some registers that need to be emulated more precisely rather than be treated as read-only and hard-coded, will be fixed in future patches. Tracked-On: #5056 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Jason Chen <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-23 20:20:13 +08:00
Qian Wang	ff10c25ae9	HV: refine init_vhostbridge to be dword-aligned hv: vpci: refine init_vhostbridge to be dword-aligned Refine the hard-coded non-dword-aligned sentences in init_vhostbridge to be dword-aligned to simplify the initialization operation Tracked-On: #5056 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Jason Chen <jason.cj.chen@intel.com> Reviewed-by: Li Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-23 20:20:13 +08:00
Shuo A Liu	112f02851c	hv: Disable XSAVE-managed CET state of guest VM To hide CET feature from guest VM completely, the MSR IA32_MSR_XSS also needs to be intercepted because it comprises CET_U and CET_S feature bits of xsaves/xrstors operations. Mask these two bits in IA32_MSR_XSS writing. With IA32_MSR_XSS interception, member 'xss' of 'struct ext_context' can be removed because it is duplicated with the MSR store array 'vcpu->arch.guest_msrs[]'. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Shuo A Liu	ac598b0856	hv: Hide CET feature from guest VM Return-oriented programming (ROP), and similarly CALL/JMP-oriented programming (COP/JOP), have been the prevalent attack methodologies for stealth exploit writers targeting vulnerabilities in programs. CET (Control-flow Enforcement Technology) provides the following capabilities to defend against ROP/COP/JOP style control-flow subversion attacks: * Shadow stack: Return address protection to defend against ROP. * Indirect branch tracking: Free branch protection to defend against COP/JOP The full support of CET for Linux kernel has not been merged yet. As the first stage, hide CET from guest VM. Tracked-On: #5074 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-07-23 20:15:57 +08:00
Li Fei1	5e605e0daf	hv: vmcall: check vm id in dispatch_sos_hypercall Check whether vm_id is valid in dispatch_sos_hypercall Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	1859727abc	hv: vapci: add tpm2 support for pre-launched vm On WHL platform, we need to pass through TPM to Secure pre-launched VM. In order to do this, we need to add TPM2 ACPI Table and add TPM DSDT ACPI table to include the _CRS. Now we only support the TPM 2.0 device (TPM 1.2 device is not support). Besides, the TPM must use Start Method 7 (Uses the Command Response Buffer Interface) to notify the TPM 2.0 device that a command is available for processing. Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	7971f34344	hv: vapci: refine acpi table header initialization Using ACPI_TABLE_HEADER MACRO to initial the ACPI Table Header. Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	acc69007e2	hv: mmio_dev: add mmio device pass through support Add mmio device pass through support for pre-launched VM. When we pass through a MMIO device to pre-launched VM, we would remove its resource from the SOS. Now these resources only include the MMIO regions. Tracked-On: #5053 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Li Fei1	baf77a79ad	hv: mmio_dev: add hypercall to support mmio device pass through Add two hypercalls to support MMIO device pass through for post-launched VM. And when we support MMIO pass through for pre-launched VM, we could re-use the code in mmio_dev.c Tracked-On: #5053 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-23 20:13:20 +08:00
Conghui Chen	821c65b40c	hv: fix possible SSE region mismatch issue During context switch in hypervisor, xsave/xrstor are used to save/restore the XSAVE area according to the XCR0 and XSS. The legacy region in XSAVE area includes FPU and SSE, we should make sure the legacy region is saved during context switch. FPU in XCR0 is always enabled according to SDM. For SSE, we enable it in XCR0 during context switch. Tracked-On: #5062 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 14:19:21 +08:00
Conghui Chen	53d4a7169b	hv: remove kick_thread from scheduler module kick_thread function is only used by kick_vcpu to kick vcpu out of non-root mode, the implementation in it is sending IPI to target CPU if target obj is running and target PCPU is not current one; while for runnable obj, it will just make reschedule request. So kick_thread does not actually belong to the scheduler module, we can drop it and just do the cpu notification in kick_vcpu. Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Conghui Chen	b6422f8985	hv: remove 'running' from vcpu structure vcpu->running is duplicated with THREAD_STS_RUNNING status of thread object. Introduce an API sleep_thread_sync(), which can utilize the inner status of thread object, to do the sync sleep for zombie_vcpu(). Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Conghui Chen	2abbb99f6a	hv: make thread status more accurate 1. Update thread status after switch_in/switch_out. 2. Add 'be_blocking' to represent the intermediate state during sleep_thread and switch_out. After switch_out, the thread status update to THREAD_STS_BLOCKED. Tracked-On: #5057 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-22 13:38:41 +08:00
Mingqiang Chi	aa89eb3541	hv:add per-vm lock for vm & vcpu state change -- replace global hypercall lock with per-vm lock -- add spinlock protection for vm & vcpu state change v1-->v2: change get_vm_lock/put_vm_lock parameter from vm_id to vm move lock obtain before vm state check move all lock from vmcall.c to hypercall.c Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-20 11:22:17 +08:00
yuhong.tao@intel.com	6992d00e45	HV: ptdev hide SRIOV capability for VM Hide sriov capability of passthrough devices for VMs at init_vdev_pt(). And for post-launched VM, allow assign PF. Tracked-On: #5041 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-16 17:27:18 +08:00
yuhong.tao@intel.com	eb36337622	HV: vdev passthrough hiding SRIOV Support hiding the SRIOV extended capability for passthrough devices Tracked-On: #5041 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-07-16 17:27:18 +08:00
Yin Fengwei	fcec5a94be	kconfig: extend the max msix table number to 64 There are some devices (like Samsung NVMe SSD SM981/PM981 which has 33 MSIX tables) which have more than 16 MSIX tables. Extend the default value to 64 to handle them. Tracked-On: #4994 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-07-10 19:39:11 +08:00
Li Fei1	80c7da8f1c	hv: vioapic: expose ioapic to guest unconditionally Some OSes assume the platform must have the IOAPIC. For example: Linux Kernel allocates IRQ force from GSI (0 if there's no PIC and IOAPIC) on x86. And it thinks IRQ 0 is an architecture special IRQ, not for device driver. As a result, the device driver may go wrong if the allocated IRQ is 0 for RTVM. This patch exposes vIOAPIC to RTVM with LAPIC passthru even though the RTVM can't use IOAPIC, it serves as a placeholder to fulfill the guest assumption. After vIOAPIC has been exposed to guest unconditionally, the 'ready' field could be removed since we do vIOAPIC initialization for each guest. Tracked-On: #4691 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-07-10 19:33:46 +08:00
Mingqiang Chi	b1357cdc0d	hv:use spinlock_irqsave_obtain api for uart replace spinlock_obtain/spinlock_release with spinlock_irqsave_obtain and spinlock_irqrestore_release to avoid dead lock for uart module. this uart lock may be accessed in ISR context like this path: dispatch_interrupt->pr_err/pr_xxx or printf ->console_write->uart16550_puts Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-07-03 17:41:17 +08:00
Li Fei1	41805eb2e8	hv: vpci: minor refine about MSI/MSI-X de-initialization About the MSI/MSI-X Capability, there're some fields of it would never been changed once they had been initialized. So it's no need to reset them once the vdev instance is still used. What need to reset are the fields which would been changed by guest at runtime. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-07-02 13:03:36 +08:00
Mingqiang Chi	3b120807c9	hv:rename vioapic.mtx to vioapic.lock rename vioapic.mtx to vioapic.lock Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-07-02 09:40:29 +08:00
Mingqiang Chi	7751c7933d	hv:unify spin_lock initialization will follow this convention for spin lock initialization: -- for simple global variable locks, use this style: static spinlock_t xxx_spinlock = {.head = 0U, .tail = 0U,} -- for the locks inside a data structure, need to call spinlock_init to initialize. Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-07-02 09:40:29 +08:00
Mingqiang Chi	7b32fce06f	hv:use spinlock_irqsave_obtain api for vpic replace spinlock_obtain/spinlock_release with spinlock_irqsave_obtain and spinlock_irqrestore_release to avoid dead lock for vpic module. this vpic lock may be accessed in ISR context like this path: dispatch_interrupt->do_softirq->softirq_handlers ->ptirq_softirq->ptirq_handle_intx->vpic_set_irqline Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-07-02 09:40:29 +08:00
Qian Wang	aee4515ff0	HV: restrict conditions to assign/deassign pcidev hv: hypercall: restrict the condition to assign/deassign a pci device to a post-launched VM for safety For the safety of post-launched VMs, pci devices assignments should occur only when VM is being created (at VM_CREATED STATUS), and pci devices de-assignment should occur only when VM is being created or shutdown/reset (at VM_CREATED or VM_PAUSED status) Tracked-On: #4995 Acked-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Li Fei <Fei1.Li@intel.com> Signed-off-by: Wang Qian <qian1.wang@intel.com>	2020-07-01 16:19:05 +08:00
Shuo A Liu	2276f1c43d	hv: Change to a permissive check with broken DMAR table From the VT-d spec 8.3: If a DRHD structure with INCLUDE_PCI_ALL flag Set is reported for a Segment, it must be enumerated by BIOS after all other DRHD structures for the same Segment. However, some broken BIOS violate the rules. To bring up ACRN with them, change the ASSERT to a permissive check to unblock the BIOS limitation. Also, scan the DRHD list to find the one who has INCLUDE_PCI_ALL flag. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2020-06-30 13:59:38 +08:00
Shuo A Liu	03e18ec492	hv: Refine dmar parsing code Replace dmar_iterate_tbl() by a direct for loop. Handle the dmar_unit_cnt and handle_one_drhd() of each DRHD in the direct for loop. Also tune some function definitions to save LOC. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2020-06-30 13:59:38 +08:00
Shuo A Liu	025df6d44c	hv: use SELF IPI Register for self IPI in X2APIC mode According to SDM 10.12.11, we can know this register is dedicated to the purpose of sending self-IPIs with the intent of enabling a highly optimized path for sending self-IPIs. Also sending the IPI via the Self Interrupt Register ensures that interrupt is delivered to the processor core. Specifically completion of the WRMSR instruction to the SELF IPI register implies that the interrupt has been logged into the IRR. Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-28 10:33:22 +08:00
Shuo A Liu	0397cb7174	hv: Fix the interrupts lost issue with PI support Currently, not all platforms support posted interrupt processing of both VT-x and VT-d. On EHL, VT-d doesn't support posted interrupt processing. So in such scenario, is_pi_capable() in vcpu_handle_pi_notification() will bypass the PIR pending bits check which might cause a self-NV-IPI lost. With commit "bf1ff8c98 (hv: Offload syncing PIR to vIRR to processor hardware)", the syncing PIR to vIRR is postponed and it is handled by a self-NV-IPI in the following VMEnter. The process looks like, a) vcpu A accepts a virtual interrupt -> 1) ACRN_REQUEST_EVENT is set 2) corresponding bit in PIR is set 3) Posted Interrupt ON bit is set b) vcpu A does virtual interrupt injection on resume path due to the pending ACRN_REQUEST_EVENT -> 1) hypervisor disables host interrupt 2) ACRN_REQUEST_EVENT is cleared 3) a self-NV-IPI is sent via ICR of LAPIC. 4) IRR bit of the self-NV-IPI is set c) (VM-ENTRY) vcpu A returns into non-root mode 1) host interrupt enable(by HW) 2) posted interrupt processing clears the ON bit, sync PIR to vIRR 3) deliver the virtual interrupt if guest rflags.IF=1 d) (VM-EXIT) vcpu A traps due to an instruction execution (e.g. HLT) 1) host interrupt disable(by HW) 2) hypervisor enable host interrupt Above illustrates a normal process of the virtual interrupt injection with cpu PI support. However, a failing case is observed. The failing case is that the self-NV-IPI from b-3 is not accepted by the core until a timing between d-1 and d-2. b-4 happening between d-1 and d-2 is observed by debug trace. So the self-NV-IPI will be handled in root-mode which cannot do the syncing PIR to vIRR processing. Due to the bug described in the first paragraph, vcpu_handle_pi_notification() cannot succeed the virtual interrupt injection request. This patch fixes it by removing the wrong check in vcpu_handle_pi_notification() because vcpu_handle_pi_notification() only happens on platform with cpu PI support. Here are some cost data for sending IPI via LAPIC ICR register. Normally, the cycles between ICR write and IRR got set is 140~260, which is not accurate due to the MSR read overhead. And from b-3 to c is about 560 cycles. So b-4 happens during this period. But in bad case, b-4 doesn't happen even c is triggered. The worst case I captured is that ICR write and IRR got set costs more than 1900 cycles. Now, the best GUESS of the huge cost of IPI via ICR is the APIC bus arbitration(refer to SDM 10.6.3, 10.7 and Figure 10-17). Tracked-On: #4937 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-28 10:33:22 +08:00
Yin Fengwei	ef5c1b5481	Build: disable zero length array warning We hit following build error when using gcc10: arch/x86/page.c:240:48: error: array subscript is outside array bounds of 'struct page[0][1]' [-Werror=array-bounds] It happens with gcc10 on different Linux distributions. Regarding the case that ACRN depends on zero length array in several places, we disable the zero length array warning by gcc option. Tracked-On: #4810 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2020-06-23 08:39:34 +08:00
Li Fei1	da7c2ba3e9	hv: ept: wrap a function to do guest ept flush Wrap a function to do guest ept flush. This function doesn't do real EPT flush. It just make the EPT flush request and do the real flush just before vcpu vmenter. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-22 16:25:03 +08:00
Li Fei1	82f9233d4a	hv: vpci: a minor fix about is_zombie_vf Now we check whether a device is zombie by the ->user != NULL. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-21 12:07:15 +08:00
Mingqiang Chi	1b84741a56	rename vm_lock/vlapic_state in VM structure rename: vlapic_state-->vlapic_mode vm_lock --> vlapic_mode_lock check_vm_vlapic_state --> check_vm_vlapic_mode Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	0bd6555cab	remove pci_device_lock in pci.c -- remove unnecessary lock in pci_mmcfg_read_cfg and pci_mmcfg_write_cfg since the mmio operation is atomic if the offset is aligned with 1/2/4 bytes. -- move pci_is_valid_access to pci.h Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	d808031a04	remove spin lock for micro code update remove spin lock for micro code update since the guest operating system will take lock Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	ac65898f35	cleanup spin lock in vtd.c move dm_unit->lock into dmar_issue_qi_request Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	67a7c355ec	cleanup spin lock in irq.c -- move exception_lock to dump.c -- optimize the lock usage in request_irq Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
Mingqiang Chi	d0a4052518	remove dead code in io.h remove these APIs: set64 set32 set16 set8 Tracked-On: #4958 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-06-19 16:13:20 +08:00
yuhong.tao@intel.com	aff687896b	HV: Fix split-locked access detection is disabled by default The commit 'HV: Config Splitlock Detection to be disable' allows using CONFIG_ENFORCE_TURNOFF_AC to turn off splitlock #AC. If CONFIG_ENFORCE_TURNOFF_AC is not set, splitlock #AC should be turned on Tracked-On: #4962 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>	2020-06-19 09:22:58 +08:00
Conghui Chen	2a4c59db74	hv: add check for BASIC VMX INFORMATION Check bit 48 in IA32_VMX_BASIC MSR, if it is 1, return error, as we only support Intel 64 architecture. SDM: Appendix A.1 BASIC VMX INFORMATION Bit 48 indicates the width of the physical addresses that may be used for the VMXON region, each VMCS, and data structures referenced by pointers in a VMCS (I/O bitmaps, virtual-APIC page, MSR areas for VMX transitions). If the bit is 0, these addresses are limited to the processor’s physical-address width. If the bit is 1, these addresses are limited to 32 bits. This bit is always 0 for processors that support Intel 64 architecture. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	906284eec8	hv: remove unnecessary debug symbols remove unnecessary debug symbols. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	f4292752b0	hv: remove check for OSXSAVE in host We always assume the physical platform has XSAVE, and we always enable XSAVE at the beginning, so there is no need to check OSXSAVE in host. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Conghui Chen	53f74f18ac	hv: remove repeated assignment remove repeated assignment for vmcs_pa. Tracked-On: #4956 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2020-06-18 14:05:56 +08:00
Victor Sun	ca9e98cc74	HV: add board and scenario info in log As build variants for different boards and different scenarios grow, users might make mistakes on HV binary distributions. Checking board/scenario info from log would be the fastest way to know whether the binary matches. Also it would be of benefit to developers for confirming the correct binary they are debugging. Tracked-On: #4946 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-18 13:05:42 +08:00
Qian Wang	882c9d5d76	HV: refine pci_find_vdev with hash hv: pci: refine pci_find_vdev with hash 1. Refined pci_find_vdev with BDF-hashing for better performance Tracked-On: #4857 Signed-off-by: Wang Qian <qian1.wang@intel.com> Reviewed-by: Li Fei <Fei1.Li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-18 12:58:40 +08:00
Qian Wang	8fb8d81935	HV: refine pci_lookup_drhd_for_pbdf with hash hv: pci: refine pci_lookup_drhd_for_pbdf with hash 1. Added an auxiliary function pci_find_pdev using hash to find pdev with pbdf, thus pci_lookup_drhd_for_pbdf will have a better performance Tracked-On: #4857 Signed-off-by: Wang Qian <qian1.wang@intel.com> Reviewed-by: Li Fei <Fei1.Li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-18 12:58:40 +08:00
Qian Wang	f58bf1f03f	HV: rename pci_pdev_array to pci_pdevs hv: pci: rename pci_pdev_array to pci_pdevs to make it clearer Tracked-On: #4857 Signed-off-by: Wang Qian <qian1.wang@intel.com> Reviewed-by: Li Fei <Fei1.Li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-18 12:58:40 +08:00
Binbin Wu	6be27cdcab	hv: vmsi: add vmsix on msi emulation support Some passthrough devices require multiple MSI vectors, but don't support MSI-X. In the meanwhile, Linux kernel doesn't support continuous vector allocation. On native platform, this issue can be mitigated by IOMMU via interrupt remapping. However, on ACRN, there is no vIOMMU. vMSI-X on MSI emulation is one solution to mitigate this problem on ACRN. This patch adds MSI-X emulation on MSI capability. For the device that needs MSI-X emulation, HV will hide MSI capability and present MSI-X capability to guest. The guest driver may need to be modified to request MSI-X vectors. For example: ret = pci_alloc_irq_vectors(pdev, 1, STMMAC_MSI_VEC_MAX, - PCI_IRQ_MSI); + PCI_IRQ_MSI \| PCI_IRQ_MSIX); To enable MSI-X emulation, the device should: - 1. The device should be in vmsix_on_msi_devs array. - 2. Support MSI, but don't support MSI-X. - 3. MSI capability should support per-vector mask. - 4. The device should have an unused BAR. - 5. The device driver should not rely on PBA for functionality. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	da1788c9a3	hv: vtd: add an API to reserve continuous irtes dmar_reserve_irte is added to reserve N continuous IRTEs. N could be 1, 2, 4, 8, 16, or 32. The reserved IRTEs will not be freed. Tracked-On:#4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	7bfcc673a6	hv: ptirq: associate an irte with ptirq_remapping_info entry For a ptirq_remapping_info entry, when build IRTE: - If the caller provides a valid IRTE, use the IRTE - If the caller doesn't provide a valid IRTE, allocate an IRTE when the entry doesn't have a valid IRTE, in this case, the IRTE will be freed when free the entry. Tracked-On:#4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	2fe4280cfa	hv: vtd: add two parameters for dmar_assign_irte idx_in: - If the caller of dmar_assign_irte passes a valid IRTE index, it will be reused; - If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index, the function will allocate a new IRTE. idx_out: This parameter returns the actual index of IRTE used. The caller needs to check whether the return value is valid or not. Also this patch adds an internal function alloc_irte. The function takes count as input parameter to allocate continuous IRTEs. The count can only be 1, 2, 4, 8, 16 or 32. This is prepared for multiple MSI vector support. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-16 08:52:56 +08:00
Binbin Wu	9c52b353fa	hv: kconfig: add a range for MAX_IR_ENTRIES Script only append 'U' for the config of int with a range. Add a range to MAX_IR_ENTRIES. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-06-16 08:52:56 +08:00
Li Fei1	65e4a16e6a	hv: mmu: release 1GB cpu side support constraint Some platforms still don't support 1GB large pages on the CPU side, such as lakefield, TNT and EHL platforms, which have a silicon bug so that the CPU doesn't support 1GB large pages. This patch tries to release this constraint to support more hardware platforms. Note this patch doesn't release the constraint on the IOMMU side. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Li Fei1	6e57553015	Revert "hv: Let trampoline execution use 1GB pages" This patch tries to release the hardware platform 1GB large page support constraint on the CPU side. There is a silicon bug on lakefield, TNT and EHL platforms which means the CPU can't support 1GB large pages. As a result, the assumption that a platform which ACRN supports must support 1GB large pages on both the CPU side and the VTD side is not true any more. This reverts commit `f01aad7e` to let trampoline execution use 2MB pages. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-15 15:16:34 +08:00
Binbin Wu	c907a820df	hv: config: add msix emulation support The information needed to enable MSI-x emulation. Only enable MSI-x emulation for the devices in msix_emul_devs array. Currently, only EHL has the need to enable MSI-x emulation for TSN devices. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-10 14:32:15 +08:00
Victor Sun	bdaa2a58df	HV: correct mmap info for multiboot2 The acrn_mbi.mi_mmap_va should point to struct multiboot2_mmap_entry when boot from multiboot2, which is different from struct multiboot_mmap when boot from multiboot1. So we should handle mmap info separately for multiboot2. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-06-10 10:50:41 +08:00
Victor Sun	db690e0967	HV: enable multiboot module string as kernel bootargs Previously the VM kernel bootargs for pre-launched VMs and direct boot mode of SOS VM are built-in hypervisor binary so end users have no way to change it. Now we provide another option that the multiboot module string could be used as bootargs also. This would bring convenience to end users when they use GRUB as bootloader because the bootargs could be configurable in GRUB menu. The usage is if there is any string follows configured kernel_mod_tag in module string, the string will be used as new kernel bootargs instead of built-in kernel bootargs. If there is no string follows kernel_mod_tag, then the built-in bootargs will be the default kernel bootargs. Please note kernel_mod_tag must be the first word in module string in any case, it is used to specify the module for which VM. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	80262f0602	HV: rename append_seed_arg to fill_seed_arg Previously append_seed_arg() just do fill in seed arg to dest cmd buffer, so rename the api name to fill_seed_arg(). Since fill_seed_arg() will be called in SOS VM path only, the param of bool vm_is_sos is not needed and will be replaced by dest buffer size. The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero, so memset() is not needed in fill_seed_arg(), buffer pointer check and strncpy_s() are not needed also. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	47d20f37e1	HV: replace merge_cmdline api with strncat_s Add a standard string api strncat_s() to replace merge_cmdline() to make code more readable. Another change is that the multiboot cmdline will be appended to the end of configured SOS bootargs instead of the beginning, this would enable a feature that some kernel cmdline parameter items could be overridden by multiboot cmdline since the later one would win if same parameters configured in kernel cmdline. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	bad12039c6	HV: rewrite strncpy_s to be iso c11 compliant Per C11 standard (ISO/IEC 9899:2011): K.3.7.1.4 1. Copying shall not take place between objects that overlap; 2. If there is a runtime-constraint violation, the strncpy_s function sets s1[0] to '\0'; 3. The strncpy_s function returns zero if there was no runtime-constraint violation. Otherwise, a nonzero value is returned. 4. The function is implemented with memcpy_s() because the runtime-constraint detection is almost same. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	e254be150a	HV: rewrite memcpy_s to be iso c11 compliant Per C11 standard (ISO/IEC 9899:2011): K.3.7.1.1 1. Copying shall not take place between objects that overlap; 2. If there is a runtime-constraint violation, the memcpy_s function stores zeros in the first s1max characters of the object; 3. The memcpy_s function returns zero if there was no runtime-constraint violation. Otherwise, a nonzero value is returned. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	45d1f38a5b	HV: add hv cmdline support for multiboot2 The multiboot2 cmdline would be used as hypervisor cmdline, add parse logic for the case that hypervisor boot from multiboot2 protocol. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Victor Sun	c74b1941a0	HV: split sanitize_multiboot_info api Previously sanitize_multiboot_info() was called after init_debug_pre() because the debug message can only print after uart is initialized. On the other hand, multiboot cmdline need to be parsed before init_debug_pre() because the cmdline could override uart settings and make sure debug message printed successfully. This cause multiboot info was parsed in two stages. The patch revise the multiboot parse logic that split sanitize_multiboot_info() api and use init_acrn_multiboot_info() api for the early stage. The most of multiboot info will be initialized during this stage and no debug message need to be printed. After uart is initialized, the sanitize_multiboot_info() would do sanitize multiboot info and print needed debug messages. Tracked-On: #4885 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-08 13:30:04 +08:00
Li Fei1	c4b5af9663	hv: pci: remove some unnecessary functions We define some functions to read some fields of the CFG header registers. We could remove them since they're not necessary since calling pci_pdev_read_cfg is simple. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-06-05 16:53:33 +08:00
Naveen Saini	b919122c34	use variables for installation directories. Don't hardcode install paths. Instead of hardcoding where binaries are installed, add variables that installer can override. Tracked-On: #4864 Signed-off-by: Chee Yang Lee <chee.yang.lee@intel.com> Signed-off-by: Naveen Saini <naveen.kumar.saini@intel.com>	2020-06-05 15:25:12 +08:00
Binbin Wu	65ec6f3f3b	hv: vtd: fix potential dead loop if qi request timeout Fix potential dead loop if qi request timeout. Tracked-On: #4680 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-06-05 05:31:16 +08:00
Victor Sun	a6e552b7b5	Makefile: minor fix on hypervisor dependency The $(VERSION) should be depended on config.h change. For example, when RELEASE parameter is changed in make command, CONFIG_RELEASE need to be updated in defconfig file, and then message in version.h should be updated. The patch also fixes a bug that a code path in make defconfig never be triggered because shell will treat [ ! -f $(KCONFIG_FILE) ] as false when $(KCONFIG_FILE) is not specified. (i.e. "$(KCONFIG_FILE)" == "") Tracked-On: #2412 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-06-04 15:03:43 +08:00
Wei Liu	fbb1dfa264	HV/Kconfig: update efi bootloader image file path for Kconfig Update efi bootloader image file path for Yocto rootfs in Kconfig. Tracked-On: #4868 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Reviewed-by: Victor Sun <victor.sun@intel.com>	2020-06-03 22:02:58 +08:00
Li Fei1	ae4fa40adc	hv: vpci: refine pci device assignment logic Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config. So for UOS, we always emulate Host Bridge and PCI Bridge for it and assign PCI device to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI Bridge to it directly, otherwise, we will emulate them same as UOS. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	b8f151a55f	hv: pci: check whether a PCI device is host bridge or not by class According PCI Code and ID Assignment Specification Revision 1.11, a PCI device whose Base Class is 06h and Sub-Class is 00h is a Host bridge. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
Li Fei1	0bd2daf1c5	hv: pci: remove host bridge BDF definition We should check whether a PCI device is host bridge or not by Base Class (06h) and Sub-Class (00h). Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-06-03 22:00:43 +08:00
wenlingz	0f6d11866b	Modify Makefile to add acrn.bin after install Tracked-On: #4842 Signed-off-by: wenlingz <wenling.zhang@intel.com>	2020-05-29 10:31:16 +08:00
Li Fei1	ce3451827a	hv: vpci: add vmsix capability registers rw permission control Guest may write a MSI-X capability register with only RW bits setting on. This works well on native since the hardware will make sure RO register bits could not over-write. However, the software needs more efforts to achieve this. This patch does this by defining a RW permission mapping base on bits. When a guest tries to write a MSI-X Capability register, only modify the RW bits on vCFG space. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-28 13:44:18 +08:00
Li Fei1	ea0ba47b02	hv: vpci: add vmsi capability registers rw permission control Guest may write a MSI capability register with only RW bits setting on. This works well on native since the hardware will make sure RO register bits could not over-write. However, the software needs more efforts to achieve this. This patch does this by defining a RW permission mapping base on bits. When a guest tries to write a MSI Capability register, only modify the RW bits on vCFG space. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-28 13:44:18 +08:00
Vijay Dhanraj	d03df0c7e2	HV: Fix MP Init sequence hang by adding a delay As per the BWG a delay should be provided between the INIT IPI and Startup IPI. Without the delay, hangs are observed on certain platforms during the MP Init sequence. So set a delay of 10us between assert INIT IPI and Startup IPI. Also, as per SDM section 10.7 the de-assert INIT IPI is only used for Pentium and P6 processors. This is not applicable for Pentium4 and Xeon processors so removing this sequence. Tracked-On: #4835 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 13:34:59 +08:00
Minggui Cao	8c090c71ca	HV: fix bug to clear guest flags after it not used in shutdown_vm, it uses guest flags when handling the physical CPUs whose LAPIC is pass-through. So if it is cleared first, the related vCPUs and pCPUs can not be switched to correct state. so move the clear action after the flags used. Tracked-On: #4848 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2020-05-27 11:35:47 +08:00
Binbin Wu	454bb14348	hv: vtd: remove some unnecessary check 1. context_entry couldn't be NULL in iommu_attach_device since bus number is checked before the call. 2. root_entry couldn't be NULL in iommu_detach_device since bus number is checked before the call. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	e9901b3edd	hv: vtd: add a function to check valid of dmar unit Add a function dmar_unit_valid to check if the input dmar unit is valid or not. A valid dmar_unit should not be NULL, or ignore flag should not be set. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	3009d9399f	hv: vtd: cleanup snoop control related code Snoop control will not be turned on by hypervisor, delete snoop control related code. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Binbin Wu	a94c3ef763	hv: vtd: init DMAR/IR table address when register Initialize root_table_addr/ir_table_addr of dmar unit when register the dmar unit. So no need to check if they are initialized or not later. Tracked-On: #4831 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-27 11:27:42 +08:00
Minggui Cao	564984570c	HV: explicitly init lock variable before using it 1. though "pci_device_lock" & "logmsg_ctl.lock" are set to 0 when system does memory initialization, it is better to explicitly init them before using. 2. unify the usage of spinlock_init Tracked-On: #4827 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-26 10:26:59 +08:00
Binbin Wu	6c05af8ded	hv: ptirq : fix a bug in ptirq_release_entry The mask value 0x3F was added to prevent out of range in array access. However, it should not be hardcoded. Since in ptirq_alloc_entry_id, the valid allocated id is no greater than CONFIG_MAX_PT_IRQ_ENTRIES, it will not cause out of range array access without mask. So this patch removes the mask. Also, use bitmap_clear_lock instead of bitmap_clear_nolock because there could be the chance that more than 1 core to access a same 64bit var. Tracked-On: #4828 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 15:24:25 +08:00
Shuo A Liu	9a15ea82ee	hv: pause all other vCPUs in same VM when do wbinvd emulation Invalidating the cache by scanning and flushing the whole guest memory is inefficient and might cause long execution time for WBINVD emulation. A long execution in the hypervisor might cause a vCPU-stuck condition that impacts Windows guest booting. This patch introduces a workaround that pauses all other vCPUs in the same VM during WBINVD emulation. Tracked-On: #4703 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 15:21:29 +08:00
Minggui Cao	42d5533e6f	HV: makefile: to avoid duplicated build libs 1. improve makefile to avoid duplicated build libs when make in acrn-hypervisor/hypervisor directory to build HV only. 2. for debug/release library just select one makefile to build Tracked-On: #2412 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com>	2020-05-21 15:12:21 +08:00
Mingqiang Chi	f994b5ffaf	hv:cleanup vcpu state -- remove VCPU_PAUSED and resume_vcpu -- remove vcpu->prev_state in vcpu structure -- rename pause_vcpu to zombie_vcpu Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-05-21 15:08:49 +08:00
Shuo A Liu	8287cfac6c	hv: debug: reboot directly when issue 'reboot' shell cmd Tracked-On: #4817 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-21 14:57:22 +08:00
Li Fei1	53af096726	hv: ptirq: refine find_ptirq_entry by hashing Refine find_ptirq_entry by hashing instead of walk each of the PTIRQ entries one by one. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-20 16:04:16 +08:00
Yonghua Huang	63c019c6d2	hv: Add 64 bits hash function This patch adds hash function to hash 64bit value. Tracked-On: #4550 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-20 16:04:16 +08:00
Yonghua Huang	3391bffb27	hv:fix rtvm hang with maxcpus=0/1 in bootargs RTVM (with lapic PT) boots hang when maxcpus is assigned a value less than the CPU number configured in hypervisor. In this case, vlapic_state(per VM) is left in TRANSITION state after BSP boot, which blocks interrupts to be injected to this UOS. Tracked-On: #4803 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Li, Fei <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-15 10:09:13 +08:00
Li Fei1	27a66acd0e	hv: ptdev: refine look up MSI ptirq entry There's no need to look up MSI ptirq entry by virtual SID any more since the MSI ptirq entry would be removed before the device is assigned to a VM. Now the logic of MSI interrupt remap could simplify as: 1. Add the MSI interrupt remap first; 2. If step is already done, just do the remap part. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com> Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	27c6f1c007	hv: vpci: remove vpci->vm not equal to null pre-condition In commit `0a7770cb`, we remove vm pointer in vpci structure. So there's no need for such pre-condition since vpci is embedded in vm structure. The vm can't be NULL once the vpci is not NULL. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	f9d26a80ed	hv: vpci: refine vpci deinit The existing code do separately for each VM when we deinit vpci of a VM. This is not necessary. This patch use the common handling for all VMs: we first deassign it from the (current) user, then give it back to its parent user. When we deassign the vdev from the (current) user, we would de-initialize the vMSI/VMSI-X remapping, so does the vMSI/vMSI-X data structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	15e3062631	hv: vpci: remove is_own_device() Now we could know a device status by 'user' field, like --------------------------------------------------------------------------- \| NULL \| == vdev \| != NULL && != vdev vdev->user \| device is de-init \| used by itself VM \| assigned to another VM --------------------------------------------------------------------------- So we don't need to modify 'vpci' field accordingly. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Li Fei1	af8329394b	hv: vpci: minor refine the vdev ownership data structure Add a new field 'parent_user' to record the parent user of the vdev. And refine 'new_owner' to 'user' to record who is the current user of the vdev. Like ----------------------------------------------------------------------------------------------- vdev in \| HV \| pre-VM \| SOS \| post-VM \| \| \|vdev used by SOS\|vdev used by post-VM\| ----------------------------------------------------------------------------------------------- parent_user\| NULL(HV) \| NULL(HV) \| NULL(HV) \| NULL(HV) \| vdev in SOS ----------------------------------------------------------------------------------------------- user \| vdev in HV \| vdev in pre-VM \| vdev in SOS \| vdev in post-VM \| vdev in post-VM ----------------------------------------------------------------------------------------------- Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong<eddie.dong@Intel.com>	2020-05-13 14:31:01 +08:00
Zide Chen	1bc5c7ac5b	hv/acrn-config/efi-stub: assign hvlog and ramoops buffer address < 256MB If HV relocation is enabled, either ACRN efi-stub or GRUB relocates hypervisor image above HPA 256MB, thus we put hvlog and ramoops buffer under 256MB to avoid conflict with hypervisor owned address. This patch hardcodes these addresses: 0xa00000 - 0xdfffff: 4MiB for ramoops buffer 0xe00000 - 0xffffff: 2MiB for hvlog buffer However, user can customize them to other addresses as long as it's under 256MB, available in host e820, and SOS bootarg "nokaslr" is not specified. If HV relocation is disabled, need to make sure that these buffer addresses are not between HV_RAM_START and HV_RAM_START + HV_RAM_SIZE. Tracked-On: #4760 Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-05-13 08:36:54 +08:00
Zide Chen	0a956c34c7	hv: add a new field cpu_affinity in struct acrn_vm For post-launched VMs, the configured CPU affinity could be different from the actual running CPU affinity. This new field acrn_vm->cpu_affinity recognizes this difference so that it's possible that CREATE_VM hypercall won't overwrite the configured CPU affinity. Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity. This is read-only in run time, never overwritten by acrn-dm. Remove vm_config->vcpu_num, which means the number of vCPUs of the configured CPU affinity. This is not to be confused with the actual running vCPU number: vm->hw.created_vcpus. Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less confusion. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 11:04:31 +08:00
Zide Chen	0629c5c8c2	hv/acrn-config: changed name from cpu_affinity_bitmap to cpu_affinity Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-05-08 11:04:31 +08:00
Sainath Grandhi	bf1ff8c98f	hv: Offload syncing PIR to vIRR to processor hardware ACRN syncs PIR to vIRR in the software in cases the Posted Interrupt notification happens while the pCPU is in root mode. Sync can be achieved by processor hardware by sending a posted interrupt notification vector. This patch sends a self-IPI, if there are interrupts pending in PIR, which is serviced by the logical processor at the next VMEnter Tracked-On: #4777 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-05-08 10:01:07 +08:00
Yan, Like	869ccb7ba8	HV: RDT: add CDP support in ACRN CDP is an extension of CAT. It enables isolation and separate prioritization of code and data fetches to the L2 or L3 cache in a software configurable manner, depending on hardware support. This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED". CDP will be enabled if the capability available and "CDP_ENABLED" is selected. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	277c668b04	HV: RDT: clean up RDT code This commit makes some RDT code cleanup, mainly including: - remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build; - rename platform_clos_num to valid_clos_num, which is set as the minimal clos_mas of all enabled RDT resources; - init the platform_clos_array in the res_cap_info[] definition; - remove the unnecessary return values and return value check. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-05-08 08:50:13 +08:00
Yan, Like	f774ee1fba	HV: RDT: merge struct rdt_cache and rdt_membw in to a union A RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw would be used at a time. They should be a union. This commit merge struct rdt_cache and struct rdt_membw in to a union res. Tracked-On: #4604 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-08 08:50:13 +08:00
yuhong.tao@intel.com	b0ae9cfa2b	HV: Config Splitlock Detection to be disabled #AC should be normally enabled for splitlock detection, however, community developers may want to run ACRN on a buggy system. In this case, CONFIG_ENFORCE_TURNOFF_AC can be used to turn off the #AC, to let the guest run without #AC. Tracked-On: #4765 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-05-07 09:35:22 +08:00
Li Fei1	0c6b3e57d6	hv: ptdev: minor refine about ptirq_build_physical_msi The virtual MSI information could be included in ptirq_remapping_info structure, there's no need to pass another input parameter for this purpose. So we could remove the ptirq_msi_info input. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
Li Fei1	73335b7276	hv: ptirq: rename ptirq_lookup_entry_by_sid to find_ptirq_entry We look up PTIRQ entry only by SID. So _by_sid could be removed. And refine function name to verb-obj style. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-05-06 11:51:11 +08:00
Yin Fengwei	68269a559f	gpa2hva: add INVALID_HPA return value check For return value of local_gpa2hpa, either INVALID_HPA or NULL means the EPT walking failure. Current code only takes care of the NULL return and leaves INVALID_HPA as a correct case. In some cases (if guest page table is filled with invalid memory address), it could crash ACRN from guest. Add INVALID_HPA return check as well. Also add @pre assumptions for some gpa2hpa usages. Tracked-On: #4730 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-05-06 11:29:30 +08:00
Wei Liu	624edea6af	HV: assign PCPU0-1 to post vm by default Assign PCPU0-1 to post-launched VM. The CPU affinity can be overridden with the '--cpu_affinity' parameter of acrn-dm. Tracked-On: #4641 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-05-06 11:25:57 +08:00
Wei Liu	22aecf83e0	HV: modify CONFIG_HV_RAM_START for NUC7i7DNB When boot ACRN hypervisor from grub multiboot, HV will be loaded at CONFIG_HV_RAM_START since relocation is not supported in grub multiboot1. The CONFIG_HV_RAM_SIZE in industry scenario will take ~330MB(0x14000000), unfortunately the efi memmap on NUC7i7DNB is truncated at 0x6dba2000 although it is still usable from 0x6dba2000. So from grub point of view, it could not find a continuous memory from 0x6000000 to load industry scenario. Per efi memmap, there is a big memory area available from 0x40400000, so put CONFIG_HV_RAM_START to 0x41000000 is much safe for NUC7i7DNB. Tracked-On: #4641 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-05-06 11:25:57 +08:00
Minggui Cao	691a0e2e56	HV: add a specific stack space used in CPU booting The original stack used in CPU booting is: ld_bss_end + 4KB; which could be out of the RAM size limit defined in link_ram file. So add a specific stack space in link_ram file, and used in CPU booting. Tracked-On: #4738 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-29 13:56:40 +08:00
Li Fei1	4c733708bf	hv: lapic: minor refine about init_lapic According to SDM Vol 3, Chap 10.4.7.2 Local APIC State After It Has Been Software Disabled, The mask bits for all the LVT entries are set when the local APIC has been software disabled. So there's no need to mask all the LVT entries one by one. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-26 10:48:49 +08:00
Li Fei1	067b439e69	hv: irq: minor refine about structure idt_64_descriptor The 'value' field in structure idt_64_descriptor is no one used. We could remove it. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-26 10:48:49 +08:00
Li Fei1	c8618dd2fb	hv: vioapic: minor refine about madt ioapic parse Remove ioapic_parse_madt and do MADT IOAPIC parse in parse_madt_ioapic. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-24 15:35:38 +08:00
Li Fei1	907a0f7c04	hv: vioapic: minor refine about vioapic_init Most code in the if ... else is duplicated. We could put it out of the conditional statement. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-24 15:35:38 +08:00
Zide Chen	3691e305c0	hv: dynamically configure CPU affinity through hypercall - add a new member cpu_affinity to struct acrn_create_vm, so that acrn-dm is able to assign CPU affinity through HC_CREATE_VM hypercall. - if vm_create.cpu_affinity is zero, hypervisor launches the VM with the statically configured CPU affinity. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-23 09:38:54 +08:00
Zide Chen	9150284ca7	hv: replace vcpu_affinity array with cpu_affinity_bitmap Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping. While the new cpu_affinity_bitmap doesn't explicitly specify this mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and so on. This makes it possible for post-launched VM to run vCPUs on a subset of these pCPUs only, and not all of them. acrn-dm may launch post-launched VMs with the current approach: indicate VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked in cpu_affinity_bitmap. Also acrn-dm can choose to launch the VM on a subset of PCPUs that is defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the subset of PCPUs in the CREATE_VM hypercall. Additionally, with this change, a guest's vcpu_num can be easily calculated from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-23 09:38:54 +08:00
Victor Sun	00a38e89c4	Makefile: do not override RELEASE when build with XML SCENARIO XML file has included RELEASE or DEBUG info already, so if RELEASE is not specified in make command, Makefile should not override RELEASE info in SCENARIO XML. If RELEASE is specified in make command, then RELEASE info in SCENARIO XML could be overridden by make command. The patch also fixes an issue with getting the correct board defconfig when building the hypervisor from TARGET_DIR; Tracked-On: #4688 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-22 16:45:43 +08:00
Victor Sun	9264f51456	HV: refine usage of idle=halt in sos cmdline The parameter of "idle=halt" for SOS cmdline is only needed when cpu sharing is enabled, otherwise it will impact SOS power. Tracked-On: #4329 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-22 14:49:04 +08:00
Li Fei1	113f2f1e35	hv: vacpi: add ioapic madt table Add IOAPIC MADT table support so that guest could detect IOAPIC exist. Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Li Fei1	1dccbdbaa2	hv: vapic: add mcfg table support Add MCFG table support to allow guest access PCIe external CFG space by ECAM Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Li Fei1	4eb3f5a0c7	hv: vacpi: add fadt table support Add FADT table support to support guest S5 setting. According to ACPI 6.3 Spec, OSPM must ignore the DSDT and FACS fields if they're zero. However, Linux kernel seems not to abide by the protocol, it will check DSDT still. So add an empty DSDT to meet it. Tracked-On: #4623 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-22 08:42:19 +08:00
Victor Sun	e5561a5c71	HV: remove sdc2 scenario support Remove sdc2 scenario since the VM launch requirement under this scenario could be satisfied by industry scenario now; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Victor Sun	a90890e9c7	HV: support up to 7 post launched VMs for industry scenario In industry scenario, hypervisor will support 1 post-launched RT VM and 1 post-launched kata VM and up to 5 post-launched standard VMs; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Victor Sun	09212cf4b6	HV: Kconfig: enable CPU sharing by default The patch enables CPU sharing feature by default, the default scheduler is set to SCHED_BVT; Tracked-On: #4661 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-20 14:59:23 +08:00
Sainath Grandhi	60c4ec0c59	hv: Wake up vCPU for interrupts from vPIC Wake up vCPUs that are blocked upon interrupts from vPIC. Tracked-On: #4664 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-20 09:49:41 +08:00
Victor Sun	dfb947fe91	HV: fix wrong gpa start of hpa2 in ve820.c The current logic puts hpa2 above GPA 4G always, which is incorrect. Need to set gpa start of hpa2 right after hpa1 when hpa1 size is less than 2G; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 14:08:54 +08:00
Victor Sun	fe6407155f	HV: set default MCFG base for generic board On most board the MCFG base is set to 0xe0000000, so modify this value in platform_acpi_info.h for generic boards; The description of ACPI_PARSE_ENABLED is modified also to match its usage. Tracked-On: #4157 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:49:58 +08:00
Victor Sun	3fd5fc9b51	Kconfig: remove MAX_KATA_VM_NUM CONFIG_MAX_KATA_VM_NUM is a scenario specific configuration, so it is better to put the MACRO in scenario folder directly, to instead the Kconfig item in Kconfig file which should work for all scenarios; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	dba0591f72	Kconfig: change scenario variable type to string Basically, an ACRN scenario is a configuration name for a specific usage. Given a scenario name, ACRN will load the corresponding VM configurations to build the hypervisor. But customers might have their own scenario names; changing the scenario type from choice to string is friendly to them since a Kconfig source file change will not be needed. With this change, CONFIG_$(SCENARIO) will not exist in the kconfig file and will be replaced by CONFIG_SCENARIO, so the Makefile needs to be changed accordingly; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	7282b933fb	HV: merge sos_pci_dev config to sos macro The pci_dev config settings of SOS are same so move the config interface from vm_configurations.c to CONFIG_SOS_VM macro; Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 13:45:18 +08:00
Victor Sun	55b50f408f	HV: init vm uuid and severity in macro Currently the vm uuid and severity are initialized separately in the vm_config struct; the developer needs to take care of both items carefully, otherwise the hypervisor would have trouble with the configurations. Given the vm loader_order/uuid and severity are bound tightly, the patch merges these three settings in one macro so that the developer will have a simple interface to configure in the vm_config struct. Tracked-On: #4616 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 13:45:18 +08:00
yuhong.tao@intel.com	7c80acee95	HV: emulate MSR_TEST_CTL If CPU has MSR_TEST_CTL, show an emulated one to VCPU Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	dd3fa8ed75	HV: enable #AC for Splitlock Access If the CPU supports raising #AC for Splitlock Access, then enable this feature on each CPU. Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
yuhong.tao@intel.com	ea1bce0cbf	HV: enumerate capability of #AC for Splitlock Access When the destination of an atomic memory operation is located in 2 cache lines, it is called a Splitlock Access. The LOCK# bus signal is asserted for a splitlock access, which may lead to long latency. #AC for Splitlock Access is a CPU feature; it allows raising an alignment check exception #AC(0) instead of asserting LOCK#, which is helpful to detect Splitlock Access. This feature is enumerated by MSR(0xcf) IA32_CORE_CAPABILITIES[bit5] Add helper function: bool has_core_cap(uint32_t bitmask) Tracked-On: #4496 Signed-off-by: Tao Yuhong <yuhong.tao@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-17 09:53:59 +08:00
Mingqiang Chi	f90100e382	hv: add pre-condition for vcpu APIs remove unnecessary state check and add pre-condition for vcpu APIs. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Jason Chen CJ	0584981c03	hv:add pre-condition for vm APIs check the vm state in hypercall api, add pre-condition for vm api. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Mingqiang Chi	fe929d0a10	hv: move out pause_vm from shutdown_vm now it will call pause_vm in shutdown_vm, move it out from shutdown_vm to reduce coupling. Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 21:59:03 +08:00
Yonghua Huang	84eaf94ae6	hv: wrap a function to initialize pCPU for second phase This patch wraps a common function to initialize physical CPU for the second phase to reduce redundant code. Tracked-On: #861 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-04-16 14:02:29 +08:00
Zide Chen	5420b34a26	hv: provide vm_config information in get_platform_info hypercall Hypervisor reports VM configuration information to SOS which can be used to dynamically allocate VCPU affinity. Service OS can get the vm_configs in this order: 1. call platform_info HC (set vm_configs_addr with 0) to get max_vms and vm_config_entry_size. 2. allocate memory for acrn_vm_config array based on the number of VMs and entry size that just got in step 1. 3. call platform_info HC again to collect VM configurations. Tracked-On: #4616 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-16 13:46:27 +08:00
Xiaoguang Wu	d4f789f47e	hv: iommu: remove snoop related code ACRN disables Snoop Control in VT-d DMAR engines for simplifying the implementation. Also, since the snoop behavior of PCIE transactions can be controlled by guest drivers, some devices may take the advantage of the NO_SNOOP_ATTRIBUTE of PCIE transactions for better performance when snoop is not needed. No matter ACRN enables or disables Snoop Control, the DMA operations of passthrough devices behave correctly from guests' point of view. This patch is used to clean all the snoop related code. Tracked-On: #4509 Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-16 08:40:17 +08:00
Xiaoguang Wu	b4f1e5aa85	hv: iommu: disable snoop bit in EPT-PTE/SL-PTE Due to the fact that i915 iommu doesn't support snoop, hence it can't access memory when the SNOOP bit of Secondary Level page PTE (SL-PTE) is set, this will cause many undefined issues such as invisible cursor in WaaG etc. Current hv design uses EPT as Secondary Level Page for iommu, and this patch removes the codes of setting SNOOP bit in both EPT-PTE and SL-PTE to avoid errors. And according to SDM 28.2.2, the SNOOP bit (11th bit) will be ignored by EPT, so it will not affect the CPU address translation. Tracked-On: #4509 Signed-off-by: Xiaoguang Wu <xiaoguang.wu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-16 08:40:17 +08:00
Conghui Chen	84ad340898	hv: fix for waag 2 core reboot issue Waag will send NMIs to all its cores during reboot. But currently, NMI cannot be injected to vcpu which is in HLT state. To fix the problem, need to wakeup target vcpu, and inject NMI through interrupt-window. Tracked-On: #4620 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:42:00 +08:00
Binbin Wu	597f7658fc	hv: guest: fix bug in get_vcpu_paging_mode Align the implementation to SDM Vol.3 4.1.1. Also this patch fixed a bug that doesn't check paging status first in some cpu mode. Tracked-On: #4628 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:40:02 +08:00
Mingqiang Chi	3df6d71e08	hv:print relocation delta Currently, if CONFIG_RELOC is enabled, the actual address does not match the MAP file when an exception occurs; this patch prints the delta between the actual load address and CONFIG_HV_RAM_START. Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-04-15 14:34:30 +08:00
Zide Chen	6040d8f6a2	hv: fix SOS vapic_id assignment issue Currently vlapic_build_id() uses vcpu_id to retrieve the lapic_id per_cpu variable: vlapic_id = per_cpu(lapic_id, vcpu->vcpu_id); SOS vcpu_id may not be equal to pcpu_id, and in that case it runs into problems. For example, if any pre-launched VMs are launched on PCPUs whose IDs are smaller than any PCPU IDs that are used by SOS. This patch fixes the issue and simplifies the code to create or get vapic_id by: - assign vapic_id in create_vlapic(), which now takes pcpu_id as input argument, and save it in the new field: vlapic->vapic_id, which will never be changed. - simplify vlapic_get_apicid() by returning the saved vapic_id directly. - remove vlapic_build_id(). - vlapic_init() is only called once, merge it into vlapic_create(). Tracked-On: #4268 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-15 14:34:15 +08:00
dongshen	00ad3863a1	hv: maintain a per-pCPU array of vCPUs and handle posted interrupt IRQs Maintain a per-pCPU array of vCPUs (struct acrn_vcpu *vcpu_array[CONFIG_MAX_VM_NUM]), one VM cannot have multiple vCPUs share one pcpu, so we can utilize this property and use the containing VM's vm_id as the index to the vCPU array: In create_vcpu(), we simply do: per_cpu(vcpu_array, pcpu_id)[vm->vm_id] = vcpu; In offline_vcpu(): per_cpu(vcpu_array, pcpuid_from_vcpu(vcpu))[vcpu->vm->vm_id] = NULL; so basically we use the containing VM's vm_id as the index to the vCPU array, as well as the index of posted interrupt IRQ/vector pair that are assigned to this vCPU: 0: first vCPU and first posted interrupt IRQs/vector pair (POSTED_INTR_IRQ/POSTED_INTR_VECTOR) ... CONFIG_MAX_VM_NUM-1: last vCPU and last posted interrupt IRQs/vector pair ((POSTED_INTR_IRQ + CONFIG_MAX_VM_NUM - 1U)/(POSTED_INTR_VECTOR + CONFIG_MAX_VM_NUM - 1U) In the posted interrupt handler, it will do the following: Translate the IRQ into a zero based index of where the vCPU is located in the vCPU list for current pCPU. Once the vCPU is found, we wake up the waiting thread and record this request as ACRN_REQUEST_EVENT Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com> Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>	2020-04-15 13:47:22 +08:00
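The indexing scheme described in commit 00ad3863a1 (vm_id as the slot in a per-pCPU vCPU array, and IRQ minus the first posted interrupt IRQ as the lookup index) can be sketched as follows. This is an illustrative sketch only: the structure layouts, the plain 2-D array used in place of `per_cpu()`, the helper names and the value of `POSTED_INTR_IRQ` are all assumptions, not the hypervisor's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PCPU_NUM	4U	/* assumed number of physical CPUs */
#define MAX_VM_NUM	8U	/* stands in for CONFIG_MAX_VM_NUM */
#define POSTED_INTR_IRQ	0x100U	/* assumed first posted interrupt IRQ */

/* Hypothetical, minimal stand-ins for the VM and vCPU structures. */
struct vm {
	uint16_t vm_id;
};

struct vcpu {
	struct vm *vm;
	uint16_t pcpu_id;
};

/* One row per pCPU; a VM has at most one vCPU on a given pCPU,
 * so vm_id is a unique column index within each row. */
static struct vcpu *vcpu_array[MAX_PCPU_NUM][MAX_VM_NUM];

static void register_vcpu(struct vcpu *v)
{
	assert((v->pcpu_id < MAX_PCPU_NUM) && (v->vm->vm_id < MAX_VM_NUM));
	vcpu_array[v->pcpu_id][v->vm->vm_id] = v;
}

static void unregister_vcpu(const struct vcpu *v)
{
	assert((v->pcpu_id < MAX_PCPU_NUM) && (v->vm->vm_id < MAX_VM_NUM));
	vcpu_array[v->pcpu_id][v->vm->vm_id] = NULL;
}

/* Translate a posted interrupt IRQ seen on pcpu_id into the vCPU
 * it targets, or NULL if the IRQ is out of range or the slot is empty. */
static struct vcpu *pi_irq_to_vcpu(uint16_t pcpu_id, uint32_t irq)
{
	uint32_t idx;

	if ((pcpu_id >= MAX_PCPU_NUM) || (irq < POSTED_INTR_IRQ)) {
		return NULL;
	}
	idx = irq - POSTED_INTR_IRQ;	/* zero-based index == vm_id */
	return (idx < MAX_VM_NUM) ? vcpu_array[pcpu_id][idx] : NULL;
}
```

Because the index is derived from the IRQ alone, the handler needs no search: each pCPU reserves one IRQ/vector pair per possible VM.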
dongshen	14fa9c563c	hv: define posted interrupt IRQs/vectors This is a preparation patch for adding support for VT-d PI related vCPU scheduling. ACRN does not support vCPU migration, one vCPU always runs on the same pCPU, so PI's ndst is never changed after startup. VCPUs of a VM won’t share same pCPU. So the maximum possible number of VCPUs that can run on a pCPU is CONFIG_MAX_VM_NUM. Allocate unique Activation Notification Vectors (ANV) for each vCPU that belongs to the same pCPU, the ANVs need only be unique within each pCPU, not across all vCPUs. This reduces # of pre-allocated ANVs for posted interrupts to CONFIG_MAX_VM_NUM, and enables ACRN to avoid switching between active and wake-up vector values in the posted interrupt descriptor on vCPU scheduling state changes. A total of CONFIG_MAX_VM_NUM consecutive IRQs/vectors are reserved for posted interrupts use. The code first initializes vcpu->arch.pid.control.bits.nv dynamically (will be added in subsequent patch), the other code shall use vcpu->arch.pid.control.bits.nv instead of the hard-coded notification vectors. Rename some functions: apicv_post_intr --> apicv_trigger_pi_anv posted_intr_notification --> handle_pi_notification setup_posted_intr_notification --> setup_pi_notification Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	c2d350c5cc	hv: enable VT-d PI for ptdev if intr_src->pid_addr is non-zero Fill in posted interrupt fields (vector, pda, etc) and set mode to 1 to enable VT-d PI (posted mode) for this ptdev. If intr_src->pi_vcpu is 0, fall back to use the remapped mode. Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
dongshen	f7be985a23	hv: check if the IRQ is intended for a single destination vCPU Given the vcpumask, check if the IRQ is single destination and return the destination vCPU if so, the address of associated PI descriptor for this vCPU can then be passed to dmar_assign_irte() to set up the posted interrupt IRTE for this device. For fixed mode interrupt delivery, all vCPUs listed in vcpumask should service the interrupt requested. But VT-d PI cannot support multicast/broadcast IRQs, it only supports single CPU destination. So the number of vCPUs shall be 1 in order to handle IRQ in posted mode for this device. Add pid_paddr to struct intr_source. If platform_caps.pi is true and the IRQ is single-destination, pass the physical address of the destination vCPU's PID to ptirq_build_physical_msi and dmar_assign_irte Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
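The single-destination test described in commit f7be985a23 amounts to checking that exactly one bit is set in the vcpumask. A minimal sketch, assuming a 64-bit mask where bit N stands for vCPU N; the function name `single_dest_vcpu_id()` is hypothetical and does not come from the ACRN sources.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the vCPU id if exactly one bit is set in vcpumask,
 * or -1 if the mask is empty or names several vCPUs
 * (multicast/broadcast, which posted mode cannot serve). */
static int single_dest_vcpu_id(uint64_t vcpumask)
{
	int id = 0;

	/* A value with exactly one bit set is non-zero and loses
	 * that bit when ANDed with (value - 1). */
	if ((vcpumask == 0UL) || ((vcpumask & (vcpumask - 1UL)) != 0UL)) {
		return -1;
	}

	/* Locate the position of the only set bit. */
	while ((vcpumask & 1UL) == 0UL) {
		vcpumask >>= 1U;
		id++;
	}
	assert(id < 64);
	return id;
}
```

A caller would fall back to remapped mode whenever this returns -1, and otherwise pass the posted interrupt descriptor address of the returned vCPU on to the IRTE setup.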
dongshen	6496da7c56	hv: add function to check if using posted interrupt is possible for vm Add platform_caps.c to maintain platform related information Set platform_caps.pi to true if all iommus are posted interrupt capable, false otherwise If lapic passthru is not configured and platform_caps.pi is true, the vm may be able to use posted interrupt for a ptdev, if the ptdev's IRQ is single-destination Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-04-15 13:47:22 +08:00
Sainath Grandhi	47f883db30	hv: Hypervisor access to PCI devices with 64-bit MMIO BARs PCI devices with 64-bit MMIO BARs and requiring large MMIO space can be assigned with physical address range at the very high end of platform supported physical address space. This patch uses the board info for 64-bit MMIO window as programmed by BIOS and constructs 1G page tables for the same. As ACRN uses identity mapping from Linear to Physical address space, physical addresses up to 48 bit or 256TB can be supported. Tracked-On: #4586 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-13 16:52:18 +08:00
Sainath Grandhi	1c21f747be	hv: Add HI_MMIO_START and HI_MMIO_END macros to board files Add 64-bit MMIO window related MACROs to the supported board files in the hypervisor source code. Tracked-On: #4586 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-13 16:52:18 +08:00
Jian Jun Chen	159c9ec759	hv: add lock for ept add/modify/del EPT table can be changed concurrently by more than one vcpu. This patch adds a lock to protect the add/modify/delete operations from concurrent access by different vcpus. Tracked-On: #4253 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2020-04-13 11:38:55 +08:00
Li Fei1	74edf2e54b	hv: vmcs: remove vmcs field check for a vcpu The VMCS field is an embedded array for a vCPU. So there's no need to check for NULL before use. Tracked-On: #3813 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-09 09:40:26 +08:00
Li Fei1	366214e567	hv: virq: refine pending event inject sequence Inject pending exception prior to pending interrupt to complete the previous instruction. Tracked-On: #1842 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-09 09:40:00 +08:00
Li Fei1	2d66d39529	hv: vpci: refine comment for pci_vdev_update_vbar_base Refine why we set the base_gpa to zero for a vBAR. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-08 10:15:34 +08:00
Li Fei1	572f755037	hv: vm: refine the devices unregistration sequence of vm shutdown Conceptually, the devices unregistration sequence of the shutdown process should be opposite to create. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-04-08 10:13:37 +08:00
Victor Sun	e5777adf5e	Makefile: support make with external configurations Customers might have a specific folder that stores their own configurations for their customized scenario/board, so add TARGET_DIR parameter to support the option to build hypervisor with specified configurations. So valid usages are: (target = all | hypervisor) 1. make <target> 2. make <target> KCONFIG_FILE=xxx [TARGET_DIR=xxx] 3. make <target> BOARD=xxx SCENARIO=xxx [TARGET_DIR=xxx] 4. make <target> BOARD_FILE=xxx SCENARIO_FILE=xxx [TARGET_DIR=xxx] 5. make <target> KCONFIG_FILE=xxx BOARD_FILE=xxx SCENARIO_FILE=xxx [TARGET_DIR=xxx] If TARGET_DIR parameter is not specified in make command, hypervisor will be built with board configurations under hypervisor/arch/x86/configs/ and scenario configurations under hypervisor/scenarios/. Moreover, the configurations would be overwritten if BOARD/SCENARIO files are specified in make command. If TARGET_DIR parameter is specified in make command, hypervisor will be built with configuration under that folder if no BOARD/SCENARIO files are specified. When BOARD/SCENARIO files are available in make command, the TARGET_DIR is used to store configurations that BOARD/SCENARIO file provided, i.e. Configurations in TARGET_DIR folder will be overwritten. Tracked-On: #4517 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-03 15:15:05 +08:00
Victor Sun	471f7e5b28	Makefile: parameters check for board and scenario When users use make parameters to specify BOARD and SCENARIO, there might be some conflict because parameter of KCONFIG_FILE/BOARD_FILE/SCENARIO_FILE also includes BOARD/SCENARIO info. To simplify, we only allow below valid usages: 1. make <target> 2. make <target> KCONFIG_FILE=xxx 3. make <target> BOARD=xxx SCENARIO=xxx 4. make <target> BOARD_FILE=xxx SCENARIO_FILE=xxx 5. make <target> KCONFIG_FILE=xxx BOARD_FILE=xxx SCENARIO_FILE=xxx Especially for case 1, where no parameters are specified: a. If hypervisor/build/.config file which is generated by "make menuconfig" exists, the .config file will be loaded as KCONFIG_FILE: i.e. equal: make <target> KCONFIG_FILE=hypervisor/build/.config b. If hypervisor/build/.config file does not exist, the default BOARD/SCENARIO will be loaded: i.e. equal: make <target> BOARD=$(BOARD) SCENARIO=$(SCENARIO) Tracked-On: #4517 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-03 15:15:05 +08:00
Sainath Grandhi	5958d6f65f	hv: Fix issues with the patch to reserve EPT 4K pages after boot This patch fixes a couple of minor issues with patch `8ffe6fc6` Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-04-03 11:06:14 +08:00
Victor Sun	3888c444cc	HV: misra fix for multiboot2.c The patch fixed a few misra violations for multiboot2.c; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-04-03 09:01:24 +08:00
Yan, Like	70fa6dce53	hv: config: enable RDT for apl-up2 by default Tracked-On: #4566 Signed-off-by: Yan, Like <like.yan@intel.com>	2020-04-02 13:55:35 +08:00
Yan, Like	2997c4b570	HV: CAT: support cache allocation for each vcpu This commit allows hypervisor to allocate cache to vcpu by assigning different clos to vcpus of a same VM. For example, we could allocate different cache to housekeeping core and real-time core of an RTVM in order to isolate the interference of housekeeping core via cache hierarchy. Tracked-On: #4566 Signed-off-by: Yan, Like <like.yan@intel.com> Reviewed-by: Chen, Zide <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-02 13:55:35 +08:00
Binbin Wu	fcd9a1ca73	hv: vtd: use local var instead of global var In dmar_issue_qi_request, currently use a global var qi_status, which could cause potential issue when concurrent call to dmar_issue_qi_request for different DMAR units. Use local var instead. Tracked-On: #4535 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-02 11:31:40 +08:00
Sainath Grandhi	8ffe6fc67a	hv: Reserve space for VMs' EPT 4k pages after boot As ACRN prepares to support servers with large amounts of memory, current logic to allocate space for 4K pages of EPT at compile time will increase the size of .bss section of ACRN binary. Bootloaders could run into a situation where they cannot find enough contiguous space to load ACRN binary under 4GB, which is typically heavily fragmented with E820 types Reserved, ACPI data, 32-bit PCI hole etc. This patch does the following: 1) Works only for "direct" mode of vboot 2) reserves space for 4K pages of EPT, after boot by parsing platform E820 table, for all types of VMs. Size of .bss by size of DRAM, w/o patch: 48 GB -> 0xe1bbc98 (~226 MB), 128 GB -> 0x222abc98 (~548 MB); w/ patch: 48 GB -> 0x1991c98 (~26 MB), 128 GB -> 0x1a81c98 (~28 MB) Tracked-On: #4563 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 21:13:37 +08:00
Qian Wang	da027ff622	HV: Changed enum to macro to pass MISRA-C check hv: acpi: changed the enum "acpi_dmar_type" to macros to pass the MISRA-C check Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	c20228d36f	HV: simplified the logic of dmar_wait_completion hv: vtd: simplified the logic of dmar_wait_completion Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	698ad6bd4d	HV: renamed some structs more understandably hv: pci: renamed some internal data structs to make them more understandable Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	60704a5d9c	HV: renamed some static functions related to dmar hv: vtd: renamed some static functions from dmar_verb to verb_dmar Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	c2bcf9fade	HV: simplified the logic of iommu_read/write64 hv: vtd: simplified the logic of iommu_read/write64 Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	a1e081073f	HV: Corrected return type of two static functions hv: vtd: corrected the return type of get_qi_queue and get_ir_table to void * Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	fc9f089902	HV: remove multi-return in drhd_find_iter hv: dmar_parse: remove multi-return in drhd_find_iter Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Qian Wang	b55f414a9d	HV: Removed unused member variable of iommu_domain and related code hv: vtd: removed is_host (always false) and is_tt_ept (always true) member variables of struct iommu_domain and related codes since the values are always determined. Tracked-On: #4535 Signed-off-by: Qian Wang <qian1.wang@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-04-01 10:43:54 +08:00
Li Fei1	ea2616fbbf	hv: vlapic: minor fix about dereference vcpu from vlapic Since vcpu is removed from vlapic, we can no longer dereference it directly. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 15:59:52 +08:00
Li Fei1	2b7168da9e	hv: vmtrr: remove vcpu structure pointer from vmtrr We could use container_of to get vcpu structure pointer from vmtrr. So vcpu structure pointer is no longer needed in vmtrr structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	1946661c51	hv: vpic: remove vm structure pointer from vpic We could use container_of to get vm structure pointer from vpic. So vm structure pointer is no longer needed in vpic structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	0a7770cbb7	hv: vpci: remove vm structure pointer from vpci We could use container_of to get vm structure pointer from vpci. So vm structure pointer is no longer needed in vpci structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	a7768fdb6a	hv: vlapic: remove vcpu/vm structure pointer from vlapic We could use container_of to get vcpu/vm structure pointer from vlapic. So vcpu/vm structure pointer is no longer needed in vlapic structure. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
Li Fei1	7f342bf62f	hv: list: rename list_entry to container_of This function casts a member of a structure out to the containing structure. So rename to container_of is more readable. Tracked-On: #4550 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-31 10:57:47 +08:00
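The container_of idiom that commit 7f342bf62f names, and that the vmtrr/vpic/vpci/vlapic commits rely on, recovers a pointer to the enclosing structure from a pointer to one of its embedded members. A minimal sketch, assuming simplified `demo_*` structures that are not ACRN's real definitions; the macro shown is the common `offsetof`-based form and may differ from the hypervisor's own.

```c
#include <assert.h>
#include <stddef.h>

/* Common offsetof-based form of the idiom: step back from the
 * member's address by the member's offset within the type. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical structures: the vlapic is embedded in the vcpu,
 * so it needs no back-pointer to its owner. */
struct demo_vlapic {
	int apic_id;
};

struct demo_vcpu {
	int vcpu_id;
	struct demo_vlapic vlapic;
};

/* Only valid for a vlapic that really is embedded in a demo_vcpu. */
static struct demo_vcpu *vlapic_to_vcpu(struct demo_vlapic *vlapic)
{
	assert(vlapic != NULL);
	return container_of(vlapic, struct demo_vcpu, vlapic);
}
```

The trade-off is that the member must always be embedded by value; a standalone `struct demo_vlapic` passed to `vlapic_to_vcpu()` would yield an invalid pointer.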
dongshen	1328dcb205	hv: extend union dmar_ir_entry to support VT-d posted interrupts Extend union dmar_ir_entry to support VT-d posted interrupts. Rename some fields of union dmar_ir_entry: entry --> value sw_bits --> avail Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	016c1a5073	hv: pass pointer to functions Pass intr_src and dmar_ir_entry irte as pointers to dmar_assign_irte(), which fixes the "Attempt to change parameter passed by value" MISRA C violation. A few coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	0f3c876a91	hv: extend struct pi_desc to support VT-d posted interrupts For CPU side posted interrupts, it only uses bit 0 (ON) of the PI's 64-bit control , other bits are don't care. This is not the case for VT-d posted interrupts, define more bit fields for the PI's 64-bit control. Use bitmap functions to manipulate the bit fields atomically. Some MISRA-C violation and coding style fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	8f732f2809	hv: move pi_desc related code from vlapic.h/vlapic.c to vmx.h/vmx.c/vcpu.h The posted interrupt descriptor is more of a vmx/vmcs concept than a vlapic concept. struct acrn_vcpu_arch stores the vmx/vmcs info, so put struct pi_desc in struct acrn_vcpu_arch. Remove the function apicv_get_pir_desc_paddr() A few coding style/typo fixes Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
dongshen	b384d04ad1	hv: rename vlapic_pir_desc to pi_desc Rename struct vlapic_pir_desc to pi_desc Rename struct member and local variable pir_desc to pid pir=posted interrupt request, pi=posted interrupt pid=posted interrupt descriptor pir is part of pi descriptor, so it is better to use pi instead of pir struct pi_desc will be moved to vmx.h in subsequent commit. Tracked-On: #4506 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-31 10:30:30 +08:00
Zide Chen	1d2aea1ebd	hv: some coding refinement in hypercall.c - since now we don't need to print error messages if copy_to/from_gpa() fails, then in many cases we can simplify the function return handling. In the following example, my fix could change the 'ret' value from the original '-1' to the actual errno returned from copy_to_gpa(). But this is valid. Ideally we may replace all '-1' with the actual errno. - if (copy_to_gpa() < 0) { - pr_err("error messages"); - ret = -1; - } else { - ret = 0; - } + ret = copy_to_gpa(); - in most cases, 'ret' is declared with a default value 0 or -1, then the redundant assignment statements can be removed. - replace white spaces with tabs. Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-30 13:19:01 +08:00
Zide Chen	eef3b51eda	hv: move error message logging into gpa copy APIs In this way, the code looks simpler and line of code is reduced. Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-30 13:19:01 +08:00
Li Fei1	4512ef7ec9	hv: cpuid: remove cpuid() The cpuid() can be replaced with cpuid_subleaf, which is clearer. Having both APIs makes reading difficult. Tracked-On: #4526 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-25 13:26:58 +08:00
Sainath Grandhi	6b517c58f1	hv: Server platforms can have more than 8 IO-APICs To support server platforms with more than 8 IO-APICs Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	fe5a108c7b	hv: vioapic init for SOS VM on platforms with multiple IO-APICs For SOS VM, when the target platform has multiple IO-APICs, there should be equal number of virtual IO-APICs. This patch adds support for emulating multiple vIOAPICs per VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	f67ac09141	hv: Handle holes in GSI i.e. Global System Interrupt for multiple IO-APICs MADT is used to specify the GSI base for each IO-APIC and the number of interrupt pins per IO-APIC is programmed into Max. Redir. Entry register of that IO-APIC. On platforms with multiple IO-APICs, there can be holes in the GSI space. For example, on a platform with 2 IO-APICs, the following configuration has a hole (from 24 to 31) in the GSI space. IO-APIC 1: GSI base - 0, number of pins - 24 IO-APIC 2: GSI base - 32, number of pins - 8 This patch also adjusts the size for variables used to represent the total number of IO-APICs on the system from uint16_t to uint8_t as the ACPI MADT uses only 8-bits to indicate the unique IO-APIC IDs. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
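The GSI hole in the example from commit f67ac09141 (IO-APIC 1 covering GSIs 0 to 23, IO-APIC 2 covering GSIs 32 to 39) can be illustrated with a lookup that rejects GSIs falling between the two ranges. This is a sketch only: `struct ioapic_desc`, the static table and `gsi_to_ioapic_pin()` are hypothetical names, and a real implementation would fill the table from the MADT and each IO-APIC's Max. Redir. Entry register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-IO-APIC descriptor. */
struct ioapic_desc {
	uint32_t gsi_base;	/* from the MADT */
	uint32_t nr_pins;	/* from the Max. Redir. Entry register */
};

/* The two-IO-APIC configuration from the commit message. */
static const struct ioapic_desc ioapics[] = {
	{ .gsi_base = 0U,  .nr_pins = 24U },
	{ .gsi_base = 32U, .nr_pins = 8U },
};

#define NR_IOAPICS	(sizeof(ioapics) / sizeof(ioapics[0]))

/* Map a GSI to (IO-APIC index, pin). Returns false when the GSI
 * lies in a hole or beyond the last IO-APIC. */
static bool gsi_to_ioapic_pin(uint32_t gsi, uint8_t *index, uint32_t *pin)
{
	assert((index != NULL) && (pin != NULL));

	for (uint8_t i = 0U; i < NR_IOAPICS; i++) {
		if ((gsi >= ioapics[i].gsi_base) &&
		    ((gsi - ioapics[i].gsi_base) < ioapics[i].nr_pins)) {
			*index = i;
			*pin = gsi - ioapics[i].gsi_base;
			return true;
		}
	}
	return false;
}
```

With this table, GSIs 24 to 31 match neither range, which is why code that assumes a contiguous GSI space breaks on such platforms.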
Sainath Grandhi	85217e362f	hv: Introduce Global System Interrupt (GSI) into INTx Remapping As ACRN prepares to support platforms with multiple IO-APICs, GSI is a better way to represent physical and virtual INTx interrupt source. 1) This patch replaces usage of "pin" with "gsi" wherever applicable across the modules. 2) PIC pin to gsi is trickier and needs to consider the usage of "Interrupt Source Override" structure in ACPI for the corresponding VM. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	2fe3004202	hv: Pass address of vioapic struct to register_mmio_emulation_handler Changes the mmio handler data from that of the acrn_vm struct to the acrn_vioapic. Add nr_pins and base_addr to the acrn_vioapic data structure. Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	dd6c80c305	hv: Move error checking for hypercall parameters out of assign module Moving checks on validity of IOAPIC interrupt remapping hypercall parameters to hypercall module Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Sainath Grandhi	06b59e0bc1	hv: Use ptirq_lookup_entry_by_sid to lookup virtual source id in IOAPIC irq entries Reverts `538ba08c`: hv:Add vpin to ptdev entry mapping for vpic/vioapic ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin and an array of size per VM to store ptirq entries against the vPIC pin. This is done to speed up "ptirq entry" lookup at runtime for Level triggered interrupts in API ptirq_intx_ack used on EOI. This patch switches the lookup API for INTx interrupts to the API, ptirq_lookup_entry_by_sid This could add delay to processing EOI for Level triggered interrupts. Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around 10 such arrays per VM. Removes the need of "pic_pin" except for the APIs facing the hypercalls hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info Tracked-On: #4151 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-25 09:36:18 +08:00
Victor Sun	52f26cba8a	hv: a few fixes for multiboot2 boot - need to specify the load_addr in the multiboot2 address tag. GRUB needs it to correctly calculate the ACRN binary's load size if load_end_addr is a non-zero value. - multiboot2 can be enabled if hypervisor relocation is disabled. - print the name of the boot loader. This might be helpful if the boot loader, e.g. GRUB, includes its version in the name string. Tracked-On: #4441 Signed-off-by: Victor Sun <victor.sun@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2020-03-24 08:44:20 +08:00
Li Fei1	e99ddf28c3	hv: vpci: handle the quirk part for pass through pci device cfg access in dm Some PCI devices need a special handler for vendor-specific feature or capability CFG access. The Intel GPU is one of them. In order to keep the ACRN-HV clean, we want to hand the quirk part of PCI CFG access to DM to handle. To achieve this, we implement a per-device policy based on whether it needs a quirk handler for a VM: each device could be configured as "quirk pass through device" or not. For a "quirk pass through device", we will handle the general part in HV and the quirk part in DM. For a non "quirk pass through device", we will handle all the parts in HV. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-20 10:08:43 +08:00
Li Fei1	e5c7a96513	hv: vpci: sos could access low severity guest pci cfg space There're some cases the SOS (higher severity guest) needs to access the post-launched VM (lower severity guest) PCI CFG space: 1. The SR-IOV PF needs to reset the VF 2. Some pass through device still need DM to handle some quirk. In the case a device is assigned to a UOS and is not in a zombie state, the SOS is able to access, if and only if the SOS has higher severity than the UOS. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-20 10:08:43 +08:00
Yuan Liu	9375c634dc	hv: unmap SR-IOV VF MMIO when the VF physical device is disabled To avoid information leakage, we need to ensure that the device is inaccessible when it does not exist. For SR-IOV disabled VF device, we have the following operations. 1. The configuration space access will get 0xFFFFFFFF as a return value after setting the device state to zombie. 2. The BAR MMIO EPT mappings are removed, so accessing them causes EPT violation. 3. The device will be detached from IOMMU. 4. The IRQ pin and vector are released. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-18 21:46:54 +08:00
Mingqiang Chi	14692ef60c	hv:Rename two VM states Rename: VM_STARTED --> VM_RUNNING VM_POWERING_OFF --> VM_READY_TO_POWEROFF Tracked-On: #4320 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-13 10:34:29 +08:00
Victor Sun	a8c2ba03fc	HV: add pci_devices.h for nuc6cayh and apl-up2 As pci_devices.h is included by <page.h>, need to prepare pci_devices.h for nuc6cayh and apl-up2 board. Also the #error info in generic/pci_devices.h should be removed, otherwise the build will fail in sdc/sdc2/industry scenarios. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	a68f655a11	HV: update ept address range for pre-launched VM For a pre-launched VM, a region from PTDEV_HI_MMIO_START is used to store 64bit vBARs of PT devices whose address is higher than 4G. The region should be located after all user memory space and be covered by guest EPT address. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	e74553492a	HV: move create_sos_vm_e820 to ve820.c ve820.c is a common file in arch/x86/guest/ now, so move function of create_sos_vm_e820() to this file to make code structure clear; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	a7b61d2511	HV: remove board specific ve820 Remove useless per board ve820.c as arch/x86/guest/ve820.c is common for all boards now; Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	d7eac3fe6a	HV: decouple prelaunch VM ve820 from board configs hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched VM specific e820 entries according to memory configuration of customer. It should be a scenario based configuration but we had to put it in per board folder because of different board memory settings. This brings concerns to customers on configuration organization. Currently the file provides the same e820 layout for all pre-launched VMs, but they should have different e820 when their memory is configured differently. Although we have acrn-config tool to generate ve820.c automatically, it is not friendly to modify hardcoded ve820 layout manually, so the patch changes the entries initialization method by calculating each entry item in C code. Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Victor Sun	4c0965d89e	HV: correct ept page array usage Currently ept_pages_info[] is initialized with first element only, which forces the VM of id 0 to use SOS EPT pages. This is incorrect for logical partition and hybrid scenario. Considering SOS_RAM_SIZE and UOS_RAM_SIZE are configured separately, we should use different ept pages accordingly. So, the PRE_VM_NUM/SOS_VM_NUM and MAX_POST_VM_NUM macros are introduced to resolve this issue. The macros would be generated by acrn-config tool when users configure ACRN for their specific scenario. One more thing: when UOS_RAM_SIZE is less than 2GB, the EPT address range should be (4G + PLATFORM_HI_MMIO_SIZE). Tracked-On: #4458 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 14:56:34 +08:00
Yuan Liu	e9a99845f6	hv: refine read/write configuration APIs for vmsi/vmsix change vmsi_read_cfg to read_vmsi_cfg, same applies to writing change vmsix_read_cfg to read_vmsix_cfg, same applies to writing Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-12 10:40:02 +08:00
Li Fei1	4b6dd19ad1	hv: pci: rename CFG read/write function for PCI-compatible Configuration Mechanism Move CFG read/write function for PCI-compatible Configuration Mechanism from debug/uart16550.c to hw/pci.c and rename CFG read/write function for PCI-compatible Configuration Mechanism to pci_pio_read/write_cfg to align with CFG read/write function pci_mmcfg_read/write_cfg for PCI Express Enhanced Configuration Access Mechanism. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-12 09:17:02 +08:00
Mingqiang Chi	790614e952	hv:rename several variables and api for ioapic rename: ioapic_get_gsi_irq_addr --> gsi_to_ioapic_base ioapic_addr -->ioapic_base Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2020-03-11 13:26:15 +08:00
Li Fei1	fa74bf401d	hv: vpci: pass through stolen memory and opregion memory for GVT-D In order to add GVT-D support, we need to pass through stolen memory and opregion memory to the post-launched VM. To implement this, we first reserve the GPA for stolen memory and opregion memory through post-launched VM e820 table. Then we would build EPT mapping between the GPA and the stolen memory and opregion memory real HPA. The last, we need to return the GPA to post-launched VM if it wants to read the stolen memory and opregion memory address and prevent post-launched VM to write the stolen memory and opregion memory address register for now. We do the GPA reserve and GPA to HPA EPT mapping in ACRN-DM and the stolen memory and opregion memory CFG space register access emulation in ACRN-HV. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-11 10:59:23 +08:00
Zide Chen	659e5420df	hv: add static check for CONFIG_HV_RAM_START and CONFIG_HV_RAM_SIZE Hypervisor uses 2MB large page, and if either CONFIG_HV_RAM_START or CONFIG_HV_RAM_SIZE is not aligned to 2MB, ACRN won't boot. Add static check to avoid unexpected boot failures. If CONFIG_RELOC is enabled, CONFIG_HV_RAM_START is not directly referred by the code, but it causes problems because ld_text_end could be relocated to an address that is not 2MB aligned which fails mmu_modify_or_del(). Tracked-On: #4441 Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-11 10:37:50 +08:00
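A compile-time check of the kind described in commit 659e5420df could look roughly like the sketch below. The macro names `EXAMPLE_HV_RAM_START` and `EXAMPLE_HV_RAM_SIZE`, their values, and the helper `is_2mb_aligned` are hypothetical stand-ins for illustration, not ACRN's actual identifiers or defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for CONFIG_HV_RAM_START and CONFIG_HV_RAM_SIZE. */
#define EXAMPLE_HV_RAM_START 0x10000000UL
#define EXAMPLE_HV_RAM_SIZE  0x02000000UL

#define SIZE_2MB 0x200000UL

/* Fail the build, rather than the boot, if either value is not a multiple of 2MB. */
static_assert((EXAMPLE_HV_RAM_START % SIZE_2MB) == 0UL, "HV RAM start must be 2MB aligned");
static_assert((EXAMPLE_HV_RAM_SIZE % SIZE_2MB) == 0UL, "HV RAM size must be 2MB aligned");

/* Runtime equivalent, e.g. for an address that is only known after relocation. */
static bool is_2mb_aligned(uint64_t addr)
{
	return (addr & (SIZE_2MB - 1UL)) == 0UL;
}
```

The static form catches a misconfigured build early; the runtime helper is only an illustration of the same test applied to a relocated address.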
Yuan Liu	696f6c7ba4	hv: the VM can only deinit its own devices VM needs to check if it owns this device before deiniting it. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	d8a19f9978	hv: refine naming Change enable_vf/disable_vf to create_vfs/disable_vfs Change base member of pci_vbar to base_gpa Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	08ed45f4b4	hv: fix wrong VF BDF The vf_bdf is not initialized when invoking pci_pdev_read_cfg function. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	7b429fe483	hv: prohibit PF from being assigned We didn't support SR-IOV capability of PF in UOS for now, we should hide the SR-IOV capability if we pass through the PF to a UOS. For now, we don't support assignment of PF to a UOS. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	657af925c1	hv: passthrough a VF device Emulate Device ID, Vendor ID and MSE(Memory Space Enable) bit in configuration space for an assigned VF, initialize assigned VF Bars. The Device ID comes from PF's SRIOV capability The Vendor ID comes from PF's Vendor ID The PCI MSE bit is always set when VM reads from an assigned VF. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	640cf57c14	hv: disable VF device If a VF instance is disabled, we didn’t remove the vdev instance, only set the vdev as a zombie vdev instance, indicating that it cannot be accessed anymore. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	2a4235f200	hv: refine function find_vdev Change name find_vdev to find_available_vdev and add comments Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	d67d0538e6	hv: initialize VF BARs The VF BARs are initialized by its PF SRIOV capability Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Yuan Liu	ddd6253a4c	hv: wrap msix map/unmap operations Refine coding style to wrap msix map/unmap operations, clean up repeated assignments for msix mmio_hpa and mmio_size. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-11 08:35:30 +08:00
Li Fei1	41350c533c	hv: vpci: add _v prefix for some function name Add _v prefix for some function name to indicate this function wants to operate on virtual CFG space or virtual BAR register. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-09 17:09:55 +08:00
fuzhongl	f727d1e741	HV: sdc2 UUID update Since there is no RTVM requirement for sdc2 scenario, replace uuid 495ae2e5-2603-4d64-af76-d4bc5a8ec0e5 which is dedicated to RTVM with 615db82a-e189-4b4f-8dbb-d321343e4ab3 Tracked-On: #4472 Signed-off-by: fuzhongl <fuzhong.liu@intel.com> Reviewed-by: Sun Victor <victor.sun@intel.com>	2020-03-09 13:28:31 +08:00
Yuan Liu	60a7c49bb0	hv: Refine code for API reduction Removed the pci_vdev_write_cfg_u8/u16/u32 APIs and only used pci_vdev_write_cfg as the API for writing vdev's cfgdata Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-09 12:56:00 +08:00
Li Fei1	e5ae37eb69	hv: mmu: minor fix about add_pte In Commit `127c73c3`, we remove the strict check for adding page table mapping. However, we just replace the ASSERT of pr_fatal in add_pte. This is not enough. We still add the virtual address by 4K if the page table mapping exists and check the virtual address is over the virtual address region for this mapping. Otherwise, the complaint will continue for 512 times at most. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-09 10:03:01 +08:00
Li Fei1	4367657771	hv: vpci: add a global CFG header configuration access handler Add cfg_header_read_cfg and cfg_header_write_cfg to handle the 1st 64B CFG Space header PCI configuration space. Only Command and Status Registers are pass through; Only Command and Status Registers and Base Address Registers are writable. In order to implement this, we add two type bit mask for per 4B register: pass through mask and read-only mask. When pass through bit mask is set, this means this bit of this 4B register is pass through, otherwise, it is virtualized; When read-only mask is set, this means this bit of this 4B register is read-only, otherwise, it's writable. We should write it to physical CFG space or virtual CFG space base on whether the pass through bit mask is set or not. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-06 14:08:04 +08:00
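The two per-register bit masks described in commit 4367657771 can be illustrated with the sketch below. It is an assumption-laden illustration, not the hypervisor's code: the struct `cfg_write_split` and the function `split_cfg_write` are hypothetical names, and the sketch only shows how a 4-byte write might be divided between the physical and the virtual CFG space.

```c
#include <stdint.h>

/* Hypothetical result of splitting one 4-byte configuration register write. */
struct cfg_write_split {
	uint32_t phys_mask; /* bits that are passed through and writable */
	uint32_t phys_val;  /* value to write to those physical bits */
	uint32_t virt_val;  /* resulting value of the virtual register */
};

/*
 * pt_mask: a set bit means the bit is passed through, otherwise virtualized.
 * ro_mask: a set bit means the bit is read-only, otherwise writable.
 */
static struct cfg_write_split split_cfg_write(uint32_t old_virt, uint32_t new_val,
		uint32_t pt_mask, uint32_t ro_mask)
{
	struct cfg_write_split out;
	uint32_t writable = ~ro_mask;
	uint32_t virt_writable = ~pt_mask & writable;

	/* Passed-through writable bits go to the physical CFG space. */
	out.phys_mask = pt_mask & writable;
	out.phys_val = new_val & out.phys_mask;

	/* Virtualized writable bits take the new value; all other bits are kept. */
	out.virt_val = (old_virt & ~virt_writable) | (new_val & virt_writable);
	return out;
}
```

Read-only bits are dropped in both destinations, so a guest write can never alter them regardless of whether the bit is passed through.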
Sainath Grandhi	460e7ee5b1	hv: Variable/macro renaming for intr handling of PT devices using IO-APIC/PIC 1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can be IOAPIC or PIC 2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt controller 3. Changes the type of src member of source_id.intx_id from uint32_t to enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC Tracked-On: #4447 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>	2020-03-06 11:29:02 +08:00
Minggui Cao	2aaa050cab	HV: move out physical cfg write from vpci-bridge for vpci_bridge it is better just write the virtual configure space, so move out the PCI bridge physical cfg write to pci.c also add some rules in config pci bridge. Tracked-On: #3381 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-06 08:47:46 +08:00
Minggui Cao	ad4d14e37f	HV: enable ARI if PCI bridge support it For SRIOV needs ARI support, so enable it in HV if the PCI bridge support it. TODO: need check all the PCI devices under this bridge can support ARI, if not, it is better not enable it as PCIe spec. That check will be done when scanning PCI devices. Tracked-On: #3381 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-06 08:47:46 +08:00
Victor Sun	b6684f5b61	HV: sanitize config file for whl-ipc-i5 - remove limit of CONFIG_HV_RAM_SIZE which is for scenario of 2 VMs only, the default size from Kconfig could build scenario which up to 5 VMs; - rename whl-ipc-i5_acpi_info.h to platform_acpi_info.h, since the former one should be generated by acrn-config tool; - add SOS related macros in misc.h, otherwise build scenarios which has SOS VM would be failed; Tracked-On: #4463 Signed-off-by: Victor Sun <victor.sun@intel.com>	2020-03-06 08:34:12 +08:00
Zide Chen	67cb1029d9	hv: update the hypervisor 64-bit entry address for efi-stub - remove .data and .text directives. We want to place all the boot data and text in the .entry section since the boot code is different from others in terms of relocation fixup. With this change, the page tables are in entry section now and it's aligned at 4KB. - regardless CONFIG_MULTIBOOT2 is set or not, the 64-bit entry offset is fixed at 0x1200: 0x00 -- 0x10: Multiboot1 header 0x10 -- 0x88: Multiboot2 header if CONFIG_MULTIBOOT2 is set 0x1000: start of entry section: cpu_primary_start_32 0x1200: cpu_primary_start_64 (thanks to the '.org 0x200' directive) GDT tables initial page tables etc. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Zide Chen	49ffe168af	hv: fixup relocation delta for symbols belong to entry section This is to enable relocation for code32. - RIP relative addressing is available in x86-64 only so we manually add relocation delta to the target symbols to fixup code32. - both code32 and code64 need to load GDT hence both need to fixup GDT pointer. This patch declares separate GDT pointer cpu_primary64_gdt_ptr for code64 to avoid double fixup. - manually fixup cpu_primary64_gdt_ptr in code64, but not rely on relocate() to do that. Otherwise it's very confusing that symbols from same file could be fixed up externally by relocate() or self-relocated. - to make it clear, define a new symbol ld_entry_end representing the end of the boot code that needs manually fixup, and use this symbol in relocate() to filter out all symbols belong to the entry sections. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Chen, Zide	2aa8c9e5d4	hv: add multiboot2 tags to load relocatable raw binary GRUB multiboot2 doesn't support relocation for ELF, which means it can't load acrn.32.out to other address other than the one specified in ELF header. Thus we need to use the raw binary file acrn.bin, and add address/entry address/relocatable tags to instruct multiboot2 loader how to load the raw binary. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Chen, Zide	97fc0efe20	hv: remove unused cpu_primary_save_32() In direct boot mode, boot_context[] which is saved from cpu_primary_save_32() is no longer used since commit `6beb34c3cb` ("vm_load: update init gdt preparation"). Thus, the call to it and the function itself can be removed. Tracked-On: #4441 Reviewed-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-06 08:27:46 +08:00
Yuan Liu	f0e5387e1c	hv: remove pci_vdev_read_cfg_u8/16/32 reduce the use of similar APIs (particularly the name confusion) for CFG space read/write. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-05 22:21:21 +08:00
Yuan Liu	e1ca1ae2e9	hv: refine functions name Make the name of the functions more accurate Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-05 22:21:21 +08:00
Li Fei1	7c82efb938	hv: pci: add some pre-assumption and safety check for PCIe ECAM Add some pre-assumption and safety check for PCIe ECAM: 1) ACRN only support platforms with PCIe ECAM to access PCIe device CFG space; 2) Must not use ECAM to access PCIe device CFG space before pci_switch_to_mmio_cfg_ops was called. (In release version, ACRN didn't support IO port Mechanism. ECAM is the only way to access the PCIe device CFG space). Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-03-05 15:42:53 +08:00
Binbin Wu	667639b591	doc: fix a missing argument in the function description One argument is missing for the function ptirq_alloc_entry. This patch fixes the doc generation error. Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-03-05 13:08:57 +08:00
Zide Chen	93fa2bc0fc	hv: minor fixes in init_paging() - change variable name from hpa to hva because in this function we are dealing with hva, not hpa. - can get the address of ld_text_end by directly referring to this symbol, because relative addressing yields the correct hva, not the hva before relocation. Tracked-On: #4441 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-03-05 10:18:56 +08:00
Yuan Liu	734ad6ce30	hv: refine pci_read_cap and pci_read_ext_cap The pci_read_cap and pci_read_ext_cap are used to enumerate PCI legacy capability and extended capability. Change the name pci_read_cap to pci_enumerate_cap Change the name pci_read_ext_cap to pci_enumerate_ext_cap Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-05 10:15:15 +08:00
Binbin Wu	76f2e28e13	doc: update hv device passthrough document Fixed misspellings and rst formatting issues. Added ptdev.h to the list of include file for doxygen Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Signed-off-by: David B. Kinder <david.b.kinder@intel.com>	2020-03-04 18:05:15 -05:00
Binbin Wu	b05c1afa0b	doc: add doxygen style comments to ptdev Add doxygen style comments to ptdev public APIs. Add these API descriptions to group acrn_passthrough. Tracked-On: #3882 Signed-off-by: Binbin Wu <binbin.wu@intel.com>	2020-03-04 18:05:15 -05:00
Vijay Dhanraj	b6c0558b60	HV: Update existing board.c files for RDT MBA This patch updates board.c files for RDT MBA on existing platforms. Also, fixes setting RDT flag in WHL config file. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Vijay Dhanraj	92ee33b035	HV: Add MBA support in ACRN This patch adds RDT MBA support to detect, configure and set up MBA throttle registers based on VM configuration. Tracked-On: #3725 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-04 17:33:50 +08:00
Yuan Liu	d54deca87a	hv: initialize SRIOV VF device create new pdev and vdev structures for a SRIOV VF device initialization Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	176cb31c31	hv: refine vpci_init_vdev function Add a new parameter pf_vdev for function vpci_init_vdev to support SRIOV VF vdev initialization. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	320ed6c238	hv: refine init_one_dev_config The init_one_dev_config is used to initialize an acrn_vm_pci_dev_config. SRIOV needs an explicit acrn_vm_pci_dev_config to create a VF vdev, so refine it to return acrn_vm_pci_dev_config. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	87e7d79112	hv: refine init_pdev function Due to SRIOV VF physical device needs to be initialized when VF_ENABLE is set and a SRIOV VF physical device initialization is same with standard PCIe physical device, so expose the init_pdev for SRIOV VF physical device initialization. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	abbdef4f5d	hv: implement SRIOV VF_BAR initialization All SRIOV VF physical devices don't have bars in configuration space, they are from the VF associated PF's VF_BAR registers of SRIOV capability. Adding a vbars data structure in pci_cap_sriov data structure to store SRIOV VF_BAR information, so that each VF bars can be initialized directly through the vbars instead multiple accessing of the PF VF_BAR registers. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Yuan Liu	298ef2f5c4	hv: refine init_vdev_pt function To support SRIOV capability initialization, add a new parameter is_sriov_pf_vdev for init_vdev_pt function. If parameter is_sriov_pf_vdev of function init_vdev_pt is true, then function init_vdev_pt initializes the vdev's SRIOV capability. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-03-03 09:32:11 +08:00
Conghui Chen	595cefe3f2	hv: xsave: move assembler to individual function Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors assembler to individual functions. Tracked-On: #4436 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 17:55:06 +08:00
Yuan Liu	2f7483065b	hv: introduce SRIOV interception VF_ENABLE is one field of SRIOV capability that is used to create or remove VF physical devices. If VF_ENABLE is set, hv can detect if the VF physical devices are ready after waiting 100 ms. v2: Add sanity check for writing NumVFs register, add precondition and application constraints when VF_ENABLE is set and refine code style. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	14931d11e0	hv: add SRIOV capability read/write entries Introduce SRIOV capability field for pci_vdev and add SRIOV capability interception entries. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	5e989f13c6	hv: check if there is enough room for all SRIOV VFs. Make the SRIOV-Capable device invisible from SOS if there is no room for all its virtual functions. v2: fix an issue that if a PF has been dropped, the subsequent PF will be dropped too even if there is room for its VFs. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Yuan Liu	ac1477956c	hv: implement SRIOV-Capable device detection. if the device has PCIe capability, walks all PCIe extended capabilities for SRIOV discovery. v2: avoid type casting and refine naming. Tracked-On: #4433 Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 14:04:01 +08:00
Zide Chen	c751a8e88b	hv: refine confusing e820 table logging layout It puts the new line in the wrong place, and the logs are confusing. For example, for these entries: mmap[0] - type: 1, base: 0x00000, length: 0x9800 mmap[1] - type: 2, base: 0x98000, length: 0x8000 mmap[2] - type: 3, base: 0xc0000, length: 0x4000 Currently it prints them in this way: mmap table: 0 type: 0x1 Base: 0x0000000000000000 length: 0x0000000000098000 mmap table: 1 type: 0x2 Base: 0x0000000000098000 length: 0x0000000000008000 mmap table: 2 type: 0x3 Base: 0x00000000000c0000 length: 0x0000000000040000 With this fix, it looks like the following, and now it's of same style with how prepare_sos_vm_memmap() logs ve820 tables. mmap table: 0 type: 0x1 Base: 0x0000000000000000 length: 0x0000000000098000 mmap table: 1 type: 0x2 Base: 0x0000000000098000 length: 0x0000000000008000 mmap table: 2 type: 0x3 Base: 0x00000000000c0000 length: 0x0000000000040000 Tracked-On: #1842 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-28 09:34:17 +08:00
Minggui Cao	bd92304dcf	HV: add vpci bridge operations support add vpci bridge operations in hypervisor, to avoid SOS mis-operations to affect other VM's PCI devices. assumption: before hypervisor bootup, the physical pci-bridge shall be configured correctly by BIOS or other bootloader; for ACS (Access Control Service) capability, it is configured by BIOS to support the devices under it to be isolated and allocated to different VMs. to simplify the emulations of vpci bridge, set limitations as following: 1. expose all configure space registers, but readonly 2. BIST not support; by default is 0 3. not support interrupt, including INTx and MSI. TODO: 1. configure tool can select whether a PCI bridge is emulated or pass through. Open: 1. SOS how to reset PCI device under the PCI bridge? Tracked-On: #3381 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Minggui Cao <minggui.cao@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2020-02-28 09:24:51 +08:00
Conghui Chen	c246d1c9b8	hv: xsave: bugfix for init value The init value for XCR0 and XSS should be the same with spec: In SDM Vol1 13.3: XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is always 1. The other bits in XCR0 are all 0 coming out of RESET. The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET. The previous code try to fix the xsave area leak to other VMs during init phase, but bring the error to linux. Besides, it cannot avoid the possible leak in running phase. Need find a better solution. Tracked-On: #4430 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-28 09:19:29 +08:00
Junming Liu	96f92373cd	hv:refine comment about intel integrated gpu dmar The dedicated DMAR unit for Intel integrated GPU shall be available on the physical platform. So remove the assert and add application constraint in handle_one_drhd func. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com> Reviewed-by: Wu Xiangyang <xiangyang.wu@linux.intel.com>	2020-02-28 09:14:27 +08:00
Vijay Dhanraj	cef3322da8	HV: Add WhiskeyLake board configuration files This patch adds offline tool generated WhiskeyLake board configurations files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	eaad91fd71	HV: Remove RDT code if CONFIG_RDT_ENABLED flag is not set This patch does the following, 1. Removes RDT code if CONFIG_RDT_ENABLED flag is not set. 2. Set the CONFIG_RDT_ENABLED flag only on platforms that support RDT so that build scripts will automatically reflect the config. Tracked-On: #3715 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	d0665fe220	HV: Generalize RDT infrastructure and fix RDT cache configuration. This patch creates a generic infrastructure for RDT resources instead of just L2 or L3 cache. This patch also fixes L3 CAT config overwrite by L2 in cases where both L2 and L3 CAT are supported. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	887e3813bc	HV: Add both HW and SW checks for RDT support There can be times when user unknowingly enables CONFIG_CAT_ENABLED SW flag, but the hardware might not support L3 or L2 CAT. In such case software can end up writing to the CAT MSRs which can cause undefined results. The patch fixes the issue by enabling CAT only when both HW as well as software via the CONFIG_CAT_ENABLED supports CAT. The patch also addresses typo with "clos2prq_msr" function name. It should be "clos2pqr_msr" instead. PQR stands for platform qos register. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	b8a021d658	HV: split L2 and L3 cache resource MSR Upcoming intel platforms can support both L2 and L3 but our current code only supports either L2 or L3 CAT. So split the MSRs so that we can support allocation for both L2 and L3. This patch does the following, 1. splits programming of L2 and L3 cache resource based on the resource ID. 2. Replace generic platform_clos_array struct with resource specific struct in all the existing board.c files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Vijay Dhanraj	2597429903	HV: Rename cat.c/.h files to rdt.c/.h As part of rdt cat refactoring, goal is to combine all rdt specific features such as CAT under one module. So renaming rdt resource specific files such as cat.c/.h to generic rdt.c/.h files. Tracked-On: #3715 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-27 10:44:07 +08:00
Yonghua Huang	b2c6cf7753	hv: refine retpoline speculation barriers Per Section 4.4 Speculation Barriers, in "Retpoline: A Branch Target Inject Mitigation" white paper, "LFENCE instruction limits the speculative execution that a processor implementation can perform around the LFENCE, possibly impacting processor performance,but also creating a tool with which to mitigate speculative-execution side-channel attacks." Tracked-On: #4424 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2020-02-26 09:24:54 +08:00
Victor Sun	da3d181f62	HV: init efi info with multiboot2 Initialize efi info of acrn mbi when boot from multiboot2 protocol, with this patch hypervisor could get host efi info and pass it to Linux zeropage, then make guest Linux possible to boot with efi environment; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	69da0243f5	HV: init module and rsdp info with multiboot2 Initialize module info and ACPI rsdp info of acrn mbi when boot from multiboot2 protocol, with this patch SOS VM could be loaded successfully with correct ACPI RSDP; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	b669a71931	HV: init mmap info with multiboot2 Initialize mmap info of acrn mbi when boot from multiboot2 protocol, with this patch acrn hv could boot from multiboot2; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	d008b72fdd	HV: add multiboot2 header info Add multiboot2 header info in HV image so that bootloader could recognize it. Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	19ffaa50dc	HV: init and sanitize acrn multiboot info Initialize and sanitize an acrn specific multiboot info struct with current supported multiboot1 in very early boot stage, which would bring below benefits: - don't need to do hpa2hva conversion every time when referring boot_regs; - panic early if failed to sanitize multiboot info, so that don't need to check multiboot info pointer/flags and panic in later boot process; - keep most code unchanged when introduce multiboot2 support in future; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	520a0222d3	HV: re-arch boot component header The patch re-arch boot component header files by: - moving multiboot.h from include/arch/x86/ to boot/include/ and keep this header for multiboot1 protocol data struct only; - moving multiboot related MACROs in cpu_primary.S to multiboot.h; - creating an independent boot.h to store acrn specific boot information for other files' reference; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Victor Sun	708cae7c88	HV: remove DBG_LEVEL_PARSE - It is meaningless to enable debug function in parse_hv_cmdline() because the function run in very early stage and uart has not been initialized at that time, so remove this debug level definition; - Rewrite parse_hv_cmdline() function to make it compliant with MISRA-C; - Decouple uart16550 stuff from Init.c module and let console.c handle it; Tracked-On: #4419 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-26 09:24:16 +08:00
Yin Fengwei	a46a7b3524	Makefile: Fix build issue if the ld is updated to 2.34 We hit build issue if the ld version is 2.34: error: PHDR segment not covered by LOAD segment One issue was created to binutils bugzilla system: https://sourceware.org/bugzilla/show_bug.cgi?id=25585 From the ld guys comment, this is not an issue of 2.34. It's an issue fixing of the old ld. He suggested to add option --no-dynamic-linker to ld if we don't depend on dynamic linker to load our binary. Tracked-On: #4415 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-02-25 09:14:32 +08:00
Conghui Chen	ad606102d2	hv: sched_bvt: add tick handler Count down number will be decreased at each tick, when it comes to zero, it will trigger reschedule. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Conghui Chen	77c64ecb79	hv: sched_bvt: add pick_next function pick_next function will update the virtual time parameters, and return the vcpu thread with earliest evt. Calculate the count down number for the picked vcpu thread, it means how many mcu a thread can run before the next reschedule occur. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
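The count-down behaviour described for the tick handler in commit ad606102d2 can be sketched as below. This is only an illustration under stated assumptions: `struct tick_state` and `on_tick` are hypothetical names, and the real handler's bookkeeping is not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state: ticks left before the next reschedule. */
struct tick_state {
	uint32_t countdown;
};

/* Called once per tick; returns true when a reschedule should be requested. */
static bool on_tick(struct tick_state *s)
{
	if (s->countdown > 0U) {
		s->countdown--;
	}
	return s->countdown == 0U;
}
```

With a count-down of 2, the first tick returns false and the second returns true, at which point the caller would request a reschedule.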
Conghui Chen	a38f2cc918	hv: sched_bvt: add wakeup and sleep handler In the wakeup handler, the vcpu_thread object will be inserted into the runqueue, and in the sleep handler, it will be removed from the queue. vcpu_thread object is ordered by EVT (effective virtual time). Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Conghui Chen	e05eb42c1e	hv: sched_bvt: add init and deinit function Add init function for bvt scheduler, creating a runqueue and a period timer, the timer interval is default as 1ms. The interval is the minimum charging unit. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
Conghui Chen	a7563cb9bd	hv: sched_bvt: add BVT scheduler BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU. It has the concept of virtual time, vCPU with earliset virtual time is dispatched first. Main concepts: tick timer: a period tick is used to measure the physcial time in units of MCU (minimum charing unit). runqueue: thread in the runqueue is ordered by virtual time. weight: each thread receives a share of the pCPU in proportion to its weight. context switch allowance: the physcial time by which the current thread is allowed to advance beyond the next runnable thread. warp: a thread with warp enabled will have a change to minus a value (Wi) from virtual time to achieve higher priority. virtual time: AVT: actual virtual time, advance in proportional to weight. EVT: effective virtual time. EVT <- AVT - ( warp ? Wi : 0 ) SVT: scheduler virtual time, the minimum AVT in the runqueue. Tracked-On: #4410 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-25 09:11:32 +08:00
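The virtual-time rule quoted in commit a7563cb9bd, EVT <- AVT - ( warp ? Wi : 0 ), can be sketched as follows. The struct `bvt_thread` and the functions `bvt_evt` and `bvt_pick_next` are hypothetical names for illustration; the sketch scans a plain array instead of an ordered runqueue and assumes at least one thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread virtual time data. */
struct bvt_thread {
	int64_t avt;        /* actual virtual time */
	int64_t warp_value; /* Wi, subtracted while warp is enabled */
	bool warp_on;
};

/* EVT <- AVT - (warp ? Wi : 0) */
static int64_t bvt_evt(const struct bvt_thread *t)
{
	return t->avt - (t->warp_on ? t->warp_value : 0);
}

/* Return the index of the thread with the earliest EVT; n must be at least 1. */
static size_t bvt_pick_next(const struct bvt_thread *threads, size_t n)
{
	size_t best = 0U;

	for (size_t i = 1U; i < n; i++) {
		if (bvt_evt(&threads[i]) < bvt_evt(&threads[best])) {
			best = i;
		}
	}
	return best;
}
```

For threads with AVT 100, 120 (warp enabled, Wi = 50) and 90, the effective virtual times are 100, 70 and 90, so the warped thread at index 1 is picked even though its AVT is the largest.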
Yonghua Huang	64b874ce4c	hv: rename BOOT_CPU_ID to BSP_CPU_ID 1. Rename BOOT_CPU_ID to BSP_CPU_ID 2. Replace hardcoded value with BSP_CPU_ID when ID of BSP is referenced. Tracked-On: #4420 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-02-25 09:08:14 +08:00
Li Fei1	4adad73cfc	hv: mmio: refine mmio access handle lock granularity Now only PCI MSI-X BAR access needs dynamic register/unregister. Others don't need to be unregistered once registered. So we don't need to take the VM-level emul_mmio_lock when we handle the MMIO access. Instead, we could use a finer-granularity lock in the handler to protect the shared resource. This patch fixes the deadlock issue when OVMF tries to size a BAR: because OVMF uses ECAM to access the PCI configuration space, it first holds the vm emul_mmio_lock, then calls vpci_handle_mmconfig_access. When this tries to size a BAR which is also an MSI-X Table BAR, it calls register_mmio_emulation_handler to register the MSI-X Table BAR MMIO access handler. This causes a deadlock on emul_mmio_lock. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	fbe57d9f0b	hv: vpci: restrict SOS access assigned PCI device SOS should not access the physical PCI device which is assigned to other guest. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	e8479f84cd	hv: vPCI: remove passthrough PCI device unuse code Now we split passthrough PCI device from DM to HV, we could remove all the passthrough PCI device unused code. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	dafa3da693	vPCI: split passthrough PCI device from DM to HV In this case, we could handle all the passthrough PCI devices in ACRN hypervisor. But we still need DM to initialize BAR resources and INTx for passthrough PCI devices for post-launched VMs, since this information should be filled into ACPI tables. So 1. we add a HC vm_assign_pcidev to pass the extra information, replacing the old vm_assign_ptdev. 2. we also remove HC vm_set_ptdev_msix_info since it can now be set by the post-launched VM, same as SOS. 3. remove vm_map_ptdev_mmio call for PTDev in DM since ACRN hypervisor will handle these BAR accesses. 4. the most important thing is to trap PCI configuration space access for PTDev in HV for post-launched VM and bypass the virtual PCI device configuration space access to DM. This patch doesn't do the cleanup work; that will be done in the next patch. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Li Fei1	fe3182ea05	hv: vPCI: add assign/deassign PCI device HC APIs Add assign/deassign PCI device hypercall APIs to assign a PCI device from SOS to post-launched VM or deassign a PCI device from post-launched VM to SOS. This patch prepares for splitting passthrough PCI device from DM to HV. The old assign/deassign ptdev APIs will be discarded. Tracked-On: #4371 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-02-24 16:17:38 +08:00
Yin Fengwei	2ca01206f3	Makefile: fix build issue on old gcc The previous fcf-protection fix broke the old gcc (older than gcc 8 which is common on Ubuntu 18.04 and older distributions). We only add fcf-protection=none for gcc8 and newer. Tracked-On: #4358 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2020-02-24 12:22:21 +08:00
Wei Liu	f3a4b2325f	hv: add P2SB device to whitelist for apl-mrb apl-mrb need to access P2SB device, so add 00:0d.0 P2SB device to whitelist for platform pci hidden device. Tracked-On: #3475 Signed-off-by: Wei Liu <weix.w.liu@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2020-02-24 12:21:29 +08:00
Junming Liu	1303861d26	hv:enable gpu iommu except APL platforms To enable gvt-d, the GPU IOMMU needs to be allowed. Since gvt-d hasn't been enabled on APL yet, let APL disable GPU IOMMU. v2 -> v3: * let APL platforms disable GPU IOMMU. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com>	2020-02-24 11:47:10 +08:00
Junming Liu	1f1eb7fdba	hv:disable iommu snoop control to enable gvt-d by an option If one of the enabled VT-d DMAR units doesn’t support snoop control, then bit 11 of leaf PTE of EPT is not set, since the field is treated as reserved(0) by VT-d hardware implementations not supporting snoop control. GPU IOMMU doesn’t support snoop control, so this patch adds an option to disable iommu snoop control for gvt-d. v2 -> v3: * refine the MACRO name and description. Tracked-On: #4405 Signed-off-by: Junming Liu <junming.liu@intel.com> Reviewed-by: Wu Binbin <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-24 11:47:10 +08:00
Shuo A Liu	53de3a727c	hv: reset vcpu events in reset_vcpu On UEFI UP2 board, APs might execute HLT before SOS kernel INIT them. After the SOS kernel takes over, it will re-init the APs directly. The flow from the HV perspective is like: HLT trap: wait_event(VCPU_EVENT_VIRTUAL_INTERRUPT) -> sleep_thread SOS kernel INIT, SIPI APs: pause_vcpu(ZOMBIE) -> sleep_thread -> reset_vcpu -> launch_vcpu -> wake_vcpu However, the last wake_vcpu will fail because the vcpu event VCPU_EVENT_VIRTUAL_INTERRUPT has not been signaled. This patch resets all vcpu events in reset_vcpu. If the thread was previously waiting for an event, its waiting status will be cleared and launch_vcpu will wake it to running. Tracked-On: #4402 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-23 16:27:57 +08:00
Zide Chen	cc6f094926	hv: CAT is supposed to be enabled in the system level In platforms that support CAT, when it is enabled by ACRN, i.e. IA32_resourceType_MASK_n registers are programmed with customized values, it has impacts to the whole system. The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be enabled in some guests, but not in others who don't have this flag, which is conceptually incorrect. This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[] defines a set of system-wide Class of Service (COS, or CLOS), and the per guest vm_configs[].clos associates the guest with particular CLOS. Tracked-On: #2462 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-02-17 08:51:59 +08:00
Yin Fengwei	8dcede7693	Makefile: disable fcf-protection for some build env In some build env (Ubuntu 19.10 as example), gcc enabled the option -fcf-protection by default. But this option is not compatible with -mindirect-branch. Which could trigger following build error: fail to build with gcc-9 [error: ‘-mindirect-branch’ and ‘-fcf-protection’ are not compatible] -mindirect-branch is mandatory for retpoline mitigation and always enabled for ACRN build. We disable -fcf-protection here for ACRN build. Tracked-On: #4358 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Wu Binbin <binbin.wu@intel.com>	2020-02-17 08:49:38 +08:00
Alexander Merritt	8ddbfc268c	acrn: add pxelinux as known bootloader Tracked-On: #4389 Signed-off-by: Alexander Merritt <alex.merritt@intel.com>	2020-02-17 08:49:02 +08:00
Zide Chen	f3249e77bd	hv: enable early pr_xxx() logs Currently panic() and pr_xxx() statements before init_primary_pcpu_post() won't be printed, which is inconvenient and misleading for debugging. This patch makes pr_xxx() APIs working before init_pcpu_pre(): - clear .bss in init.c, which makes sense to clear .bss at the very beginning of initialization code. Also this makes it possible to call init_logmsg() before init_pcpu_pre(). - move parse_hv_cmdline() and uart16550_init(true) to init.c. - refine ticks_to_us() to handle the case that it's called before calibrate_tsc(). As a side effect, it prints "0us" in early pr_xxx() calls. - call init_debug_pre() in init_primary_pcpu() and after this point, both printf() and pr_xxx() APIs are available. However, this patch doesn't address the issue that pr_xxx() could be called on PCPUs that set_current_pcpu_id() hasn't been called, which implies that the PCPU ID shown in early logs may not be accurate. Tracked-On: #2987 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-02-11 08:53:56 +08:00
Alexander Merritt	920f02706a	acrn: rename param in uart16550_init Tracked-On: #4390 Signed-off-by: Alexander Merritt <alex.merritt@intel.com>	2020-02-10 11:49:34 +08:00
Minggui Cao	10c407cc85	HV: init local variable before it is used. it is better to init bdfs_from_drhds.pci_bdf_map_count before it is passed to other function to do: bdfs_from_drhds->pci_bdf_map_count++ Tracked-On: #3875 Signed-off-by: Minggui Cao <minggui.cao@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-01-17 09:21:09 +08:00
Zide Chen	086e0f19d8	hv: fix pcpu_id mask issue in smp_call_function() INVALID_BIT_INDEX has 16 bits only, which removes all pcpu_id that is >= 16 from the destination mask. Tracked-On: #4354 Signed-off-by: Zide Chen <zide.chen@intel.com>	2020-01-17 09:20:53 +08:00
Yonghua Huang	fd4775d044	hv: rename VECTOR_XXX and XXX_IRQ Macros 1. Align the coding style for these MACROs 2. Align the values of fixed VECTORs Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Yonghua Huang	b90862921e	hv: rename the ACRN_DBG_XXX Refine this MACRO 'ACRN_DBG_XXX' to 'DBG_LEVEL_XXX' Tracked-On: #4348 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2020-01-14 10:21:23 +08:00
Shuo A Liu	b59e5a870a	hv: Disable HLT and PAUSE-loop exiting emulation in lapic passthrough In lapic passthrough mode, it should passthrough HLT/PAUSE execution too. This patch disable their emulation when switch to lapic passthrough mode. Tracked-On: #4329 Tested-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Shuo A Liu	3edde2608c	hv: debug: show vcpu thread status in vcpu_list debug command Due to vcpu and its thread are two different perspective modules, each of them has its own status. Dump both states for better understanding of system status. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Shuo A Liu	db708fc3e8	hv: rename is_completion_polling to is_polling_ioreq is_polling_ioreq is more straightforward. Rename it. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-13 10:16:30 +08:00
Yonghua Huang	82b89fd04c	hv: check the validity of 'pdev' in 'set_ptdev_intr_info' This patch checks the validity of 'vdev->pdev' to ensure physical device is linked to 'vdev'. this check is to avoid some potential hypervisor crash when destroying VM with crafted input. Tracked-On: #4336 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-01-09 16:04:47 +08:00
Yonghua Huang	0e47f0a8f9	hv: fix potential NULL pointer reference in hc_assgin_ptdev This patch validates input 'vdev->pdev' before reference to avoid a potential hypervisor crash. [v2] update: Combine condition check for 'vdev' and 'vdev->pdev' Tracked-On: #4334 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-01-08 11:54:49 +08:00
Yonghua Huang	ddebefb9b4	hv: remove deprecated code for hc_assign/deassign_ptdev 'param' is BDF value instead of GPA when VHM driver issues below 2 hypercalls: - HC_ASSIGN_PTEDEV - HC_DEASSIGN_PTDEV This patch is to remove related code in hc_assign/deassign() functions. Tracked-On: #4334 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2020-01-08 11:54:49 +08:00
Li Fei1	65ed6c3529	hv: vpci: trap PCIe ECAM access for SOS SOS will use PCIe ECAM access PCIe external configuration space. HV should trap this access for security(Now pre-launched VM doesn't want to support PCI ECAM; post-launched VM trap PCIe ECAM access in DM). Besides, update PCIe MMCONFIG region to be owned by hypervisor and expose and pass through platform hide PCI devices by BIOS to SOS. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	1e50ec8899	hv: pci: use ECAM to access PCIe Configuration Space Use Enhanced Configuration Access Mechanism (MMIO) instead of PCI-compatible Configuration Mechanism (IO port) to access PCIe Configuration Space PCI-compatible Configuration Mechanism (IO port) access is used for UART in debug version. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Li Fei1	65f3751ea3	hv: pci: add hide pci devices configuration for apl-up2 Other Platforms are not added for now. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2020-01-07 16:05:30 +08:00
Shuo A Liu	3239cb0e1c	hv: Use HLT as the default idle action of service OS This patch overwrites the idle driver of service OS for industry, sdc, sdc2 scenarios. HLT will be used as the default idle action. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	4303ccb1a0	hv: HLT emulation in hypervisor HLT emulation is important for maximizing CPU resource utilization. A vcpu doing HLT means it is idle and can give up the CPU proactively. Thus, we pause the vcpu thread in HLT emulation and resume it when an event happens. When a vcpu enters HLT, its vcpu thread will sleep, but the vcpu state is still 'Running'. VM ID PCPU ID VCPU ID VCPU ROLE VCPU STATE ===== ======= ======= ========= ========== 0 0 0 PRIMARY Running 0 1 1 SECONDARY Running Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	a8f6bdd479	hv: Add vlapic_has_pending_intr of apicv to check pending interrupts Sometimes HV wants to know if there are pending interrupts of one vcpu. Add .has_pending_intr interface in acrn_apicv_ops and return the pending interrupts status by check IRRs of apicv. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	e3c303363b	hv: vcpu: wait and signal vcpu event support Introduce two kinds of events for each vcpu, VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events vcpu can wait for such events, and resume to run when the event get signalled. This patch also change IO request waiting/notifying to this way. Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	1f23fe3fd8	hv: sched: simple event implementation This simple event implementation can only support exclusive waiting at the same time. It is mainly used by a thread that wants to wait for a special event to happen. Thread A who wants to wait for some events calls wait_event(struct sched_event ); Thread B who can give the event signal calls signal_event(struct sched_event ); Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Shuo A Liu	4115dd6241	hv: PAUSE-loop exiting support in hypervisor As we enabled cpu sharing, PAUSE-loop exiting can help vcpu to release its pcpu proactively. It's good for performance. VMX_PLE_GAP: upper bound on the amount of time between two successive executions of PAUSE in a loop. VMX_PLE_WINDOW: upper bound on the amount of time a guest is allowed to execute in a PAUSE loop Tracked-On: #4329 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-07 11:23:32 +08:00
Victor Sun	bfecf30f32	HV: do not offline pcpu when lapic pt disabled In current code, wait_pcpus_offline() and make_pcpu_offline() are called by both shutdown_vm() and reset_vm(), but this is not needed when lapic_pt is not enabled for the vcpus of the VM. The patch merged offline pcpus part code into a common offline_lapic_pt_enabled_pcpus() api for shutdown_vm() and reset_vm() use and called only when lapic_pt is enabled. Tracked-On: #4325 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-06 15:35:08 +08:00
Binbin Wu	41a998fca3	hv: cr: handle control registers related to PCID 1. This patch passes-through CR4.PCIDE to guest VM. 2. This patch handles the invalidation of TLB and the paging-structure caches. According to SDM Vol.3 4.10.4.1, the following instructions invalidate entries in the TLBs and the paging-structure caches: - INVLPG: this instruction is passed-through to guest, no extra handling needed. - INVPCID: this instruction is passed-through to guest, no extra handling needed. - CR0.PG from 1 to 0: already handled by current code, change of CR0.PG will do EPT flush. - MOV to CR3: hypervisor doesn't trap this instruction, no extra handling needed. - CR4.PGE changed: already handled by current code, change of CR4.PGE will do EPT flush. - CR4.PCIDE from 1 to 0: this patch handles this case, will do EPT flush. - CR4.PAE changed: already handled by current code, change of CR4.PAE will do EPT flush. - CR4.SMEP from 1 to 0, already handled by current code, change of CR4.SMEP will do EPT flush. - Task switch: Task switch is not supported in VMX non-root mode. - VMX transitions: already handled by current code with the support of VPID. 3. This patch checks the validity of CR0, CR4 related to PCID feature. According to SDM Vol.3 4.10.1, CR4.PCIDE can be 1 only in IA-32e mode. - MOV to CR4 causes a general-protection exception (#GP) if it would change CR4.PCIDE from 0 to 1 and either IA32_EFER.LMA = 0 or CR3[11:0] ≠ 000H - MOV to CR0 causes a general-protection exception if it would clear CR0.PG to 0 while CR4.PCIDE = 1 Tracked-On: #4296 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-02 10:47:34 +08:00
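The two #GP conditions and the CR4.PCIDE 1-to-0 flush case listed in the commit message above can be expressed as small predicates. The sketch below is only an illustration of those rules: the function and macro names are hypothetical and are not taken from the ACRN sources. The bit positions (CR0.PG bit 31, CR4.PCIDE bit 17, IA32_EFER.LMA bit 10) follow the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Macro names are illustrative; bit positions follow the Intel SDM. */
#define CR0_PG        (1ULL << 31) /* CR0.PG: paging enable */
#define CR4_PCIDE     (1ULL << 17) /* CR4.PCIDE: PCID enable */
#define EFER_LMA      (1ULL << 10) /* IA32_EFER.LMA: IA-32e mode active */
#define CR3_PCID_MASK 0xFFFULL     /* CR3[11:0] */

/*
 * MOV to CR4 raises #GP if it would change CR4.PCIDE from 0 to 1 while
 * IA32_EFER.LMA = 0 or CR3[11:0] != 000H.
 */
static bool cr4_write_violates_pcid(uint64_t old_cr4, uint64_t new_cr4,
				    uint64_t efer, uint64_t cr3)
{
	bool enabling = ((old_cr4 & CR4_PCIDE) == 0ULL) &&
			((new_cr4 & CR4_PCIDE) != 0ULL);

	return enabling && (((efer & EFER_LMA) == 0ULL) ||
			    ((cr3 & CR3_PCID_MASK) != 0ULL));
}

/* MOV to CR0 raises #GP if it would clear CR0.PG while CR4.PCIDE = 1. */
static bool cr0_write_violates_pcid(uint64_t old_cr0, uint64_t new_cr0,
				    uint64_t cr4)
{
	bool clearing_pg = ((old_cr0 & CR0_PG) != 0ULL) &&
			   ((new_cr0 & CR0_PG) == 0ULL);

	return clearing_pg && ((cr4 & CR4_PCIDE) != 0ULL);
}

/* A CR4.PCIDE change from 1 to 0 is the case that needs a flush. */
static bool cr4_pcide_cleared(uint64_t old_cr4, uint64_t new_cr4)
{
	return ((old_cr4 & CR4_PCIDE) != 0ULL) &&
	       ((new_cr4 & CR4_PCIDE) == 0ULL);
}

/* Usage example: a few representative register values. */
static void pcid_rules_demo(void)
{
	/* Enabling PCIDE outside IA-32e mode must fault. */
	assert(cr4_write_violates_pcid(0ULL, CR4_PCIDE, 0ULL, 0ULL));
	/* Enabling PCIDE in IA-32e mode with CR3[11:0] = 0 is allowed. */
	assert(!cr4_write_violates_pcid(0ULL, CR4_PCIDE, EFER_LMA, 0x1000ULL));
	/* Clearing CR0.PG while PCIDE is set must fault. */
	assert(cr0_write_violates_pcid(CR0_PG, 0ULL, CR4_PCIDE));
}
```

A real handler would inject #GP into the guest when a predicate returns true; how that is done in ACRN is not shown here.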
Binbin Wu	4ae350a091	hv: vmcs: pass-through instruction INVPCID to VM According to SDM Vol.3 Section 25.3, behavior of the INVPCID instruction is determined first by the setting of the “enable INVPCID” VM-execution control: - If the “enable INVPCID” VM-execution control is 0, INVPCID causes an invalid-opcode exception (#UD). - If the “enable INVPCID” VM-execution control is 1, treatment is based on the setting of the “INVLPG exiting” VM-execution control: * If the “INVLPG exiting” VM-execution control is 0, INVPCID operates normally. * If the “INVLPG exiting” VM-execution control is 1, INVPCID causes a VM exit. In current implementation, hypervisor doesn't set “INVLPG exiting” VM-execution control, this patch sets “enable INVPCID” VM-execution control to 1 when the instruction is supported by physical cpu. If INVPCID is supported by physical cpu, INVPCID will not cause VM exit in VM. If INVPCID is not supported by physical cpu, INVPCID causes an #UD in VM. When INVPCID is passed-through to VM, According to SDM Vol.3 28.3.3.1, INVPCID instruction invalidates linear mappings and combined mappings. They are required to do so only for the current VPID. HV assigned a unique vpid for each vCPU, if guest uses wrong PCID, it would not affect other vCPUs. Tracked-On: #4296 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-02 10:47:34 +08:00
Binbin Wu	d330879ce5	hv: cpuid: expose PCID related capabilities to VMs Pass-through PCID related capabilities to VMs: - The support of PCID (CPUID.01H.ECX[17]) - The support of instruction INVPCID (CPUID.07H.EBX[10]) Tracked-On: #4296 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-02 10:47:34 +08:00
Binbin Wu	96331462b7	hv: vmcs: remove redundant check on vpid ACRN relies on the capability of VPID to avoid EPT flushes during VMX transitions. This capability is checked as a must have hardware capability, otherwise, ACRN will refuse to boot. Also, the current code has already made sure each vpid for a virtual cpu is valid. So, no need to check the validity of vpid for vcpu and enable VPID for vCPU by default. Tracked-On: #4296 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2020-01-02 10:47:34 +08:00
Li Fei1	21b405d109	hv: vpci: an assign PT device should support FLR or PM reset Before we assign a PT device to post-launched VM, we should reset the PCI device first. However, ACRN hypervisor doesn't plan to support PCIe hot-plug and doesn't support PCIe bridge Secondary Bus Reset. So the PT device must support FLR or PM reset. This patch do this check when assigning a PT device to post-launched VM. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	e74a9f397d	hv: pci: add PCIe PM reset check Add PCIe PM reset capability check. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	26670d7ab3	hv: vpci: revert do FLR and BAR restore Since we restore BAR values when writing Command Register if necessary. We don't need to trap FLR and do the BAR restore then. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Li Fei1	6c549d48a8	hv: vpci: restore physical BARs when writing Command Register if necessary When PCIe does a Conventional Reset or FLR, almost all PCIe configurations and states are lost. So we should save the configurations and states before doing the reset and restore them after the reset. This is currently done well by BIOS or Guest. However, ACRN will trap these accesses and handle them properly for security. Almost all of these configurations and states are eventually written to the physical configuration space, except for BAR values for now. So we should do the restore for BAR values. One way is to do the restore after a reset of either type is detected. This would be too complex. Another way is to do the restore when BIOS or guest tries to write the Command Register. This could work because: 1. The I/O Space Enable bit and Memory Space Enable bits in Command Register will reset to zero. 2. Before BIOS or guest enables these bits, the BAR can't be accessed. 3. So we could restore the BAR values before enabling these bits if reset is detected. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-30 13:43:07 +08:00
Zide Chen	742abaf2e6	hv: add sanity check for vuart configuration - target vm_id of vuart can't be un-defined VM, nor the VM itself. - fix potential NULL pointer dereference in find_active_target_vuart() Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-30 09:24:59 +08:00
Victor Sun	c6f7803f06	HV: restore lapic state and apic id upon INIT Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving INIT. The local APIC ID register should also be preserved. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	ab13228591	HV: ensure valid vcpu state transition The vcpu state machine transition should follow below rule: old vcpu state new vcpu state ============== ============== VCPU_OFFLINE --- create_vcpu --> VCPU_INIT VCPU_INIT --- launch_vcpu --> VCPU_RUNNING VCPU_RUNNING --- pause_vcpu --> VCPU_PAUSED VCPU_PAUSED --- resume_vcpu --> VCPU_RUNNING VCPU_RUNNING/PAUSED --- pause_vcpu --> VCPU_ZOMBIE VCPU_INIT --- pause_vcpu --> VCPU_ZOMBIE VCPU_ZOMBIE --- reset_vcpu --> VCPU_INIT VCPU_ZOMBIE --- offline_vcpu--> VCPU_OFFLINE Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
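The transition rule in the commit message above can be captured as a lookup table. The sketch below is a minimal illustration, assuming the state names from the message; the `VCPU_STATE_COUNT` sentinel, the table, and the function name are hypothetical and do not reflect how ACRN enforces the rule. It only checks state pairs, not which API (create_vcpu, pause_vcpu, and so on) requested the change.

```c
#include <assert.h>
#include <stdbool.h>

enum vcpu_state {
	VCPU_OFFLINE = 0,
	VCPU_INIT,
	VCPU_RUNNING,
	VCPU_PAUSED,
	VCPU_ZOMBIE,
	VCPU_STATE_COUNT /* hypothetical sentinel, not an ACRN state */
};

/* vcpu_transition_ok[from][to] is true for the transitions listed above. */
static const bool vcpu_transition_ok[VCPU_STATE_COUNT][VCPU_STATE_COUNT] = {
	[VCPU_OFFLINE] = { [VCPU_INIT] = true },                          /* create_vcpu */
	[VCPU_INIT]    = { [VCPU_RUNNING] = true, [VCPU_ZOMBIE] = true }, /* launch_vcpu, pause_vcpu */
	[VCPU_RUNNING] = { [VCPU_PAUSED] = true, [VCPU_ZOMBIE] = true },  /* pause_vcpu */
	[VCPU_PAUSED]  = { [VCPU_RUNNING] = true, [VCPU_ZOMBIE] = true }, /* resume_vcpu, pause_vcpu */
	[VCPU_ZOMBIE]  = { [VCPU_INIT] = true, [VCPU_OFFLINE] = true },   /* reset_vcpu, offline_vcpu */
};

/* Return true if moving from 'from' to 'to' follows the rule above. */
static bool vcpu_transition_allowed(enum vcpu_state from, enum vcpu_state to)
{
	assert((from < VCPU_STATE_COUNT) && (to < VCPU_STATE_COUNT));
	return vcpu_transition_ok[from][to];
}
```

For example, `vcpu_transition_allowed(VCPU_ZOMBIE, VCPU_INIT)` holds (the reset_vcpu path), while `vcpu_transition_allowed(VCPU_OFFLINE, VCPU_RUNNING)` does not, because an offline vcpu must pass through VCPU_INIT first.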
Victor Sun	a5158e2c16	HV: refine reset_vcpu api The patch abstract a vcpu_reset_internal() api for internal usage, the function would not touch any vcpu state transition and just do vcpu reset processing. It will be called by create_vcpu() and reset_vcpu(). The reset_vcpu() will act as a public api and should be called only when vcpu receive INIT or vm reset/resume from S3. It should not be called when do shutdown_vm() or hcall_sos_offline_cpu(), so the patch remove reset_vcpu() in shutdown_vm() and hcall_sos_offline_cpu(). The patch also introduced reset_mode enum so that vcpu and vlapic could do different context operation according to different reset mode; Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	d1a46b8289	HV: rename function of vlapic_xxx_write_handler Rename vlapic_xxx_write_handler() to vlapic_write_xxx() to make code more readable; Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	9ecac8629a	HV: clean up redundant macro in lapic.h Some MACROs in lapic.h are duplicated with apicreg.h, and some MACROs are never referenced, remove them. Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	46ed0b1582	HV: correct apic lvt reset value Per SDM 10.4.7.1 vol3, the LVT register should be reset to 0s except for the mask bits, which are set to 1s. In the current code, lvt_last[] has already been set to the correct value (i.e. 0x10000) in vlapic_reset() before vlapic->lvt_last[i] is forcibly set to 0U; the additional loop that sets vlapic->lvt_last[i] to 0 causes zero to be read from the LVT registers after reset, which does not comply with the SDM. Tracked-On: #4266 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Li Fei1	58b3a05863	hv: vpci: rename pci_bar to pci_vbar Structure pci_vbar is used to define the virtual BAR rather than physical BAR. It's better to name as pci_vbar. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-26 08:54:23 +08:00
Li Fei1	d2089889d8	hv: pci: minor fix of coding style about pci_read_cap There's no need to check which capability we care at the very beginning. We could do it later step by step. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-26 08:54:23 +08:00
Victor Sun	57939730b7	HV: search rsdp from e820 acpi reclaim region Per ACPI 6.2 spec, chapter 5.2.5.2 "Finding the RSDP on UEFI Enabled Systems": In Unified Extensible Firmware Interface (UEFI) enabled systems, a pointer to the RSDP structure exists within the EFI System Table. The OS loader is provided a pointer to the EFI System Table at invocation. The OS loader must retrieve the pointer to the RSDP structure from the EFI System Table and convey the pointer to OSPM, using an OS dependent data structure, as part of the hand off of control from the OS loader to the OS. So when ACRN boots in direct mode on a UEFI enabled system, the hypervisor might fail to get the rsdp by searching for it in the legacy EBDA or 0xe0000~0xfffff region, but it still has a chance to get the rsdp by searching for it in the e820 ACPI reclaimable region with some edk2 based BIOS. The patch searches for the rsdp in the e820 ACPI reclaim region when it fails to get the rsdp from the legacy region. Tracked-On: #4301 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-25 13:50:11 +08:00
Zide Chen	fc78013fba	acrn-config: some cleanup for logical partition mode Linux bootargs - commit `69152647` ("hv: Use virtual APIC IDs for Pre-launched VMs") enables virtual APIC IDs for pre-launched VMs thus xapic_phys is no longer needed to force guest xAPIC to work in physical destination mode. - HVC is not available in logical partition mode and "console=hvc0" should be removed from guest Linux bootargs. Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Victor Sun <victor.sun@intel.com>	2019-12-25 13:46:37 +08:00
Yin Fengwei	e5117bf19a	vm: add severity for vm_config Add severity definitions for different scenarios. The static guest severity is defined according to guest configurations. Also add sanity check to make sure the severity for all guests are correct. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	f7df43e7cd	reset: detect highest severity guest dynamically For guest reset, a reset of the highest severity guest will reset the system. There is a vm flag to call out the highest severity guest in a specific scenario, which is a static guest severity assignment. There are cases where the static highest severity guest is shut down and the highest severity role should transfer to another guest. For example, in the ISD scenario, if the RTVM (static highest severity guest) is shut down, SOS should be the highest severity guest instead. is_highest_severity_vm() is updated to detect the highest severity guest dynamically and to promote a reset of the highest severity guest to a system reset. Also remove the GUEST_FLAG_HIGHEST_SEVERITY definition. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
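The idea of detecting the highest severity guest dynamically can be sketched as follows. This is an assumption-laden illustration rather than the ACRN code: `struct vm_info`, its fields, and the function name `vm_is_highest_severity` are hypothetical, as are the severity values in the usage note. The only behavior taken from the commit message is that a shut-down guest no longer counts when deciding which guest has the highest severity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM record; not the ACRN vm/vm_config layout. */
struct vm_info {
	uint8_t severity; /* larger value means higher severity */
	bool shut_down;   /* true once the guest has been shut down */
};

/*
 * A VM is the highest severity guest if no other guest that is still
 * active has a strictly higher severity.
 */
static bool vm_is_highest_severity(const struct vm_info *vms, size_t n, size_t idx)
{
	assert((vms != NULL) && (idx < n));
	for (size_t i = 0U; i < n; i++) {
		if ((i != idx) && !vms[i].shut_down &&
		    (vms[i].severity > vms[idx].severity)) {
			return false;
		}
	}
	return true;
}
```

With an RTVM at severity 0x40, SOS at 0x30, and a standard guest at 0x10 (made-up values), SOS is not the highest severity guest while the RTVM is active, but becomes the highest once the RTVM is marked as shut down, matching the ISD example in the message.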
Yin Fengwei	bfa19e9104	pm: S5: update the system shutdown logic in ACRN For system S5, ACRN assumed that SOS shutdown would trigger system shutdown. So the system shutdown logic was: 1. Trap SOS shutdown 2. Wait for all other guests to shut down 3. Shutdown system The new logic is refined as: If all guests are shut down, shut down the whole system Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Li Fei1	1fddf943d8	hv: vpci: restore PCI BARs when doing AF FLR ACRN hypervisor should trap guest doing PCI AF FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Conventional PCI Advanced Features Control Register write operation if the device supports Conventional PCI Advanced Features Capability and check whether it wants to do device AF FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Li Fei1	a90e0f6c84	hv: vpci: restore PCI BARs when doing PCIe FLR ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Device Capabilities Register write operation if the device supports PCI Express Capability and check whether it wants to do device FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Gary	5b5f1735ff	acrnboot: fix the parsing hv_cmdline to correctly handle the case of containing trailing whitespaces The pointer variable 'start' should be checked against NULL right after it is found not to point to a space character; otherwise, if the cmdline contains trailing whitespace, the pointer variable 'end' holds a wrong address right after NULL and a wrong address outside the cmdline string is dereferenced. The parsing code has also been optimized and simplified. Tracked-On: projectacrn#4250 Signed-off-by: Gary <gordon.king@intel.com>	2019-12-17 10:58:28 +08:00
Kaige Fu	5f9d1379bc	HV: Remove INIT signal notification related code We don't use INIT signal notification method now. This patch removes them. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	6d1f63aef0	HV: Use NMI to replace INIT signal for lapic-pt VMs S5 We have implemented a new notification method using NMI. So replace the INIT notification method with the NMI one. Then we can remove INIT notification related code later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	a13909cedc	HV: Use NMI-window exiting to address req missing issue There is a window where we may miss the current request in the notification period when the workflow is as follows: (1) CPUr handles its pending requests; (2) CPUx sets the request flag; (3) CPUx sends an NMI to CPUr; (4) CPUr handles the NMI; (5) CPUr enters the vCPU. So, this patch enables the NMI-window exiting to trigger the next vmexit once there is no "virtual-NMI blocking" after the vCPU enters VMX non-root mode. Then we can process the pending request on time. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	40ba7e8686	HV: Don't make NMI injection req when notifying vCPU The NMI for notification should not be injected into the guest. So, this patch drops the NMI injection request when we use NMI to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well and there is no well-designed way to check whether the NMI is for notification or for the guest now. So, we take all the NMIs as notification NMIs for hard rtvm temporarily. It means that the hard rtvm will never receive NMI with this patch applied. TODO: vNMI support is not ready yet. We will add it later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	72f7f69c47	HV: Use NMI to kick lapic-pt vCPU's thread ACRN hypervisor needs to kick vCPU off VMX non-root mode to do some operations in hypervisor, such as interrupt/exception injection, EPT flush etc. For non lapic-pt vCPUs, we can use IPI to do so. But, it doesn't work for lapic-pt vCPUs as the IPI will be injected to VMs directly without vmexit. Without the way to kick the vCPU off VMX non-root mode to handle pending request on time, there may be fatal errors triggered. 1). Certain operation may not be carried out on time which may further lead to fatal errors. Taking the EPT flush request as an example, once we don't flush the EPT on time and the guest accesses the out-of-date EPT, fatal error happens. 2). ACRN now will send an IPI with vector 0xF0 to target vCPU to kick the vCPU off VMX non-root mode if it wants to do some operations on target vCPU. However, this way doesn't work for lapic-pt vCPUs. The IPI will be delivered to the guest directly without vmexit and the guest will receive an unexpected interrupt. Consequently, if the guest can't handle this interrupt properly, fatal error may happen. The NMI can be used as the notification signal to kick the vCPU off VMX non-root mode for lapic-pt vCPUs. So, this patch uses NMI as notification signal to address the above issues for lapic-pt vCPUs. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Shiqing Gao	3cee259583	hv: msr: remove redundant check in write_pat_msr Reserved bits in a 8-bit PAT field has been checked in pat_mem_type_invalid. Remove this redundant check "(PAT_FIELD_RSV_BITS & field) != 0UL" in write_pat_msr. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-16 14:32:42 +08:00
Yonghua Huang	d4677a8917	hv:fix crash issue when handling HC_NOTIFY_REQUEST_FINISH Input 'vcpu_id' and the state of target vCPU should be validated properly: - 'vcpu_id' shall be less than 'vm->hw.created_vcpus' instead of 'MAX_VCPUS_PER_VM'. - The state of target vCPU should be "VCPU_PAUSED", and reject all other states. Tracked-On: #4245 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-16 09:44:12 +08:00
Victor Sun	5702619620	HV: kconfig: add range check for memory setting When user use make menuconfig to configure memory related kconfig items, we need add range check to avoid compile error or other potential issues: CONFIG_LOW_RAM_SIZE:(0 ~ 0x10000) the value should be less than 64KB; CONFIG_HV_RAM_SIZE: (0x1000000 ~ 0x10000000) the hypervisor RAM size should be supposed between 16MB to 256MB; CONFIG_PLATFORM_RAM_SIZE: (0x100000000 ~ 0x4000000000) the platform RAM size should be larger than 4GB and less than 256GB; CONFIG_SOS_RAM_SIZE: (0x100000000 ~ 0x4000000000) the SOS RAM size should be larger than 4GB and less than 256GB; CONFIG_UOS_RAM_SIZE: (0 ~ 0x2000000000) the UOS RAM size should be less than 128GB; Tracked-On: #4229 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-12-16 09:36:44 +08:00
Victor Sun	64bbd37fd7	HV: Kconfig: set default Kata num to 1 in SDC Set default CONFIG_KATA_VM_NUM to 1 in SDC scenario so that user could have a try on Kata container without rebuilding hypervisor. Please be aware that vcpu affinity of VM1 in CPU partition mode would be impacted by this patch. Tracked-On: #4232 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-16 09:36:44 +08:00
Yonghua Huang	05682b2bad	hv:bugfix in write protect page hypercall This patch fixes potential hypervisor crash when calling hcall_write_protect_page() with a crafted GPA in 'struct wp_data' instance, e.g. an invalid GPA that is not in the scope of the target VM's EPT address space. To check the validity for this GPA before updating the 'write protect' page. Tracked-On: #4240 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2019-12-13 10:42:31 +08:00
Kaige Fu	2777f23075	HV: Add helper function send_single_nmi This patch adds a helper function send_single_nmi. The first caller will soon come with the following patch. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	525d4d3cd0	HV: Install a NMI handler in acrn IDT This patch installs a NMI handler in acrn IDT to handle NMIs out of dispatch_exception. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	fb346a6c11	HV: refine excp/external_interrupt_save_frame and excp_rsvd There are lines of repeated codes in excp/external_interrupt_save_frame and excp_rsvd. So, this patch defines two .macro, save_frame and restore_frame, to reduce the repeated codes. No functional change. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Mingqiang Chi	7f96465407	hv:remove need_cleanup flag in create_vm remove this redundancy flag. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 16:34:13 +08:00
Victor Sun	67ec1b7708	HV: expose port 0x64 read for SOS VM The port 0x64 is the status register of i8042 keyboard controller. When i8042 is defined as ACPI PnP device in BIOS, enforce returning 0xff in read handler would cause infinite loop when booting SOS VM, so expose the physical port read in this case; Tracked-On: #4228 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:51:24 +08:00
Victor Sun	a44c1c900c	HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig In the current architecture, the maximum number of vCPUs per VM cannot exceed the number of pCPUs. Given that the MAX_PCPU_NUM macro is provided in board configurations, remove MAX_VCPUS_PER_VM from Kconfig and add a MAX_VCPUS_PER_VM macro that references MAX_PCPU_NUM directly. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Victor Sun	ea3476d22d	HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM rename the macro since MAX_PCPU_NUM could be parsed from board file and it is not a configurable item anymore. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Mingqiang Chi	b6bffd01ff	hv:remove 2 unused variables in vm_arch structure remove 'guest_init_pml4' and 'tmp_pg_array' in vm_arch since they are not used. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-12-12 10:13:11 +08:00
Shiqing Gao	e95b316dd0	hv: vtd: fix improper use of DMAR_GCMD_REG The initialization of "dmar_unit->gcmd" shall be done via reading from Global Status Register rather than Global Command Register. Rationale: According to Chapter 10.4.4 Global Command Register in VT-d spec, Global Command Register is a write-only register to control remapping hardware. Global Status Register is the corresponding read-only register to report remapping hardware status. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-12 09:11:04 +08:00
Vijay Dhanraj	c8a4ca6c78	HV: Extend non-contiguous HPA for hybrid scenario This patch extends non-contiguous HPA allocations for pre-launched VMs in hybrid scenario. Tracked-On: #4217 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 10:12:46 +08:00
Shuo A Liu	b32ae229fb	hv: sched: use hypervisor configuration to choose scheduler For now, we set NOOP scheduler as default. User can choose IORR scheduler as needed. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	6a144e6e3e	hv: sched: add yield support Add yield support for schedule, which can give up pcpu proactively. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	6554437cc0	hv: sched_iorr: add some interfaces implementation of sched_iorr Implement .sleep/.wake/.pick_next of sched_iorr. In .pick_next, we count current object's timeslice and pick the next available one. The policy is 1) get the first item in runqueue firstly 2) if object picked has no time_cycles, replenish it pick this one 3) At least take one idle sched object if we have no runnable object after step 1) and 2) In .wake, we start the tick if we have more than one active thread_object in runqueue. In .sleep, stop the tick timer if necessary. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	b39630a8e0	hv: sched_iorr: add tick handler and runqueue operations sched_control is per-pcpu, each sched_control has a tick timer running periodically. Every period called a tick. In tick handler, we do 1) compute left timeslice of current thread_object if it's not the idle 2) make a schedule request if current thread_object run out of timeslice For runqueue maintaining, we will keep objects which has timeslice in the front of runqueue and the ones get new replenished in tail. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	f44aa4e4c9	hv: sched_iorr: add init functions of sched_iorr We set timeslice to 10ms as default, and set tick interval to 1ms. When init sched_iorr scheduler, we init a periodic timer as the tick and init the runqueue to maintain objects in the sched_control. Destroy the timer in deinit. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Shuo A Liu	ed4008630d	hv: sched_iorr: Add IO sensitive Round-robin scheduler IO sensitive Round-robin scheduler aim to schedule threads with round-robin policy. Meanwhile, we also enhance it with some fairness configuration, such as thread will be scheduled out without properly timeslice. IO request on thread will be handled in high priority. This patch only add a skeleton for the sched_iorr scheduler. Tracked-On: #4178 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Gary	3c8d465a11	acrnboot: correct the calculation of the end boundary of _DYNAMIC region The calculation of the end boundary address is corrected by adding the size extracted from _DYNAMIC to the start address in type of uint8_t, while improving the code by calculating the end boundary address after scanning and reducing type casts accordingly. Tracked-On: projectacrn#4191 Signed-off-by: Gary <gordon.king@intel.com>	2019-12-11 09:31:24 +08:00
Li Fei1	c2c05a29da	hv: vlapic: kick targeted vCPU off if interrupt trigger mode has changed In APICv advanced mode, a targeted vCPU, running in non-root mode, may get outdated TMR and EOI exit bitmap if another vCPU sends an interrupt to it and the trigger mode of this interrupt has changed. This patch tries to kick the vCPU off to let it get the latest TMR and EOI exit bitmap when it enters non-root mode again if the trigger mode of the newly arriving interrupt has changed. Then fill the interrupt to PIR. Tracked-On: #4200 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-10 09:07:54 +08:00
Vijay Dhanraj	ed65ae61c6	HV: Kconfig changes to support server platform. This patch updates kconfig to support server platforms for increased number of VCPUs per VM and PT IRQ number. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4196	2019-12-09 11:29:34 +08:00
Vijay Dhanraj	6e8b413689	HV: Add support to assign non-contiguous HPA regions for pre-launched VM On some platforms, HPA regions for Virtual Machine can not be contiguous because of E820 reserved type or PCI hole. In such cases, pre-launched VMs need to be assigned non-contiguous memory regions and this patch addresses it. To keep things simple, current design has the following assumptions, 1. HPA2 always will be placed after HPA1 2. HPA1 and HPA2 don’t share a single ve820 entry. (Create multiple entries if needed but not shared) 3. Only support 2 non-contiguous HPA regions (can extend at a later point for multiple non-contiguous HPA) Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4195 Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:28:38 +08:00
Zide Chen	03a1b2a717	hypervisor: handle reboot from non-privileged pre-launched guests To handle reboot requests from pre-launched VMs that don't have GUEST_FLAG_HIGHEST_SEVERITY, we shutdown the target VM explicitly other than ignoring them. Tracked-On: #2700 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:27:32 +08:00
Li Fei1	da3ba68cb6	hv: remove corner case in ptirq_prepare_msix_remap ptirq_prepare_msix_remap was called no matter whether MSI/MSI-X was enabled or not and it passed zero to input parameter virtual MSI/MSI-X data field to indicate MSI/MSI-X was disabled. However, it barely did nothing on this case. Now ptirq_prepare_msix_remap is called only when MSI/MSI-X is enabled. It doesn't need to check whether MSI/MSI-X is enabled or not by checking virtual MSI/MSI-X data field. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-05 16:43:22 +08:00
Li Fei1	c05d9f8086	hv: vmsix: refine vmsix remap Do vMSI-X remap only when Mask Bit in Vector Control Register for MSI-X Table Entry is unmask. The previous implementation also has two issues: 1. It only check whether Message Control Register for MSI-X has been modified when guest writes MSI-X CFG space at Message Control Register offset. 2. It doesn't really disable MSI-X when guest wants to disable MSI-X. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-05 16:43:22 +08:00
Li Fei1	5f5ba1d647	hv: vmsi: refine write_vmsi_cfg implementation 1. disable physical MSI before writing the virtual MSI CFG space 2. do the remap_vmsi if the guest wants to enable MSI or update MSI address or data 3. disable INTx and enable MSI after step 2. The previous Message Control check depends on the guest write MSI Message Control Register at the offset of Message Control Register. However, the guest could access this register at the offset of MSI Capability ID register. This patch remove this constraint. Also, The previous implementation didn't really disable MSI when guest wanted to disable MSI. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-05 16:43:22 +08:00
Shuo A Liu	72644ac2b2	hv: do not sleep a non-RUNNING vcpu It's meaningless to sleep a non-running vcpu. Add a state check before sleep the thread object of the vcpu. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Shuo A Liu	d624eb5e6c	hv: io: do schedule in IO completion polling loop Now, we support schedule inplace. And with cpu sharing, there might be multi vcpu running on same pcpu. Reschedule request will happen when switch the running vcpu. If the current vcpu is polling on the IO completion, it need to be scheduled back to the polling point. In the polling path, construct a loop for polling, and do schedule in the loop if needed. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Conghui Chen	d48da2af3a	hv: bugfix for debug commands with smp_call With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the smp_call handler should switch the vmcs to the target vcpu's vmcs. Then get the info. dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise, there will be #GP error. Renaming: vcpu_dumpreg -> dump_vcpu_reg switch_vmcs -> load_vmcs Tracked-On: #4178 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Shuo A Liu	47139bd78c	hv: print current sched_object in acrn logmsg Add a header field in acrnlog message to indicate the current running thread. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Kaige Fu	aae974b473	HV: trace leaf and subleaf of cpuid We care more about leaf and subleaf of cpuid than vcpu_id. So, this patch changes the cpuid trace-entry to trace the leaf and subleaf of this cpuid vmexit. Tracked-On: #4175 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-03 16:34:14 +08:00
Yonghua Huang	450d2cf2e9	hv: trap RDPMC instruction execution from any guest PMU is hidden from any guest; UD is expected when a guest tries to execute the 'rdpmc' instruction. This patch sets 'RDPMC exiting' in the Processor-based VM-execution control. Tracked-On: #3453 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 14:14:27 +08:00
Binbin Wu	3d412266bc	hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC Deterministic is important for RTVM. The mitigation for MCE on Page Size Change converts a large page to 4KB pages runtimely during the vmexit triggered by the instruction fetch in the large page. These vmexits increase nondeterminacy, which should be avoided for RTVM. This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	0570993b40	hv: config: add an option to disable mce on psc workaround Add a option MCE_ON_PSC_WORKAROUND_DISABLED to disable the software workaround for the issue Machine Check Error on Page Size Change. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	192859ee02	hv: ept: apply MCE on page size change mitigation conditionally Only apply the software workaround on the models that might be affected by MCE on page size change. For the models that are known to be immune to the issue, the mitigation is turned off. Atom processors are not affected by the issue. Also check the CPUID & MSR to check whether the model is immune to the issue: CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1. Other cases not listed above, CPU may be vulnerable. This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U since the MSR is 64bit. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Shuo A Liu	3cb32bb6e3	hv: make init_vmcs as a event of VCPU After changing init_vmcs to smp call approach and do it before launch_vcpu, it could work with noop scheduler. On real sharing scheduler, it has problem. pcpu0 pcpu1 pcpu1 vmBvcpu0 vmAvcpu1 vmBvcpu1 vmentry init_vmcs(vmBvcpu1) vmexit->do_init_vmcs corrupt current vmcs vmentry fail launch_vcpu(vmBvcpu1) This patch marks an event flag when requesting vmcs init for a specific vcpu. When it is running and checking pending events, it will do init_vmcs first. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:43 +08:00
Victor Sun	15da33d8af	HV: parse default pci mmcfg base The default PCI mmcfg base is stored in ACPI MCFG table, when CONFIG_ACPI_PARSE_ENABLED is set, acpi_fixup() function will parse and fix up the platform mmcfg base in ACRN boot stage; when it is not set, platform mmcfg base will be initialized to DEFAULT_PCI_MMCFG_BASE which generated by acrn-config tool; Please note we will not support platform which has multiple PCI segment groups. Tracked-On: #4157 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:24 +08:00
Yan, Like	0d998d6ac6	hv: sync physical and virtual TSC_DEADLINE when msr interception enabled/disabled Starting with TSC_DEADLINE msr interception disabled, the virtual TSC_DEADLINE msr is always 0. When the interception is enabled, need to sync the physical TSC_DEADLINE value to virtual TSC_DEADLINE. When the interception is disabled, there are 2 cases: - if the timer hasn't expired, sync virtual TSC_DEADLINE to physical TSC_DEADLINE, to make the guest read the same tsc_deadline as it writes. This may change when the timer actually trigger. - if the timer has expired, write 0 to the virtual TSC_DEADLINE. Tracked-On: #4162 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:10:50 +08:00
Yan, Like	97916364fc	hv: fix virtual TSC_DEADLINE msr read/write issues When write to virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero: - when guest intends to disarm the tsc_deadline timer, should not arm the timer falsely; - when guest intends to arm the tsc_deadline timer, should not disarm the timer falsely. When read from virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero: - if physical TSC_DEADLINE is not zero, return the virtual TSC_DEADLINE value; - if physical TSC_DEADLINE is zero which means it's not armed (automatically disarmed after timer triggered), return 0 and reset the virtual TSC_DEADLINE. Tracked-On: #4162 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:10:50 +08:00
Conghui Chen	e61412981d	hv: support xsave in context switch xsave area: legacy region: 512 bytes xsave header: 64 bytes extended region: < 3k bytes So, pre-allocate 4k area for xsave. Use certain instruction to save or restore the area according to hardware xsave feature set. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Conghui Chen	8ba203a165	hv: change xsave init function name change pcpu_xsave_init to init_pcpu_xsave. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Li Fei1	2c4ebdc695	hv: vmsi: name vmsi with verb-object style Name vmsi and vmsix function with verb-object style: For external APIs, using MODULE_NAME_verb-object style; For internal APIs, using verb-object style. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-29 08:53:07 +08:00
Li Fei1	6ee076f7df	hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry does. ptirq_msix_remap only did the preparation. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-29 08:53:07 +08:00
Geoffroy Van Cutsem	51a43dab79	hv: add Kconfig parameter to define the Service VM EFI bootloader Add a Kconfig parameter called UEFI_OS_LOADER_NAME to hold the Service VM EFI bootloader to be run by the ACRN hypervisor. A new string manipulation function to convert from (char *) to (CHAR16 *) has been added to facilitate the implementation. The default value is set to systemd-boot (bootloaderx64.efi) Tracked-On: #2793 Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2019-11-27 10:38:49 +08:00
Sainath Grandhi	422330d4ab	HV: reimplement PCI device discovery Major changes: 1. Correct handling of device multi-function capability We only check function zero for this feature. If it has it, we continue looking at all remaining functions, ignoring those with invalid vendors. The PCI spec says we are not to probe beyond function zero if it does not exist or indicates it is not a multi-function device. 2a. Walk ALL buses in the PCI space, however, Before walking the PCI hierarchy, post-processed ACPI DMAR info is parsed and a map is created between all device-scopes across all DRHDs and the corresponding IOMMU index. This map is used at the time of walking the PCI hierarchy. If a BDF that ACRN is currently working on, is found in the above-mentioned map, the BDF device is mapped to the corresponding DRHD in the map. If the BDF were a bridge type, realized with "Header Type" in config space, the BDF device along with all its downstream devices are mapped to the corresponding DRHD in the map. To avoid walking previously visited buses, we maintain a bitmap that stores which bus is walked when we handle Bridge type devices. Once ACPI information is included into ACRN about the PCI-Express Root Complexes / PCI Host Bridges, we can avoid the final loop which probes all remainder buses, and instead jump to the next Host Bridge bus. From prior patches, init_pdev returns the pdev structure it created to the caller. This allows us to complete initialization by updating its drhd_idx to the correct DRHD. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	94a456ae24	HV: refactor device_to_dmaru On server platforms, DMAR DRHD device scope entries may contain PCI bridges. Bridges in the DRHD device scope indicate this IOMMU translates for all devices on the hierarchy below that bridge. ACRN is unaware of bridge types in the device scope, and adds these directly to its internal representation of a DRHD. When looking up a BDF within these DRHD entries, device_to_dmaru assumes all entries are Endpoints, comparing BDF to BDF. Thus device to DMAR unit fails, because it treats a bridge as an Endpoint type. This change leverages prior patches by converting a BDF to the associated device DRHD index, and uses that index to obtain the correct DRHD state. Handling a bridge in other ways may require maintaining a bus list for each, or replacing each bridge in the dev scope with a set of all device BDFs underneath it. Server platforms can have hundreds of PCI devices, thus making the device scope artificially large is unwieldy. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Sainath Grandhi	c5a87d41df	HV: Cleanup PCI segment usage from VT-d interfaces ACRN does not support multiple PCI segments in its current form. But VT-d module uses segment info in its interfaces and hardcodes it to 0. This patch cleans up everything related to segment to avoid ambiguity. Tracked-On: #4134 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	810169ad20	HV: initialize IOMMU before PCI device discovery In later patches we use information from DMAR tables to guide discovery and initialization of PCI devices. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	ea131eea41	HV: add DRHD index to pci_pdev We add new member pci_pdev.drhd_idx associating the DRHD (IOMMU) with this pdev, and a method to convert a pbdf of a device to this index by searching the pdev list. Partial patch: drhd_index initialization handled in subsequent patch. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	0b7bcd6408	HV: extra methods for extracting header fields Add some encapsulation of utilities which read PCI header space using wrapper functions. Also contain verification of PCI vendor to its own function, rather than having hard-coded integrals exposed among other code. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Mingqiang Chi	32b8d99f48	hv:panic if there is no memory map in multiboot info add panic if there is no memory map info during booting. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-11-26 16:16:23 +08:00
Mingqiang Chi	bd0dbd274d	hv:add dump_guest_mem add shell command to support dumping guest memory e.g. dump_guest_mem vm_id, gva, length Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-26 10:58:19 +08:00
Mingqiang Chi	215bb6ca6c	hv:refine dump_host_mem rename shell_dumpmem to shell_dump_host_mem and refine this api. Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-26 10:58:19 +08:00
Mingqiang Chi	4c8dde1b9c	hv:remove show_guest_call_trace now this api assumes the guest OS is 64 bits, this patch remove this api and will replace it with dumping guest memory. Tracked-On: #4144 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-26 10:58:19 +08:00
Victor Sun	f657bae0a8	Makefile: do not rm board acpi info header The $(BOARD)_acpi_info.h is generated by acrn-config tool, remove this header in make clean would cause failure when user finish configuring in webUI and start to make acrn-hypervisor by the command "make hypervisor BOARD=xxx SCENARIO=yyy" because we mandatory do make clean before making hypervisor. The patch replace the file removal with a warning string to hint user to check the file validity. Tracked-On: #3779 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-21 16:15:23 +08:00
Jidong Xia	26c45a0c70	hv: modify printf "not support the vuart index parameter" in vuart_register_io_handler call vuart_register_io_handler function, when the parameter vuart_idx is greater than or equal to 2, print the vuart index value which will not register the vuart. Tracked-On: #4072 Signed-off-by: Jidong Xia <xiajidong@cmss.chinamobile.com>	2019-11-20 09:45:00 +08:00
Li Fei1	5aa92b85ea	hv: vpci: move vBAR base setting into pci_vdev_write_bar Updating vBAR base when setting vBAR configuration space. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-15 13:54:21 +08:00
Li Fei1	5fdb6cc0ac	hv: vpci: remove 64 bits PCI BAR map logic constraint After reshuffle pci_bar structure we could write ~0U not BAR size mask to BAR configuration space directly when do BAR sizing. In this case, we could know whether the value in BAR configuration space is a valid base address. As a result, we could do BAR re-programming whenever we want. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-15 13:54:21 +08:00
Li Fei1	c049c5c965	hv: vpci: reshuffle pci_bar structure The current code declare pci_bar structure following the PCI bar spec. However, we could not tell whether the value in virtual BAR configuration space is valid base address based on current pci_bar structure. We need to add more fields which are duplicated instances of the vBAR information. Besides these fields which will be added, bar_base_mapped is another duplicated instance of the vBAR information. This patch try to reshuffle the pci_bar structure to declare pci_bar structure following the software implement benefit not the PCI bar spec. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-15 13:54:21 +08:00
Li Fei1	f53baadd5a	hv: vpci: refine PCI IO BAR map The current do PCI IO BAR remap in vdev_pt_allow_io_vbar. This patch split this function into vdev_pt_deny_io_vbar and vdev_pt_allow_io_vbar. vdev_pt_deny_io_vbar removes the old IO port mapping, vdev_pt_allow_io_vbar add the new IO port mapping. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-15 13:54:21 +08:00
Sainath Grandhi	22a1bd6948	hv: Fix the definition of struct representing interrupt hw frame In 64-bit mode, processor pushes SS and RSP onto stack unconditionally. Also when dumping the exception info, it makes more sense to dump the RSP at the point of interrupt, rather than the RSP after pushing context (including GPRs) Tracked-On: #4102 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 16:06:35 +08:00
Victor Sun	0d52f933da	Makefile: move .mk file to hv scripts folder The *.mk files under misc/acrn-config/library are all rules for hypervisor makefiles only, so move these files to hypervisor/scripts/makefile/ folder. The folder of acrn-config/library/ will be used to store python script lib only. Tracked-On: #3779 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Terry Zou <terry.zou@intel.com>	2019-11-13 16:05:30 +08:00
Victor Sun	acd0deb8a1	Makefile: board specific acpi info header clean up The board specific $(BOARD)_acpi_info.h is generated by acrn-config tool, we should clean it up before build hypervisor, otherwise the file could be referenced by next build process if no config XMLs is specified. Tracked-On: #3779 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-13 16:05:30 +08:00
Binbin Wu	fa3888c12a	hv: ept: disable execute right on large pages Issue description: ----------------- Machine Check Error on Page Size Change Instruction fetch may cause machine check error if page size and memory type was changed without invalidation on some processors[1][2]. Malicious guest kernel could trigger this issue. This issue applies to both primary page table and extended page tables (EPT), however the primary page table is controlled by hypervisor only. This patch mitigates the situation in EPT. Mitigation details: ------------------ Implement non-execute huge pages in EPT. This patch series clears the execute permission (bit 2) in the EPT entries for large pages. When EPT violation is triggered by guest instruction fetch, hypervisor converts the large page to smaller 4 KB pages and restore the execute permission, and then re-execute the guest instruction. The current patch turns on the mitigation by default. The follow-up patches will conditionally turn on/off the feature per processor model. [1] Refer to erratum KBL002 in "7th Generation Intel Processor Family and 8th Generation Intel Processor Family for U Quad Core Platforms Specification Update" https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf [2] Refer to erratum SKL002 in "6th Generation Intel Processor Family Specification Update" https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 08:00:36 +08:00
lirui34	70312bfb7e	dm: Add licenses to the scripts. Add licenses to the scripts: ``` devicemodel/samples/apl-mrb/launch_uos.sh devicemodel/samples/apl-up2/launch_uos.sh devicemodel/samples/nuc/launch_hard_rt_vm.sh devicemodel/samples/nuc/launch_uos.sh devicemodel/samples/nuc/launch_vxworks.sh devicemodel/samples/nuc/launch_win.sh devicemodel/samples/nuc/launch_zephyr.sh hypervisor/scripts/genld.sh ``` Tracked-On: #4061 Signed-off-by: lirui34 <ruix.li@intel.com>	2019-11-11 15:35:19 +08:00
Victor Sun	ed8fb94778	Makefile: support make from XML for new board Currently make hypervisor will depend on a $(BOARD).config file to load board defconfig which triggered by oldconfig process, this will block make from XMLs for a new board because $(BOARD).config never exist. This requires us to patch configuration for new board earlier than make oldconfig. Tracked-On: #4067 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-11 15:01:50 +08:00
Peter Fang	b7329f10a5	hv: instr_emul: use cs segment when fetching instructions In non-64-bit mode, CS segment base address should be considered when determining the linear address of the vcpu's instruction pointer. Use vie_calculate_gla() for instruction address translation which also takes care of 64-bit mode. Tracked-On: #4064 Signed-off-by: Peter Fang <peter.fang@intel.com>	2019-11-11 13:55:24 +08:00
Mingqiang Chi	8666ba6c01	hv:remove unnecessary wrapper for emulate_instruction remove unnecessary wrapper for this api(emulate_instruction) Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-11-09 11:43:37 +08:00
Yonghua Huang	0eb427f122	hv:refine 'uint64_t' string print format in comm module Use "0x%lx" string to format 'uint64_t' type value, instead of "0x%llx". Tracked-On: #4020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-11-09 11:42:38 +08:00
Yonghua Huang	e51386fe04	hv: refine 'uint64_t' string print format in x86 module Use "0x%lx" string to format 'uint64_t' type value, instead of "0x%llx". Tracked-On: #4020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-11-09 11:42:38 +08:00
Yonghua Huang	fb29d1f99f	hv: refine 'uint64_t' string print format in debug module Use "0x%lx" string to format 'uint64_t' type value, instead of "0x%llx". Tracked-On: #4020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-11-09 11:42:38 +08:00
Victor Sun	3411f00b5b	HV: fix misra violation on platform clos array MISRA C requires specified bounds for arrays declaration, previous declaration of platform_clos_array in board.h does not meet the requirement. Tracked-On: #3987 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	c77d275e9d	HV: clean up DMAR MACROs for sample platform acpi info Remove redundant DMAR MACROs for given platform_acpi_info files because CONFIG_ACPI_PARSE_ENABLED is enabled for all boards by default. The DMAR info for nuc7i7dnb is kept as reference in the case that ACPI_PARSE_ENABLED is not set in Kconfig. As DMAR info is not provided for apl-mrb, the platform_acpi_info.h under apl-mrb config folder is meaningless, so also remove this file and let hypervisor parse ACPI for apl-mrb; Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	9e92f3cdf5	HV: move dmar info definition to board.c The DMAR info is board specific so move the structure definition to board.c. As a configuration file, the whole board.c could be generated by acrn-config tool for each board. Please note we only provide DMAR info MACROs for nuc7i7dnb board. For other boards, ACPI_PARSE_ENABLED must be set to y in Kconfig to let hypervisor parse DMAR info, or use acrn-config tool to generate DMAR info MACROs if user won't enable ACPI parse code for FuSa consideration. The patch also moves the function of get_dmar_info() to vtd.c, so dmar_info.c could be removed. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	589be88cf6	HV: link CONFIG_MAX_IOMMU_NUM and MAX_DRHDS to DRHD_COUNT The value of CONFIG_MAX_IOMMU_NUM and MAX_DRHDS are identical to DRHD_COUNT which is defined in platform ACPI table, so remove CONFIG_MAX_IOMMU_NUM from Kconfig and link these three MACROs together. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Conghui Chen	75f512ce8c	hv: rename vuart operations fifo_reset -> reset_fifo vuart_fifo_init -> init_fifo vuart_setup - > setup_vuart vuart_init -> init_vuart vuart_deinit -> deinit_vuart vuart_lock_init -> init_vuart_lock vuart_lock -> obtain_vuart_lock vuart_unlock -> release_vuart_lock vuart_deinit_connect -> vuart_deinit_connection Tracked-On: #4017 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 09:01:01 +08:00
Kaige Fu	20c1ad1b3a	HV: correct the formatting flag of hypcall_id hypcall_id has a type of uint64_t and should use 'llx' as formatting flag instead of '%d'. Otherwise, we will get a confusing error log when not-allowed hypercall occurs. Without this patch: [96707209us][cpu=1][sev=3][seq=2386]:hypercall -2147483548 is only allowed from SOS_VM! With this patch: [84613395us][cpu=1][sev=3][seq=2136]:hypercall 0x80000064 is only allowed from SOS_VM! So, we can figure out which not-allowed hypercall has been triggered more conveniently. BTW, this patch adds hypcall_id which triggered from non-ring0 into error log. Tracked-On: #4012 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-11-07 15:01:21 +08:00
Li Fei1	8189d1f01c	hv: mmu: filter e820 which is over top address space Now the default board memory size is 16 GB. However, ACRN supports more and more boards which may have memory size larger than 16 GB. This patch try to filter e820 table which is over top address space. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Li Fei1	620a1c5215	hv: mmu: rename e820 to hv_e820 Now the e820 structure store ACRN HV memory layout, not the physical memory layout. Rename e820 to hv_e820 to show this explicitly. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Yonghua Huang	8227804b09	hv:Unmap AP trampoline region from service VM's EPT AP trampoline code should be accessible to hypervisor only, this patch is to unmap this region from service VM's EPT for security reason. Tracked-On: #3992 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-05 15:14:13 +08:00
Yonghua Huang	d74497eb17	hv:refine modify_or_del_pte/pde/pdpte()function 1. Print warning message instead of ASSERT when the caller try to modify the attribute for memory region that is not present. 2. To avoid above warning message for memory region below 1M, its attribute may be updated by Service VM when updating MTRR setting. Tracked-On: #3992 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-05 15:14:13 +08:00
Yonghua Huang	6ae2d9f22b	hv: refine 'get_direct_boot_ap_trampoline()' Currently, memory with size of 'CONFIG_LOW_RAM_SIZE' will be allocated when 'get_direct_boot_ap_trampoline()' is called. This patch refine the implementation of above function, it returns the base address of trampoline buffer when called, and the memory is allocated when vboot module is initialized. Tracked-On: #3992 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2019-11-05 15:14:13 +08:00
Kaige Fu	c22f899a5e	HV: Fix poweroff issue of hard RTVM We should use INIT signal to notify the vcpu threads when powering off the hard RTVM. To achieve this, we should set the vcpu->thread_obj.notify_mode as SCHED_NOTIFY_INIT. Patch (`27163df9` hv: sched: add sleep/wake for thread object) tries to set the notify_mode according `is_lapic_pt_enabled(vcpu)` in function prepare_vcpu. But at this point, the is_lapic_pt_enabled(vcpu) will always return false. Consequently, it will set notify_mode as SCHED_NOTIFY_IPI. Then leads to the failure of powering off hard RTVM. This patch fixes it by: - Initialize the notify_mode as SCHED_NOTIFY_IPI in prepare_vcpu. - Set notify_mode as SCHED_NOTIFY_INIT after guest is trying to enable x2apic mode of passthru lapic. Tracked-On: #3975 Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-11-04 10:28:16 +08:00
Li, Fei1	9d26dab6d6	hv: mmio: add a lock to protect mmio_node access After adding PCI BAR remap support, mmio_node may unregister when there's others access it. This patch add a lock to protect mmio_node access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	21cb120bcc	hv: vpci: add a global PCI lock for each VM Concurrent access on PCI device may happened if UOS try to access PCI configuration space on different vCPUs through IO port. This patch just adds a global PCI lock for each VM to prevent the concurrent access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	f711d3a639	hv: vpci: define PCI CONFIG_ADDRESS Register as its physical layout Refine PCI CONFIG_ADDRESS Register definition as its physical layout. In this case, we could read/write PCI CONFIG_ADDRESS Register atomically. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Li, Fei1	6f310d1ab2	hv: mmio: move EPT operation out of register_mmio_emulation_handler register_mmio_emulation_handler should only register handler for mmio emulation. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-31 11:46:10 +08:00
Li, Fei1	4f6653dc9c	hv: vpci: do unmap/map in vdev_pt_write_vbar explicitly Unmap old mappings in vdev_pt_write_vbar explicitly before set_vbar_base. Then map new mappings explicitly in vdev_pt_write_vbar. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-31 11:46:10 +08:00
Huihuang Shi	5d662ea11f	hv: fixed by replace ull to ul. ul is used as immediate integer suffix with type uint64_t. Tracked-On: #3214 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-10-31 09:02:59 +08:00
Li, Fei1	2c158d5ad4	hv: io: add unregister_mmio_emulation_handler API Since guest could re-program PCI device MSI-X table BAR, we should add mmio emulation handler unregister. However, after add unregister_mmio_emulation_handler API, emul_mmio_regions is no longer accurate. Just replace it with max_emul_mmio_regions which records the max index of the emul_mmio_node. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-29 14:49:55 +08:00
Li, Fei1	dc1e2adaec	hv: vpci: add PCI BAR re-program address check In theory, guest could re-program PCI BAR address to any address. However, ACRN hypervisor only support [0, top_address_space) EPT memory mapping. So we need to check whether the PCI BAR re-program address is within this scope. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-29 14:49:55 +08:00
Sainath Grandhi	f01aad7e77	hv: Let trampoline execution use 1GB pages ACRN currently uses 2MB large pages in the page tables setup for trampoline code and data. This patch lets ACRN use 1GB large pages instead. When it comes to fixing symbols in trampoline code, fixing pointers in PDPT is no more needed as PDPT PTEs contain Physical Address. Tracked-On: #3899 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-28 13:44:32 +08:00
Kaige Fu	d5c3523d30	hv: Update industry scenarios configuration This patch makes the following changes: - Remove the 4th VM - Make the default vcpu num of RTVM as 2 --- v1 -> v2: Modify CONFIG_MAX_VM_NUM to 3U + KATA Tracked-On: #3925 Signed-off-by: Yan, Like <like.yan@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-10-25 15:23:16 +08:00
Shuo A Liu	5f8e7a6cb7	hv: sched: add kick_thread to support notification kick means to notify one thread_object. If the target thread object is running, send a IPI to notify it; if the target thread object is runnable, make reschedule on it. Also add kick_vcpu API in vcpu layer to notify vcpu. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Conghui Chen	810305be98	hv: sched: disable interrupt when grab schedule spinlock After moving softirq to following interrupt path, softirq handler might break in the schedule spinlock context and try to grab the lock again, then deadlock. Disable interrupt with schedule spinlock context. For the IRQ disable/restore operations: CPU_INT_ALL_DISABLE(&rflag) CPU_INT_ALL_RESTORE(rflag) each takes 50~60 cycles. renaming: get_schedule_lock -> obtain_schedule_lock Tracked-On: #3813 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	15c6a3e31f	hv: sched: remove do_switch Clean up do_switch and do switch related things in schedule(). Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	f04c491259	hv: sched: decouple scheduler from schedule framework This patch decouple some scheduling logic and abstract into a scheduler. Then we have scheduler, schedule framework. From modularization perspective, schedule framework provides some APIs for other layers to use, also interact with scheduler through scheduler interfaces. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	cad195c018	hv: sched: add pcpu_id in sched_control To get pcpu_id from sched_control quickly and easier. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-10-25 13:00:21 +08:00


3311 Commits