acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-09-24 02:08:04 +00:00

Author	SHA1	Message	Date
Junjie Mao	9781873e77	HV: treewide: fix violations of coding guideline C-TY-12 The coding guideline rule C-TY-12 requires that 'all type conversions shall be explicit'. Especially implicit cases on the signedness of variables shall be avoided. This patch either adds explicit type casts or adjust local variable types to make sure that Booleans, signed and unsigned integers are not used mixedly. This patch has no semantic changes. Tracked-On: #6776 Signed-off-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-11-04 18:15:47 +08:00
Junjie Mao	db20e277b6	HV: treewide: fix violations of coding guideline C-TY-02 The coding guideline rule C-TY-02 requires that 'the operands of bit operations shall be unsigned'. This patch adds explicit casts or literal suffixes to make explicit the type of values involved in bit operations. Explicit casts to widen integers before shift operations are also introduced to make explicit that the variables are expanded BEFORE it is shifted (which is already so in C99 but implicitly). This patch has no semantic changes. Tracked-On: #6776 Signed-off-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-11-04 18:15:47 +08:00
Junjie Mao	fba343bd05	HV: treewide: fix violations of coding guideline C-FN-06 The coding guideline rule C-FN-06 requires that 'a parameter passed by value to a function shall not be modified directly'. This patch rewrites two functions which does modify its parameters today. This patch has no semantic impact. Tracked-On: #6776 Signed-off-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-11-04 18:15:47 +08:00
Junjie Mao	3f3f4be642	HV: treewide: fix violations of coding guideline C-EP-05 The coding guideline rule C-EP-05 requires that 'parentheses shall be used to set the operator precedence explicitly'. This patch adds the missing parentheses detected by the static analyzer. Tracked-On: #6776 Signed-off-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-11-04 18:15:47 +08:00
Junjie Mao	4cf6c288cd	HV: treewide: fix warnings raised by Clang This patch fixes the following warnings detected by the LLVM/Clang compiler: 1. Unused static functions in C sources, which are fixed by explicitly tagging them with __unused 2. Duplicated parentheses around branch conditions 3. Assigning 64-bit constants to 32-bit variables, which is fixed by promoting the variables to uint64_t 4. Using { '\0' } to zero-fill an array, which is fixed by replacing it with { 0 } 5. Taking a bit out of a variable using && (which should be & instead) Most changes do not have a semantic impact, except item 5 which is probably a real code issue. Tracked-On: #6776 Signed-off-by: Junjie Mao <junjie.mao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-11-04 18:15:47 +08:00
Yifan Liu	10963b04d1	hv: Fix vcpu signaling racing problem in lock instruction emulation In lock instruction emulation, we use vcpu_make_request and signal_event pairs to shoot down/release other vcpus. However, vcpu_make_request is async and does not guarantee an execution of wait_event on target vcpu, and we want wait_event to be consistent with signal_event. Consider following scenarios: 1, When target vcpu's state has not yet turned to VCPU_RUNNING, vcpu_make_request on ACRN_REQUEST_SPLIT_LOCK does not make sense, and will not result in wait_event. 2, When target vcpu is already requested on ACRN_REQUEST_SPLIT_LOCK (i.e., the corresponding bit in pending_req is set) but not yet handled, the vcpu_make_request call does not result in wait_event as 1 bit is not enough to cache multiple requests. This patch tries to add checks in vcpu_kick_lock_instr_emulation and vcpu_complete_lock_instr_emulation to resolve these issues. Tracked-On: #6502 Signed-off-by: Yifan Liu <yifan1.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2021-11-02 15:01:20 +08:00
Liu Long	3f4ea38158	ACRN: misc: Unify terminology for service vm/user vm Rename SOS_VM type to SERVICE_VM rename UOS to User VM in XML description rename uos_thread_pid to user_vm_thread_pid rename devname_uos to devname_user_vm rename uosid to user_vmid rename UOS_ACK to USER_VM_ACK rename SOS_VM_CONFIG_CPU_AFFINITY to SERVICE_VM_CONFIG_CPU_AFFINITY rename SOS_COM to SERVICE_VM_COM rename SOS_UART1_VALID_NUM" to SERVICE_VM_UART1_VALID_NUM rename SOS_BOOTARGS_DIFF to SERVICE_VM_BOOTARGS_DIFF rename uos to user_vm in launch script and xml Tracked-On: #6744 Signed-off-by: Liu Long <long.liu@linux.intel.com> Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2021-11-02 10:00:55 +08:00
Liu Long	e9c4ced460	ACRN: hv: Unify terminology for user vm Rename gpa_uos to gpa_user_vm rename base_gpa_in_uos to base_gpa_in_user_vm rename UOS_VIRT_PCI_MMCFG_BASE to USER_VM_VIRT_PCI_MMCFG_BASE rename UOS_VIRT_PCI_MMCFG_START_BUS to USER_VM_VIRT_PCI_MMCFG_START_BUS rename UOS_VIRT_PCI_MMCFG_END_BUS to USER_VM_VIRT_PCI_MMCFG_END_BUS rename UOS_VIRT_PCI_MEMBASE32 to USER_VM_VIRT_PCI_MEMBASE32 rename UOS_VIRT_PCI_MEMLIMIT32 to USER_VM_VIRT_PCI_MEMLIMIT32 rename UOS_VIRT_PCI_MEMBASE64 to USER_VM_VIRT_PCI_MEMBASE64 rename UOS_VIRT_PCI_MEMLIMIT64 to USER_VM_VIRT_PCI_MEMLIMIT64 rename UOS in comments message to User VM. Tracked-On: #6744 Signed-off-by: Liu Long <long.liu@linux.intel.com> Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2021-11-02 10:00:55 +08:00
Liu Long	92b7d6a9a3	ACRN: hv: Terminology modification in hv code Rename sos_vm to service_vm. rename sos_vmid to service_vmid. rename sos_vm_ptr to service_vm_ptr. rename get_sos_vm to get_service_vm. rename sos_vm_gpa to service_vm_gpa. rename sos_vm_e820 to service_vm_e820. rename sos_efi_info to service_vm_efi_info. rename sos_vm_config to service_vm_config. rename sos_vm_hpa2gpa to service_vm_hpa2gpa. rename vdev_in_sos to vdev_in_service_vm. rename create_sos_vm_e820 to create_service_vm_e820. rename sos_high64_max_ram to service_vm_high64_max_ram. rename prepare_sos_vm_memmap to prepare_service_vm_memmap. rename post_uos_sworld_memory to post_user_vm_sworld_memory rename hcall_sos_offline_cpu to hcall_service_vm_offline_cpu. rename filter_mem_from_sos_e820 to filter_mem_from_service_vm_e820. rename create_sos_vm_efi_mmap_desc to create_service_vm_efi_mmap_desc. rename HC_SOS_OFFLINE_CPU to HC_SERVICE_VM_OFFLINE_CPU. rename SOS to Service VM in comments message. Tracked-On: #6744 Signed-off-by: Liu Long <long.liu@linux.intel.com> Reviewed-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2021-11-02 10:00:55 +08:00
Liu Long	26e507a06e	ACRN: hv: Unify terminology for service vm Rename is_sos_vm to is_service_vm Tracked-On: #6744 Signed-off-by: Liu Long <longliu@intel.com>	2021-11-02 10:00:55 +08:00
dongshen	dcafcadaf9	hv: rename some C preprocessor macros Rename some C preprocessor macros: NUM_GUEST_MSRS --> NUM_EMULATED_MSRS CAT_MSR_START_INDEX --> FLEXIBLE_MSR_INDEX NUM_VCAT_MSRS --> NUM_CAT_MSRS NUM_VCAT_L2_MSRS --> NUM_CAT_L2_MSRS NUM_VCAT_L3_MSRS --> NUM_CAT_L3_MSRS Tracked-On: #5917 Signed-off-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
dongshen	c0d95558c1	hv: vCAT: propagate vCBM to other vCPUs that share cache with vcpu Implement the propagate_vcbm() function: Set vCBM to to all the vCPUs that share cache with vcpu to mimic hardware CAT behavior Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
dongshen	a7014f4654	hv: vCAT: implementing the vCAT MSRs write handler Implement the write_vcbm() function to handle the MSR_IA32_type_MASK_n vCBM MSRs write request Call write_vclosid() to handle MSR_IA32_PQR_ASSOC MSR write request Several vCAT P2V (physical to virtual) and V2P (virtual to physical) mappings exist: struct acrn_vm_config *vm_config = get_vm_config(vm_id) max_pcbm = vm_config->max_type_pcbm (type: l2 or l3) mask_shift = ffs64(max_pcbm) vclosid = vmsr - MSR_IA32_type_MASK_0 pclosid = vm_config->pclosids[vclosid] pmsr = MSR_IA32_type_MASK_0 + pclosid pcbm = vcbm << mask_shift vcbm = pcbm >> mask_shift Where MSR_IA32_type_MASK_n: L2 or L3 mask msr address for CLOSIDn, from 0C90H through 0D8FH (inclusive). max_pcbm: a bitmask that selects all the physical cache ways assigned to the VM vclosid: virtual CLOSID, always starts from 0 pclosid: corresponding physical CLOSID for a given vclosid vmsr: virtual msr address, passed to vCAT handlers by the caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler() pmsr: physical msr address vcbm: virtual CBM, passed to vCAT handlers by the caller functions rdmsr_vmexit_handler()/wrmsr_vmexit_handler() pcbm: physical CBM Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
dongshen	3ab50f2ef5	hv: vCAT: implementing the vCAT MSRs read handlers Implement the read_vcbm() and read_vclosid() functions to handle the MSR_IA32_PQR_ASSOC and MSR_IA32_type_MASK_n vCAT MSRs read request. Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
dongshen	be855d2352	hv: vCAT: expose CAT capabilities to vCAT-enabled VM Expose CAT feature to vCAT VM by reporting the number of cache ways/CLOSIDs via the 04H/10H cpuid instructions, so that the VM can take advantage of CAT to prioritize and partition cache resource for its own tasks. Add the vcat_pcbm_to_vcbm() function to map pcbm to vcbm Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
dongshen	77ae989379	hv: vCAT: initialize vCAT MSRs during vmcs init Initialize vCBM MSRs Initialize vCLOSID MSR Add some vCAT functions: Retrieve max_vcbm and max_pcbm Check if vCAT is configured or not for the VM Map vclosid to pclosid write_vclosid: vCLOSID MSR write handler write_vcbm: vCBM MSR write handler Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-28 19:12:29 +08:00
Fei Li	1905ed6124	hv: vMSR: minor fix about rdmsr_vmexit_handler Specifying a reserved or unimplemented MSR address in ECX for rdmsr will cause a general protection exception. In this case, we should not change the contents of registers EDX:EAX. Tracked-On: #4550 Signed-off-by: Fei Li <fei1.li@intel.com>	2021-10-27 08:23:43 +08:00
dongshen	39461ef9dd	hv: vCAT: initialize the emulated_guest_msrs array for CAT msrs during platform initialization Initialize the emulated_guest_msrs[] array at runtime for MSR_IA32_type_MASK_n and MSR_IA32_PQR_ASSOC msrs, there is no good way to do this initialization statically at build time Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-26 11:48:27 +08:00
dongshen	cb2bb78b6f	hv/config_tools: amend the struct acrn_vm_config to make it compatible with vCAT For vCAT, it may need to store more than MAX_VCPUS_PER_VM of closids, change clos in vm_config.h to a pointer to accommodate this situation Rename clos to pclosids pclosids now is a pointer to an array of physical CLOSIDs that is defined in vm_configurations.c by vmconfig. The number of elements in the array must be equal to the value given by num_pclosids Add max_type_pcbm (type: l2 or l3) to struct acrn_vm_config, which stores a bitmask that selects/covers all the physical cache ways assigned to the VM Change vmsr.c to accommodate this amended data structure Change the config-tools to generate vm_configurations.c, and fill in the num_closids and clos pointers based on the information from the scenario file. Now vm_configurations.c.xsl generates all the clos related code so remove the same code from misc_cfg.h.xsl. Examples: Scenario file: <RDT> <RDT_ENABLED>y</RDT_ENABLED> <CDP_ENABLED>n</CDP_ENABLED> <VCAT_ENABLED>y</VCAT_ENABLED> <CLOS_MASK>0x7ff</CLOS_MASK> <CLOS_MASK>0x7ff</CLOS_MASK> <CLOS_MASK>0x7ff</CLOS_MASK> <CLOS_MASK>0xff800</CLOS_MASK> <CLOS_MASK>0xff800</CLOS_MASK> <CLOS_MASK>0xff800</CLOS_MASK> <CLOS_MASK>0xff800</CLOS_MASK> <CLOS_MASK>0xff800</CLOS_MASK> /RDT> <vm id="0"> <guest_flags> <guest_flag>GUEST_FLAG_VCAT_ENABLED</guest_flag> </guest_flags> <clos> <vcpu_clos>3</vcpu_clos> <vcpu_clos>4</vcpu_clos> <vcpu_clos>5</vcpu_clos> <vcpu_clos>6</vcpu_clos> <vcpu_clos>7</vcpu_clos> </clos> </vm> <vm id="1"> <clos> <vcpu_clos>1</vcpu_clos> <vcpu_clos>2</vcpu_clos> </clos> </vm> vm_configurations.c (generated by config-tools) with the above vCAT config: static uint16_t vm0_vcpu_clos[5U] = {3U, 4U, 5U, 6U, 7U}; static uint16_t vm1_vcpu_clos[2U] = {1U, 2U}; struct acrn_vm_config vm_configs[CONFIG_MAX_VM_NUM] = { { .guest_flags = (GUEST_FLAG_VCAT_ENABLED), .pclosids = vm0_vcpu_clos, .num_pclosids = 5U, .max_l3_pcbm = 0xff800U, }, { .pclosids = vm1_vcpu_clos, .num_pclosids = 2U, }, }; Tracked-On: #5917 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-26 11:48:27 +08:00
Yonghua Huang	c8e2060d37	hv: unmap IOMMU register pages from service VM EPT IOMMU hardware resource is owned by hypervisor, while IOMMU capability is reported to service VM in its ACPI table. In this case, Service VM may access IOMMU hardware resource, which is not expected. This patch unmaps all Intel IOMMU register pages for service VM EPT. Tracked-On: #6677 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-22 09:31:10 +08:00
Fei Li	df7ffab441	hv: remove CONFIG_HV_RAM_SIZE It's difficult to configure CONFIG_HV_RAM_SIZE properly at once. This patch not only remove CONFIG_HV_RAM_SIZE, but also we use ld linker script to dynamically get the size of HV RAM size. Tracked-On: #6663 Signed-off-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-14 15:04:36 +08:00
Zide Chen	e48962faa6	hv: optimize run_vcpu() for nested This patch implements a separate path for L2 VMEntry in run_vcpu(), which has several benefits: - keep run_vcpu() clean, to reduce the number of is_vcpu_in_l2_guest() statements: - current code has three is_vcpu_in_l2_guest() already. - supposed to have another 2 statement so that nested VMEntry won't hit the "Starting vCPU" and "vCPU launched" pr_info() and a few other statements in the VM launch path. - save few other things in run_vcpu() that are not needed for nested. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-13 15:55:31 +08:00
Zide Chen	89bbc44962	hv: inject external interrupts only if LAPIC is not passthru Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-08 09:18:34 +08:00
Zide Chen	228b052fdb	hv: operations on vcpu->reg_cached/reg_updated don't need LOCK prefix In run time, one vCPU won't read or write a register on other vCPUs, thus we don't need the LOCK prefixed instructions on reg_cached and reg_updated. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-08 09:11:10 +08:00
Zide Chen	2b683f8f5b	hv: call vcpu_inject_exception() only when ACRN_REQUEST_EXCP is set move the bitmap test call out of vcpu_inject_exception(), then we call the expensive bitmap_test_and_clear_lock() only pending_req_bits is non-zero and call vcpu_inject_exception() only if needed. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-10-07 20:48:43 +08:00
Zide Chen	f801ba4ed7	hv: update guest RIP only if vcpu->arch.inst_len is non zero In very large number of VM extis, the VM-exit instruction length could be zero, and it's no need to update VMX_GUEST_RIP. Some examples: - all external interrupt VM exits in non LAPIC passthru setup. - for all the nested VM-exits that are reflecting to L1 hypervisor. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-07 20:47:07 +08:00
Zide Chen	b7e9a68923	hv: code cleanup in run_vcpu() - wrap a new function exec_vmentry() to reduce code duplication. - remove exec_vmread(VMX_GUEST_RSP) since ACRN doesn't need to know guest RSP in run time. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-07 20:47:07 +08:00
Zide Chen	ee12daff84	hv: nested: refine vmcs12_read/write_field APIs Change "uint64_t vmcs_hva" to "void *vmcs_hva" in the input argument, list, so that no type casting is needed when calling them from pointers. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-10-07 20:45:34 +08:00
Liu,Junming	4105ca2cb4	hv: deny the launch of VM if pass-thru PIO bar isn't identical mapping In current design, when pass-thru dev, for the PIO bar, need to ensure the guest PIO start address equals to host PIO start address. Then set the VMCS io bitmap to pass-thru the corresponding port io to guest for performance. ACRN-DM and acrn-config should ensure the identical mapping of PIO bar. If ACRN-DM or acrn-config failed to achieve this, we should deny the launch of VM Tracked-On: #6508 Signed-off-by: Liu,Junming <junming.liu@intel.com> Reviewed-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com>	2021-09-28 08:49:01 +08:00
Zide Chen	a62dd6ad8a	hv: nested: fixed vmxoff_vmexit_handler() issue In VMXOFF vmexit handler, it's supposed to remove VMCS shadowing. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-09-26 08:49:35 +08:00
Zide Chen	45b036e028	hv: nested: enable multiple active VMCS12 support This patch changes the size of vvmcs[] array from 1 to PER_VCPU_ACTIVE_VVMCS_NUM, and actually enables multiple active VMCS12 support in ACRN. The basic operations: - if L1 VMPTRLDs a VMCS12 without previously VMCLEAR the current VMCS12, ACRN no longer unconditionally flushes the current VMCS12 back to L1. Instead, it tries to keep both the current and the newly loaded VMCS12 in the nested->vvmcs[] array, unless: - if there is no more available vvmcs[] entry, ACRN flushes one active VMCS12 to make room for this new VMCS12. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-26 08:49:35 +08:00
Mingqiang Chi	f39c882359	hv:change log level for check_vmx_ctrl Some processors don't support VMX_PROCBASED_CTLS_TERTIARY bit and VMX_PROCBASED_CTLS2_UWAIT_PAUSE bit in MSRs (IA32_VMX_PROCBASED_CTLS & IA32_VMX_PROCBASED_CTLS2), HV will output error log which will cause confusion, change the log level from pr_err to pr_info. Tracked-On: #6397 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2021-09-24 10:17:19 +08:00
Jie Deng	064fd7647f	hv: add priority based scheduler This patch adds a new priority based scheduler to support vCPU scheduling based on their pre-configured priorities. A vCPU can be running only if there is no higher priority vCPU running on the same pCPU. Tracked-On: #6571 Signed-off-by: Jie Deng <jie.deng@intel.com>	2021-09-24 09:32:18 +08:00
Zide Chen	94cbe909ee	hv: irq: identical vector mapping if LAPIC passthough In local APIC passthrough case, when devices triggered a INTx interrupt, this interrupt would be delivered to vCPU directly. For this case, need to set the virtual vector in the 'Interrupt Vector' field of physical IOxAPIC I/O REDIRECTION TABLE REGISTER (bits 7:0) and 'Vector' field of vt-d Interrupt Remapping Table Entry (IRTE) for Remapped Interrupts. Assumption: (a) IOAPIC pins won't be shared between LAPIC PT guest and other guests; (b) The guest would not trigger this IRQ before it switched to x2 APIC mode. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-09-18 09:42:44 +08:00
Zide Chen	0466d7055f	hv: nested: move the VMCS12 dirty flags to struct acrn_vvmcs These dirty flags are supposed to be per VMCS12, so move them from the per vCPU acrn_nested struct to the newly added acrn_vvmcs struct. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-17 10:58:43 +08:00
Zide Chen	4e54c3880b	hv: nested: remove vcpu->arch.nested.current_vmcs12_ptr This variable represents the L1 GPA of the current VMCS12. But it's no longer needed in the multiple active VMCS12 case, which uses the following variables for this purpose. - nested->current_vvmcs refers to the vvmcs[] entry which contains the cached current VMCS12, its associated VMCS02, and other context info. - nested->current_vvmcs->vmcs12_gpa refers to the L1 GPA of this current VMCS12. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-17 10:58:43 +08:00
Zide Chen	799a4d332a	hv: nested: initial implementation of struct acrn_vvmcs Add an array of struct acrn_vvmcs to struct acrn_nested, so it is possible to cache multiple active VMCS12s. This patch declares the size of this array to 1, meaning that there is only one active VMCS12. This is to minimize the logical code changes. Add pointer current_vvmcs to struct acrn_nested, which refers to the current vvmcs[] entry. In this patch, if any VMCS12 is active, it always points to vvmcs[0]. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-17 10:58:43 +08:00
Zide Chen	cf697e753d	hv: nested: some API signature changes No any logical changes, this patch is preparing for multiple active VMCS12 support. - currently it's easy to get the vmcs12 pointer from the vcpu pointer. In multiple active vmcs12 case, we need to explicitly add "struct acrn_vmcs12 *vmcs12" to certain APIs' input argument list, in order to get the desired vmcs12 pointer. - merge flush_current_vmcs12() into clear_vmcs02() for multiple reasons: a) it's called only once; b) we don't wrap the opposite operation (loading vmcs12) in an API; c) this API has simple and clear logic. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-17 10:58:43 +08:00
Zide Chen	e9eb72d319	hv: nested: flush L2 VPID only when it could conflict with L1 VPIDs By changing the way to assign L1 VPID from bottom-up to top-down, the possibilities for VPID conflicts between L1 and L2 guests are small. Then we can flush VPID just in case of conflicting. Tracked-On: #6289 Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-16 09:26:10 +08:00
Zide Chen	1ab65825ba	hv: nested: merge gpa_field_dirty and control_field_dirty flag In run time, it's rare for L1 to write to the intercepted non host-state VMCS fields, and using multiple dirty flags is not necessary. This patch uses one single dirty flag to manage all non host-state VMCS fields. This helps to simplify current code and in the future we may not need to declare new dirty flags when we intercept more VMCS fields. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-09-13 15:50:01 +08:00
Zide Chen	6376d5a0d3	hv: nested: fix bug in syncing EPTP from VMCS12 to VMCS02 Currently vmptrld_vmexit_handler() doesn't sync VMX_EPT_POINTER_FULL from vmcs12 to vmcs02, instead it sets gpa_field_dirty and relies on nested_vmentry() to sync EPTP in next nested VMentry. This creates readability issue since all other intercepted VMCS fields are synced in sync_vmcs12_to_vmcs02(). Another issue is that other VMCS fields managed by gpa_field_dirty are repeatedly synced in both vmptrld and nested vmentry handler. This patch moves get_nept_desc() ahead of sync_vmcs12_to_vmcs02(), such that shadow_eptp is allocated before sync_vmcs12_to_vmcs02() which can sync EPTP properly. BTW, in nested_vmexit_handler(), don't need to read from VMCS to get the exit reason, since vcpu->arch.exit_reason has it already. Tracked-On: #5923 Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-09-13 15:50:01 +08:00
Zide Chen	11c2f3eabb	hv: check bitmap before calling bitmap_test_and_clear_lock() The locked btr instruction is expensive. This patch changes the logic to ensure that the bitmap is non-zero before executing bitmap_test_and_clear_lock(). The VMX transition time gets significant improvement. SOS running on TGL, the CPUID roundtrip reduces from ~2400 cycles to ~2000 cycles. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-09-02 16:09:33 +08:00
Zide Chen	aeb3690b6f	hv: simplify is_lapic_pt_enabled() is_lapic_pt_enabled() is called at least twice in one loop of the vCPU thread, and it's called in vmexit_handler() frequently if LAPIC is not pass-through. Thus the efficiency of this function has direct impact to the system performance. Since the LAPIC mode is not changed in run time, we don't have to calculate it on the fly in is_lapic_pt_enabled(). BTW, removed the unused lapic_mask from struct acrn_vcpu_arch. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-08-26 09:52:10 +08:00
Shiqing Gao	d90dbc0d91	hv: check the capability of XSAVES/XRSTORS instructions before execution For platforms that do not support XSAVES/XRSTORS instructions, like QEMU, executing these instructions causes #UD. This patch adds the check before the execution of XSAVES/XRSTORS instructions. It also refines the logic inside rstore_xsave_area for the following reason: If XSAVES/XRSTORS instructions are supported, restore XSAVE area if any of the following conditions is met: 1. "vcpu->launched" is false (state initialization for guest) 2. "vcpu->arch.xsave_enabled" is true (state restoring for guest) * Before vCPU is launched, condition 1 is satisfied. * After vCPU is launched, condition 2 is satisfied because is_valid_xsave_combination() guarantees that "vcpu->arch.xsave_enabled" is consistent with pcpu_has_cap(X86_FEATURE_XSAVES). Therefore, the check against "vcpu->launched" and "vcpu->arch.xsave_enabled" can be eliminated here. Tracked-On: #6481 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2021-08-26 09:42:23 +08:00
Zide Chen	cbf3825140	hv: Pass-through IA32_TSC_AUX MSR to L1 guest Use an unused MSR on host to save ACRN pcpu ID and avoid saving and restoring TSC AUX MSR on VMX transitions. Tracked-On: #6289 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2021-08-26 09:25:54 +08:00
Yifan Liu	d33c76f701	hv: quirks: SMBIOS passthrough for prelaunched-VM This feature is guarded under config CONFIG_SECURITY_VM_FIXUP, which by default should be disabled. This patch passthrough native SMBIOS information to prelaunched VM. SMBIOS table contains a small entry point structure and a table, of which the entry point structure will be put in 0xf0000-0xfffff region in guest address space, and the table will be put in the ACPI_NVS region in guest address space. v2 -> v3: uuid_is_equal moved to util.h as inline API result -> pVendortable, in function efi_search_guid recalc_checksum -> generate_checksum efi_search_smbios -> efi_search_smbios_eps scan_smbios_eps -> mem_search_smbios_eps EFI GUID definition kept Tracked-On: #6320 Signed-off-by: Yifan Liu <yifan1.liu@intel.com>	2021-08-26 09:24:50 +08:00
Zide Chen	6d7eb6d7b6	hv: emulate IA32_EFER and adjust Load EFER VMX controls This helps to improve performance: - Don't need to execute VMREAD in vcpu_get_efer(), which is frequently called. - VMX_EXIT_CTLS_SAVE_EFER can be removed from VM-Exit Controls. - If the value of IA32_EFER MSR is identical between the host and guest (highly likely), adjust the VMX controls not to load IA32_EFER on VMExit and VMEntry. It's convenient to continue use the exiting vcpu_s/get_efer() APIs, other than the common vcpu_s/get_guest_msr(). Tracked-On: #6289 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com>	2021-08-24 11:16:53 +08:00
Liang Yi	2b3620de7d	hv: mask off LA57 in cpuid Mask off support of 57-bit linear addresses and five-level paging. ICX-D has LA57 but ACRN doesn't support 5-level paging yet. Tracked-On: #6357 Signed-off-by: Liang Yi <yi.liang@intel.com> Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2021-08-20 11:02:21 +08:00
Shiqing Gao	651d44432c	hv: initialize the XSAVE related processor state for guest If SOS is using kernel 5.4, hypervisor got panic with #GP. Here is an example on KBL showing how the panic occurs when kernel 5.4 is used: Notes: * Physical MSR_IA32_XSS[bit 8] is 1 when physical CPU boots up. * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is initialized to 0. Following thread switches would happen at run time: 1. idle thread -> vcpu thread context_switch_in happens and rstore_xsave_area is called. At this moment, vcpu->arch.xsave_enabled is false as vcpu is not launched yet and init_vmcs is not called yet (where xsave_enabled is set to true). Thus, physical MSR_IA32_XSS is not updated with the value of guest MSR_IA32_XSS. States at this point: * Physical MSR_IA32_XSS[bit 8] is 1. * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0. 2. vcpu thread -> idle thread context_switch_out happens and save_xsave_area is called. At this moment, vcpu->arch.xsave_enabled is true. Processor state is saved to memory with XSAVES instruction. As physical MSR_IA32_XSS[bit 8] is 1, ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is set to 1 after the execution of XSAVES instruction. States at this point: * Physical MSR_IA32_XSS[bit 8] is 1. * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0. * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1. 3. idle thread -> vcpu thread context_switch_in happens and rstore_xsave_area is called. At this moment, vcpu->arch.xsave_enabled is true. Physical MSR_IA32_XSS is updated with the value of guest MSR_IA32_XSS, which is 0. States at this point: * Physical MSR_IA32_XSS[bit 8] is 0. * vcpu_get_guest_msr(vcpu, MSR_IA32_XSS)[bit 8] is 0. * ectx->xs_area.xsave_hdr.hdr.xcomp_bv[bit 8] is 1. Processor state is restored from memory with XRSTORS instruction afterwards. According to SDM Vol1 13.12 OPERATION OF XRSTORS, a #GP occurs if XCOMP_BV sets a bit in the range 62:0 that is not set in XCR0 \| IA32_XSS. So, #GP occurs once XRSTORS instruction is executed. Such issue does not happen with kernel 5.10. Because kernel 5.10 writes to MSR_IA32_XSS during initialization, while kernel 5.4 does not do such write. Once guest writes to MSR_IA32_XSS, it would be trapped to hypervisor, then, physical MSR_IA32_XSS and the value of MSR_IA32_XSS in vcpu->arch.guest_msrs are updated with the value specified by guest. So, in the point 2 above, correct processor state is saved. And #GP would not happen in the point 3. This patch initializes the XSAVE related processor state for guest. If vcpu is not launched yet, the processor state is initialized according to the initial value of vcpu_get_guest_msr(vcpu, MSR_IA32_XSS), ectx->xcr0, and ectx->xs_area. With this approach, the physical processor state is consistent with the one presented to guest. Tracked-On: #6434 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Reviewed-by: Li Fei1 <fei1.li@intel.com>	2021-08-20 09:46:09 +08:00
Zide Chen	2e6cf2b85b	hv: nested: fix bugs in init_vmx_msrs() Currently init_vmx_msrs() emulates same value for the IA32_VMX_xxx_CTLS and IA32_VMX_TRUE_xxx_CTLS MSRs. But the value of physical MSRs could be different between the pair, and we need to adjust the emulated value accordingly. Tracked-On: #6289 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2021-08-20 09:40:50 +08:00

1 2 3 4 5 ...

1283 Commits