acrn-hypervisor

mirror of https://github.com/projectacrn/acrn-hypervisor.git synced 2025-07-02 18:22:55 +00:00

Author	SHA1	Message	Date
Zide Chen	742abaf2e6	hv: add sanity check for vuart configuration - target vm_id of vuart can't be un-defined VM, nor the VM itself. - fix potential NULL pointer dereference in find_active_target_vuart() Tracked-On: #3854 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-30 09:24:59 +08:00
Victor Sun	c6f7803f06	HV: restore lapic state and apic id upon INIT Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving INIT. The local APIC ID register should also be preserved. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	ab13228591	HV: ensure valid vcpu state transition The vcpu state machine transitions should follow the rules below (old vcpu state --- operation --> new vcpu state): VCPU_OFFLINE --- create_vcpu --> VCPU_INIT; VCPU_INIT --- launch_vcpu --> VCPU_RUNNING; VCPU_RUNNING --- pause_vcpu --> VCPU_PAUSED; VCPU_PAUSED --- resume_vcpu --> VCPU_RUNNING; VCPU_RUNNING/PAUSED --- pause_vcpu --> VCPU_ZOMBIE; VCPU_INIT --- pause_vcpu --> VCPU_ZOMBIE; VCPU_ZOMBIE --- reset_vcpu --> VCPU_INIT; VCPU_ZOMBIE --- offline_vcpu --> VCPU_OFFLINE. Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	a5158e2c16	HV: refine reset_vcpu api The patch abstracts a vcpu_reset_internal() api for internal usage; the function does not touch any vcpu state transition and only does the vcpu reset processing. It will be called by create_vcpu() and reset_vcpu(). reset_vcpu() acts as a public api and should be called only when a vcpu receives INIT or on vm reset/resume from S3. It should not be called during shutdown_vm() or hcall_sos_offline_cpu(), so the patch removes reset_vcpu() from shutdown_vm() and hcall_sos_offline_cpu(). The patch also introduces a reset_mode enum so that vcpu and vlapic can do different context operations according to the reset mode; Tracked-On: #4267 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	d1a46b8289	HV: rename function of vlapic_xxx_write_handler Rename vlapic_xxx_write_handler() to vlapic_write_xxx() to make code more readable; Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	9ecac8629a	HV: clean up redundant macro in lapic.h Some MACROs in lapic.h are duplicated with apicreg.h, and some MACROs are never referenced, remove them. Tracked-On: #4268 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Victor Sun	46ed0b1582	HV: correct apic lvt reset value Per SDM 10.4.7.1 vol3, the LVT registers should be reset to 0s except for the mask bits, which are set to 1s. In the current code, lvt_last[] is already set to the correct value (i.e. 0x10000) in vlapic_reset(), but a later loop forces vlapic->lvt_last[i] to 0U, so reading the LVT regs after reset returns zero, which is non-compliant with the SDM; Tracked-On: #4266 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-27 12:27:08 +08:00
Yin Fengwei	e5117bf19a	vm: add severity for vm_config Add severity definitions for different scenarios. The static guest severity is defined according to guest configurations. Also add sanity check to make sure the severity for all guests are correct. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	f7df43e7cd	reset: detect highest severity guest dynamically For guest reset, a reset of the highest severity guest resets the system. There is a vm flag to call out the highest severity guest in a specific scenario, which is a static guest severity assignment. There are cases where the static highest severity guest is shut down and the highest severity role should transfer to another guest. For example, in the ISD scenario, if the RTVM (static highest severity guest) is shut down, the SOS should be the highest severity guest instead. is_highest_severity_vm() is updated to detect the highest severity guest dynamically and to promote the highest severity guest reset to a system reset. Also remove the GUEST_FLAG_HIGHEST_SEVERITY definition. Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Yin Fengwei	bfa19e9104	pm: S5: update the system shutdown logic in ACRN For system S5, ACRN assumed that SOS shutdown would trigger system shutdown. So the system shutdown logic was: 1. Trap SOS shutdown 2. Wait for all other guests to shut down 3. Shut down the system The new logic is refined as: if all guests are shut down, shut down the whole system Tracked-On: #4270 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-12-23 15:15:09 +08:00
Li Fei1	a90e0f6c84	hv: vpci: restore PCI BARs when doing PCIe FLR ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some status before doing the FLR and restore them later, only BARs values for now. This patch will trap guest Device Capabilities Register write operation if the device supports PCI Express Capability and check whether it wants to do device FLR. If it does, call pdev_do_flr to do the job. Tracked-On: #3465 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-23 10:14:37 +08:00
Kaige Fu	5f9d1379bc	HV: Remove INIT signal notification related code We don't use INIT signal notification method now. This patch removes them. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	6d1f63aef0	HV: Use NMI to replace INIT signal for lapic-pt VMs S5 We have implemented a new notification method using NMI. So replace the INIT notification method with the NMI one. Then we can remove INIT notification related code later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	a13909cedc	HV: Use NMI-window exiting to address req missing issue There is a window where we may miss the current request in the notification period when the work flow is as follows: 1. CPUr handles its pending requests. 2. CPUx sets the req flag. 3. CPUx sends an NMI to CPUr. 4. CPUr handles the NMI. 5. CPUr enters the vCPU without handling the newly set request. So, this patch enables the NMI-window exiting to trigger the next vmexit once there is no "virtual-NMI blocking" after the vCPU enters VMX non-root mode. Then we can process the pending request on time. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	40ba7e8686	HV: Don't make NMI injection req when notifying vCPU The NMI for notification should not be injected into the guest. So, this patch drops the NMI injection request when we use NMI to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well and there is no well-designed way to check whether an NMI is for notification or for the guest now. So, we treat all NMIs as notification NMIs for hard rtvm temporarily. It means that the hard rtvm will never receive an NMI with this patch applied. TODO: vNMI support is not ready yet. We will add it later. Tracked-On: #3886 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Kaige Fu	72f7f69c47	HV: Use NMI to kick lapic-pt vCPU's thread ACRN hypervisor needs to kick vCPU off VMX non-root mode to do some operations in hypervisor, such as interrupt/exception injection, EPT flush etc. For non lapic-pt vCPUs, we can use IPI to do so. But, it doesn't work for lapic-pt vCPUs as the IPI will be injected to VMs directly without vmexit. Without the way to kick the vCPU off VMX non-root mode to handle pending request on time, there may be fatal errors triggered. 1). Certain operation may not be carried out on time which may further lead to fatal errors. Taking the EPT flush request as an example, once we don't flush the EPT on time and the guest access the out-of-date EPT, fatal error happens. 2). ACRN now will send an IPI with vector 0xF0 to target vCPU to kick the vCPU off VMX non-root mode if it wants to do some operations on target vCPU. However, this way doesn't work for lapic-pt vCPUs. The IPI will be delivered to the guest directly without vmexit and the guest will receive a unexpected interrupt. Consequently, if the guest can't handle this interrupt properly, fatal error may happen. The NMI can be used as the notification signal to kick the vCPU off VMX non-root mode for lapic-pt vCPUs. So, this patch uses NMI as notification signal to address the above issues for lapic-pt vCPUs. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-17 09:45:52 +08:00
Shiqing Gao	3cee259583	hv: msr: remove redundant check in write_pat_msr Reserved bits in a 8-bit PAT field has been checked in pat_mem_type_invalid. Remove this redundant check "(PAT_FIELD_RSV_BITS & field) != 0UL" in write_pat_msr. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-16 14:32:42 +08:00
Victor Sun	5702619620	HV: kconfig: add range check for memory setting When users run make menuconfig to configure memory-related Kconfig items, range checks are needed to avoid compile errors or other potential issues: CONFIG_LOW_RAM_SIZE: (0 ~ 0x10000) the value should be less than 64KB; CONFIG_HV_RAM_SIZE: (0x1000000 ~ 0x10000000) the hypervisor RAM size should be between 16MB and 256MB; CONFIG_PLATFORM_RAM_SIZE: (0x100000000 ~ 0x4000000000) the platform RAM size should be larger than 4GB and less than 256GB; CONFIG_SOS_RAM_SIZE: (0x100000000 ~ 0x4000000000) the SOS RAM size should be larger than 4GB and less than 256GB; CONFIG_UOS_RAM_SIZE: (0 ~ 0x2000000000) the UOS RAM size should be less than 128GB; Tracked-On: #4229 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-12-16 09:36:44 +08:00
Victor Sun	64bbd37fd7	HV: Kconfig: set default Kata num to 1 in SDC Set default CONFIG_KATA_VM_NUM to 1 in SDC scenario so that user could have a try on Kata container without rebuilding hypervisor. Please be aware that vcpu affinity of VM1 in CPU partition mode would be impacted by this patch. Tracked-On: #4232 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-16 09:36:44 +08:00
Kaige Fu	2777f23075	HV: Add helper function send_single_nmi This patch adds a helper function send_single_nmi. The first caller will soon come with the following patch. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	525d4d3cd0	HV: Install a NMI handler in acrn IDT This patch installs a NMI handler in acrn IDT to handle NMIs out of dispatch_exception. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Kaige Fu	fb346a6c11	HV: refine excp/external_interrupt_save_frame and excp_rsvd There are lines of repeated codes in excp/external_interrupt_save_frame and excp_rsvd. So, this patch defines two .macro, save_frame and restore_frame, to reduce the repeated codes. No functional change. Tracked-On: #3886 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-13 10:13:09 +08:00
Mingqiang Chi	7f96465407	hv:remove need_cleanup flag in create_vm remove this redundancy flag. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 16:34:13 +08:00
Victor Sun	67ec1b7708	HV: expose port 0x64 read for SOS VM The port 0x64 is the status register of i8042 keyboard controller. When i8042 is defined as ACPI PnP device in BIOS, enforce returning 0xff in read handler would cause infinite loop when booting SOS VM, so expose the physical port read in this case; Tracked-On: #4228 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:51:24 +08:00
Victor Sun	a44c1c900c	HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig In the current architecture, the maximum number of vCPUs per VM cannot exceed the number of pCPUs. Given that the MAX_PCPU_NUM macro is provided in board configurations, remove MAX_VCPUS_PER_VM from Kconfig and add a MAX_VCPUS_PER_VM macro that references MAX_PCPU_NUM directly. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Victor Sun	ea3476d22d	HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM rename the macro since MAX_PCPU_NUM could be parsed from board file and it is not a configurable item anymore. Tracked-On: #4230 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-12 13:49:28 +08:00
Shiqing Gao	e95b316dd0	hv: vtd: fix improper use of DMAR_GCMD_REG The initialization of "dmar_unit->gcmd" shall be done via reading from Global Status Register rather than Global Command Register. Rationale: According to Chapter 10.4.4 Global Command Register in VT-d spec, Global Command Register is a write-only register to control remapping hardware. Global Status Register is the corresponding read-only register to report remapping hardware status. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-12-12 09:11:04 +08:00
Vijay Dhanraj	c8a4ca6c78	HV: Extend non-contiguous HPA for hybrid scenario This patch extends non-contiguous HPA allocations for pre-launched VMs in hybrid scenario. Tracked-On: #4217 Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 10:12:46 +08:00
Shuo A Liu	b32ae229fb	hv: sched: use hypervisor configuration to choose scheduler For now, we set NOOP scheduler as default. User can choose IORR scheduler as needed. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-11 09:31:39 +08:00
Li Fei1	c2c05a29da	hv: vlapic: kick targeted vCPU off if interrupt trigger mode has changed In APICv advanced mode, a targeted vCPU running in non-root mode may get an outdated TMR and EOI exit bitmap if another vCPU sends an interrupt to it and the trigger mode of this interrupt has changed. This patch tries to kick the vCPU off so that it gets the latest TMR and EOI exit bitmap when it enters non-root mode again if the trigger mode of the newly arriving interrupt has changed. Then fill the interrupt into the PIR. Tracked-On: #4200 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-10 09:07:54 +08:00
Vijay Dhanraj	ed65ae61c6	HV: Kconfig changes to support server platform. This patch updates kconfig to support server platforms for increased number of VCPUs per VM and PT IRQ number. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4196	2019-12-09 11:29:34 +08:00
Vijay Dhanraj	6e8b413689	HV: Add support to assign non-contiguous HPA regions for pre-launched VM On some platforms, HPA regions for Virtual Machine can not be contiguous because of E820 reserved type or PCI hole. In such cases, pre-launched VMs need to be assigned non-contiguous memory regions and this patch addresses it. To keep things simple, current design has the following assumptions, 1. HPA2 always will be placed after HPA1 2. HPA1 and HPA2 don’t share a single ve820 entry. (Create multiple entries if needed but not shared) 3. Only support 2 non-contiguous HPA regions (can extend at a later point for multiple non-contiguous HPA) Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com> Tracked-On: #4195 Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:28:38 +08:00
Zide Chen	03a1b2a717	hypervisor: handle reboot from non-privileged pre-launched guests To handle reboot requests from pre-launched VMs that don't have GUEST_FLAG_HIGHEST_SEVERITY, we shutdown the target VM explicitly other than ignoring them. Tracked-On: #2700 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-12-09 11:27:32 +08:00
Li Fei1	da3ba68cb6	hv: remove corner case in ptirq_prepare_msix_remap ptirq_prepare_msix_remap was called no matter whether MSI/MSI-X was enabled or not, and zero was passed in the virtual MSI/MSI-X data field input parameter to indicate that MSI/MSI-X was disabled. However, it did almost nothing in this case. Now ptirq_prepare_msix_remap is called only when MSI/MSI-X is enabled. It doesn't need to check whether MSI/MSI-X is enabled or not by checking the virtual MSI/MSI-X data field. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-12-05 16:43:22 +08:00
Shuo A Liu	72644ac2b2	hv: do not sleep a non-RUNNING vcpu It's meaningless to sleep a non-running vcpu. Add a state check before sleep the thread object of the vcpu. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Conghui Chen	d48da2af3a	hv: bugfix for debug commands with smp_call With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the smp_call handler should switch the vmcs to the target vcpu's vmcs. Then get the info. dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise, there will be #GP error. Renaming: vcpu_dumpreg -> dump_vcpu_reg switch_vmcs -> load_vmcs Tracked-On: #4178 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-05 11:19:35 +08:00
Kaige Fu	aae974b473	HV: trace leaf and subleaf of cpuid We care more about leaf and subleaf of cpuid than vcpu_id. So, this patch changes the cpuid trace-entry to trace the leaf and subleaf of this cpuid vmexit. Tracked-On: #4175 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-12-03 16:34:14 +08:00
Yonghua Huang	450d2cf2e9	hv: trap RDPMC instruction execution from any guest PMU is hidden from any guest, so UD is expected when a guest tries to execute the 'rdpmc' instruction. This patch sets 'RDPMC exiting' in the Processor-based VM-execution control. Tracked-On: #3453 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 14:14:27 +08:00
Binbin Wu	3d412266bc	hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC Determinism is important for RTVM. The mitigation for MCE on Page Size Change converts a large page to 4KB pages at runtime during the vmexit triggered by an instruction fetch in the large page. These vmexits increase nondeterminism, which should be avoided for RTVM. This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	0570993b40	hv: config: add an option to disable mce on psc workaround Add a option MCE_ON_PSC_WORKAROUND_DISABLED to disable the software workaround for the issue Machine Check Error on Page Size Change. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Binbin Wu	192859ee02	hv: ept: apply MCE on page size change mitigation conditionally Only apply the software workaround on the models that might be affected by MCE on page size change. For models that are known to be immune to the issue, the mitigation is turned off. Atom processors are not affected by the issue. Also check the CPUID & MSR to determine whether the model is immune to the issue: CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1. In other cases not listed above, the CPU may be vulnerable. This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U since the MSR is 64bit. Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-03 09:17:04 +08:00
Shuo A Liu	3cb32bb6e3	hv: make init_vmcs as a event of VCPU After changing init_vmcs to the smp call approach and doing it before launch_vcpu, it works with the noop scheduler. On a real sharing scheduler, it has a problem. Example with vmBvcpu0 on pcpu0, and vmAvcpu1 and vmBvcpu1 both on pcpu1: 1. vmAvcpu1 does vmentry on pcpu1. 2. pcpu0 calls init_vmcs(vmBvcpu1). 3. pcpu1 takes a vmexit and runs do_init_vmcs, which corrupts the current vmcs. 4. vmentry fails on pcpu1. 5. pcpu0 calls launch_vcpu(vmBvcpu1). This patch marks an event flag when vmcs init is requested for a specific vcpu. When the vcpu is running and checking pending events, it will do init_vmcs first. Tracked-On: #4178 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:43 +08:00
Victor Sun	15da33d8af	HV: parse default pci mmcfg base The default PCI mmcfg base is stored in ACPI MCFG table, when CONFIG_ACPI_PARSE_ENABLED is set, acpi_fixup() function will parse and fix up the platform mmcfg base in ACRN boot stage; when it is not set, platform mmcfg base will be initialized to DEFAULT_PCI_MMCFG_BASE which generated by acrn-config tool; Please note we will not support platform which has multiple PCI segment groups. Tracked-On: #4157 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:20:24 +08:00
Yan, Like	0d998d6ac6	hv: sync physical and virtual TSC_DEADLINE when msr interception enabled/disabled Starting with TSC_DEADLINE msr interception disabled, the virtual TSC_DEADLINE msr is always 0. When the interception is enabled, need to sync the physical TSC_DEADLINE value to virtual TSC_DEADLINE. When the interception is disabled, there are 2 cases: - if the timer hasn't expired, sync virtual TSC_DEADLINE to physical TSC_DEADLINE, to make the guest read the same tsc_deadline as it writes. This may change when the timer actually trigger. - if the timer has expired, write 0 to the virtual TSC_DEADLINE. Tracked-On: #4162 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:10:50 +08:00
Yan, Like	97916364fc	hv: fix virtual TSC_DEADLINE msr read/write issues When write to virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero: - when guest intends to disarm the tsc_deadline timer, should not arm the timer falsely; - when guest intends to arm the tsc_deadline timer, should not disarm the timer falsely. When read from virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero: - if physical TSC_DEADLINE is not zero, return the virtual TSC_DEADLINE value; - if physical TSC_DEADLINE is zero which means it's not armed (automatically disarmed after timer triggered), return 0 and reset the virtual TSC_DEADLINE. Tracked-On: #4162 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 16:10:50 +08:00
Conghui Chen	e61412981d	hv: support xsave in context switch xsave area: legacy region: 512 bytes xsave header: 64 bytes extended region: < 3k bytes So, pre-allocate 4k area for xsave. Use certain instruction to save or restore the area according to hardware xsave feature set. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Conghui Chen	8ba203a165	hv: change xsave init function name change pcpu_xsave_init to init_pcpu_xsave. Tracked-On: #4166 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-12-02 09:31:12 +08:00
Li Fei1	6ee076f7df	hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry does. ptirq_msix_remap only did the preparation. Tracked-On: #3475 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-29 08:53:07 +08:00
Geoffroy Van Cutsem	51a43dab79	hv: add Kconfig parameter to define the Service VM EFI bootloader Add a Kconfig parameter called UEFI_OS_LOADER_NAME to hold the Service VM EFI bootloader to be run by the ACRN hypervisor. A new string manipulation function to convert from (char *) to (CHAR16 *) has been added to facilitate the implementation. The default value is set to systemd-boot (bootloaderx64.efi) Tracked-On: #2793 Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>	2019-11-27 10:38:49 +08:00
Alexander Merritt	94a456ae24	HV: refactor device_to_dmaru On server platforms, DMAR DRHD device scope entries may contain PCI bridges. Bridges in the DRHD device scope indicate this IOMMU translates for all devices on the hierarchy below that bridge. ACRN is unaware of bridge types in the device scope, and adds these directly to its internal representation of a DRHD. When looking up a BDF within these DRHD entries, device_to_dmaru assumes all entries are Endpoints, comparing BDF to BDF. Thus device to DMAR unit fails, because it treats a bridge as an Endpoint type. This change leverages prior patches by converting a BDF to the associated device DRHD index, and uses that index to obtain the correct DRHD state. Handling a bridge in other ways may require maintaining a bus list for each, or replacing each bridge in the dev scope with a set of all device BDFs underneath it. Server platforms can have hundreds of PCI devices, thus making the device scope artificially large is unwieldy. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Sainath Grandhi	c5a87d41df	HV: Cleanup PCI segment usage from VT-d interfaces ACRN does not support multiple PCI segments in its current form. But VT-d module uses segment info in its interfaces and hardcodes it to 0. This patch cleans up everything related to segment to avoid ambiguity. Tracked-On: #4134 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-11-27 09:49:32 +08:00
Alexander Merritt	810169ad20	HV: initialize IOMMU before PCI device discovery In later patches we use information from DMAR tables to guide discovery and initialization of PCI devices. Tracked-On: #4134 Signed-off-by: Alexander Merritt <alex.merritt@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-27 09:49:32 +08:00
Mingqiang Chi	32b8d99f48	hv:panic if there is no memory map in multiboot info add panic if there is no memory map info during booting. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-11-26 16:16:23 +08:00
Binbin Wu	fa3888c12a	hv: ept: disable execute right on large pages Issue description: ----------------- Machine Check Error on Page Size Change Instruction fetch may cause machine check error if page size and memory type was changed without invalidation on some processors[1][2]. Malicious guest kernel could trigger this issue. This issue applies to both primary page table and extended page tables (EPT), however the primary page table is controlled by hypervisor only. This patch mitigates the situation in EPT. Mitigation details: ------------------ Implement non-execute huge pages in EPT. This patch series clears the execute permission (bit 2) in the EPT entries for large pages. When EPT violation is triggered by guest instruction fetch, hypervisor converts the large page to smaller 4 KB pages and restore the execute permission, and then re-execute the guest instruction. The current patch turns on the mitigation by default. The follow-up patches will conditionally turn on/off the feature per processor model. [1] Refer to erratum KBL002 in "7th Generation Intel Processor Family and 8th Generation Intel Processor Family for U Quad Core Platforms Specification Update" https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf [2] Refer to erratum SKL002 in "6th Generation Intel Processor Family Specification Update" https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html Tracked-On: #4101 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-11-13 08:00:36 +08:00
Peter Fang	b7329f10a5	hv: instr_emul: use cs segment when fetching instructions In non-64-bit mode, CS segment base address should be considered when determining the linear address of the vcpu's instruction pointer. Use vie_calculate_gla() for instruction address translation which also takes care of 64-bit mode. Tracked-On: #4064 Signed-off-by: Peter Fang <peter.fang@intel.com>	2019-11-11 13:55:24 +08:00
Mingqiang Chi	8666ba6c01	hv:remove unnecessary wrapper for emulate_instruction remove unnecessary wrapper for this api(emulate_instruction) Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-11-09 11:43:37 +08:00
Yonghua Huang	0eb427f122	hv:refine 'uint64_t' string print format in comm module Use "0x%lx" string to format 'uint64_t' type value, instead of "0x%llx". Tracked-On: #4020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-11-09 11:42:38 +08:00
Yonghua Huang	e51386fe04	hv: refine 'uint64_t' string print format in x86 module Use "0x%lx" string to format 'uint64_t' type value, instead of "0x%llx". Tracked-On: #4020 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-11-09 11:42:38 +08:00
Victor Sun	3411f00b5b	HV: fix misra violation on platform clos array MISRA C requires specified bounds for arrays declaration, previous declaration of platform_clos_array in board.h does not meet the requirement. Tracked-On: #3987 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	c77d275e9d	HV: clean up DMAR MACROs for sample platform acpi info Remove redundant DMAR MACROs for given platform_acpi_info files because CONFIG_ACPI_PARSE_ENABLED is enabled for all boards by default. The DMAR info for nuc7i7dnb is kept as reference in the case that ACPI_PARSE_ENABLED is not set in Kconfig. As DMAR info is not provided for apl-mrb, the platform_acpi_info.h under apl-mrb config folder is meaningless, so also remove this file and let hypervisor parse ACPI for apl-mrb; Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	9e92f3cdf5	HV: move dmar info definition to board.c The DMAR info is board specific so move the structure definition to board.c. As a configuration file, the whole board.c could be generated by acrn-config tool for each board. Please note we only provide DMAR info MACROs for nuc7i7dnb board. For other boards, ACPI_PARSE_ENABLED must be set to y in Kconfig to let hypervisor parse DMAR info, or use acrn-config tool to generate DMAR info MACROs if user won't enable ACPI parse code for FuSa consideration. The patch also moves the function of get_dmar_info() to vtd.c, so dmar_info.c could be removed. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Victor Sun	589be88cf6	HV: link CONFIG_MAX_IOMMU_NUM and MAX_DRHDS to DRHD_COUNT The values of CONFIG_MAX_IOMMU_NUM and MAX_DRHDS are identical to DRHD_COUNT, which is defined in the platform ACPI table, so remove CONFIG_MAX_IOMMU_NUM from Kconfig and link these three MACROs together. Tracked-On: #3977 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 16:40:14 +08:00
Conghui Chen	75f512ce8c	hv: rename vuart operations fifo_reset -> reset_fifo vuart_fifo_init -> init_fifo vuart_setup - > setup_vuart vuart_init -> init_vuart vuart_deinit -> deinit_vuart vuart_lock_init -> init_vuart_lock vuart_lock -> obtain_vuart_lock vuart_unlock -> release_vuart_lock vuart_deinit_connect -> vuart_deinit_connection Tracked-On: #4017 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-08 09:01:01 +08:00
Kaige Fu	20c1ad1b3a	HV: correct the formatting flag of hypcall_id hypcall_id has a type of uint64_t and should use 'llx' as formatting flag instead of '%d'. Otherwise, we will get a confusing error log when not-allowed hypercall occurs. Without this patch: [96707209us][cpu=1][sev=3][seq=2386]:hypercall -2147483548 is only allowed from SOS_VM! With this patch: [84613395us][cpu=1][sev=3][seq=2136]:hypercall 0x80000064 is only allowed from SOS_VM! So, we can figure out which not-allowed hypercall has been triggered more conveniently. BTW, this patch adds hypcall_id which triggered from non-ring0 into error log. Tracked-On: #4012 Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-11-07 15:01:21 +08:00
Li Fei1	8189d1f01c	hv: mmu: filter e820 which is over top address space Now the default board memory size is 16 GB. However, ACRN supports more and more boards which may have memory size larger than 16 GB. This patch tries to filter e820 table which is over top address space. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Li Fei1	620a1c5215	hv: mmu: rename e820 to hv_e820 Now the e820 structure stores ACRN HV memory layout, not the physical memory layout. Rename e820 to hv_e820 to show this explicitly. Tracked-On: #4007 Signed-off-by: Li Fei1 <fei1.li@intel.com>	2019-11-07 08:47:02 +08:00
Yonghua Huang	8227804b09	hv:Unmap AP trampoline region from service VM's EPT AP trampoline code should be accessible to hypervisor only, this patch is to unmap this region from service VM's EPT for security reason. Tracked-On: #3992 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-05 15:14:13 +08:00
Yonghua Huang	d74497eb17	hv:refine modify_or_del_pte/pde/pdpte()function 1. Print warning message instead of ASSERT when the caller tries to modify the attribute for memory region that is not present. 2. To avoid above warning message for memory region below 1M, its attribute may be updated by Service VM when updating MTRR setting. Tracked-On: #3992 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Fei Li <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-11-05 15:14:13 +08:00
Kaige Fu	c22f899a5e	HV: Fix poweroff issue of hard RTVM We should use INIT signal to notify the vcpu threads when powering off the hard RTVM. To achieve this, we should set the vcpu->thread_obj.notify_mode as SCHED_NOTIFY_INIT. Patch (`27163df9` hv: sched: add sleep/wake for thread object) tries to set the notify_mode according to `is_lapic_pt_enabled(vcpu)` in function prepare_vcpu. But at this point, the is_lapic_pt_enabled(vcpu) will always return false. Consequently, it will set notify_mode as SCHED_NOTIFY_IPI, which then leads to the failure of powering off hard RTVM. This patch fixes it by: - Initialize the notify_mode as SCHED_NOTIFY_IPI in prepare_vcpu. - Set notify_mode as SCHED_NOTIFY_INIT after guest is trying to enable x2apic mode of passthru lapic. Tracked-On: #3975 Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-11-04 10:28:16 +08:00
Li, Fei1	9d26dab6d6	hv: mmio: add a lock to protect mmio_node access After adding PCI BAR remap support, mmio_node may unregister when there's others access it. This patch add a lock to protect mmio_node access. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-11-01 14:44:11 +08:00
Huihuang Shi	5d662ea11f	hv: fixed by replace ull to ul. ul is used as immediate integer suffix with type uint64_t. Tracked-On: #3214 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-10-31 09:02:59 +08:00
Li, Fei1	2c158d5ad4	hv: io: add unregister_mmio_emulation_handler API Since guest could re-program PCI device MSI-X table BAR, we should add mmio emulation handler unregister. However, after add unregister_mmio_emulation_handler API, emul_mmio_regions is no longer accurate. Just replace it with max_emul_mmio_regions which records the max index of the emul_mmio_node. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-10-29 14:49:55 +08:00
Li, Fei1	dc1e2adaec	hv: vpci: add PCI BAR re-program address check In theory, guest could re-program PCI BAR address to any address. However, ACRN hypervisor only support [0, top_address_space) EPT memory mapping. So we need to check whether the PCI BAR re-program address is within this scope. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-29 14:49:55 +08:00
Sainath Grandhi	f01aad7e77	hv: Let trampoline execution use 1GB pages ACRN currently uses 2MB large pages in the page tables setup for trampoline code and data. This patch lets ACRN use 1GB large pages instead. When it comes to fixing symbols in trampoline code, fixing pointers in PDPT is no more needed as PDPT PTEs contain Physical Address. Tracked-On: #3899 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-28 13:44:32 +08:00
Shuo A Liu	5f8e7a6cb7	hv: sched: add kick_thread to support notification kick means to notify one thread_object. If the target thread object is running, send a IPI to notify it; if the target thread object is runnable, make reschedule on it. Also add kick_vcpu API in vcpu layer to notify vcpu. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Shuo A Liu	f04c491259	hv: sched: decouple scheduler from schedule framework This patch decouple some scheduling logic and abstract into a scheduler. Then we have scheduler, schedule framework. From modularization perspective, schedule framework provides some APIs for other layers to use, also interact with scheduler through scheduler interfaces. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-25 13:00:21 +08:00
Yonghua Huang	2e62ad9574	hv[v2]: remove registration of default port IO and MMIO handlers - The default behaviors of PIO & MMIO handlers are same for all VMs, no need to expose dedicated APIs to register default handlers for SOS and prelaunched VM. Tracked-On: #3904 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2019-10-24 13:21:19 +08:00
Mingqiang Chi	d81872ba18	hv:Change the function parameter for init_ept_mem_ops Currently the parameter of init_ept_mem_ops is 'struct acrn_vm vm' for this api,change it to 'struct memory_ops mem_ops' and 'vm_id' to avoid the reversed dependency, page.c is hardware layer and vm structure is its upper-layer stuff. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:48:30 +08:00
Shuo A Liu	0f70a5ca3a	hv: sched: decouple idle stuff from schedule module Let init thread end with run_idle_thread(), then idle thread take over and start to do scheduling. Change enter_guest_mode() to init_guest_mode() as run_idle_thread() is removed out of it. Also add run_thread() in schedule module to run thread_object's thread loop directly. rename: switch_to_idle -> run_idle_thread Tracked-On: #3813 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	27163df9b1	hv: sched: add sleep/wake for thread object sleep one thread_object means to prevent it from being scheduled. wake one thread_object is an opposite operation of sleep. This patch also add notify_mode in thread_object to indicate how to deliver the request. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	fafd5cf063	hv: sched: move schedule initialization to each pcpu init schedule infrastructure is per pcpu, so move its initialization to each pcpu's initialization. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	dadcdcefa0	hv: sched: support vcpu context switch on one pcpu To support cpu sharing, multiple vcpu can run on same pcpu. We need do necessary vcpu context switch. This patch add below actions in context switch. 1) fxsave/fxrstor; 2) save/restore MSRs: MSR_IA32_STAR, MSR_IA32_LSTAR, MSR_IA32_FMASK, MSR_IA32_KERNEL_GS_BASE; 3) switch vmcs. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	7e66c0d4fa	hv: sched: use get_running_vcpu to replace per_cpu vcpu with cpu sharing With cpu sharing enabled, per_cpu vcpu cannot work properly as we might have multiple vcpus running on one pcpu. Add a schedule API sched_get_current to get current thread_object on specific pcpu, also add a vcpu API get_running_vcpu to get corresponding vcpu of the thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	891e46453d	hv: sched: move pcpu_id from acrn_vcpu to thread_object With cpu sharing enabled, we will map acrn_vcpu to thread_object in scheduling. From modularization perspective, we'd better hide the pcpu_id in acrn_vcpu and move it to thread_object. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-23 12:47:08 +08:00
Shuo A Liu	f85106d1ed	hv: Do not reset vcpu thread's stack when reset_vcpu vcpu thread's stack shouldn't follow reset_vcpu to reset. There is also a bug here: while vcpu B thread set vcpu->running to false, other vcpu A thread will treat the vcpu B is paused while it has not been switch out completely, then reset_vcpu will reset the vcpu B thread's stack and corrupt its running context. This patch will remove the vcpu thread's stack reset from reset_vcpu. With the change, we need do init_vmcs between vcpu startup address be settled and scheduled in. And switch_to_idle() is not needed anymore as S3 thread's stack will not be reset. Tracked-On: #3813 Signed-off-by: Fengwei Yin <fengwei.yin@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-10-23 12:47:08 +08:00
Jian Jun Chen	1d194ede61	hv: support reference time enlightenment Two time related synthetic MSRs are implemented in this patch. Both of them are partition wide MSR. - HV_X64_MSR_TIME_REF_COUNT is read only and it is used to return the partition's reference counter value in 100ns units. - HV_X64_MSR_REFERENCE_TSC is used to set/get the reference TSC page, a sequence number, an offset and a multiplier are defined in this page by hypervisor and guest OS can use them to calculate the normalized reference time since partition creation, in 100ns units. Tracked-On: #3831 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
wenwumax	048155d3d6	hv: support minimum set of TLFS This patch implements the minimum set of TLFS functionality. It includes 6 vCPUID leaves and 3 vMSRs. - 0x40000001 Hypervisor Vendor-Neutral Interface Identification - 0x40000002 Hypervisor System Identity - 0x40000003 Hypervisor Feature Identification - 0x40000004 Implementation Recommendations - 0x40000005 Hypervisor Implementation Limits - 0x40000006 Implementation Hardware Features - HV_X64_MSR_GUEST_OS_ID Reporting the guest OS identity - HV_X64_MSR_HYPERCALL Establishing the hypercall interface - HV_X64_MSR_VP_INDEX Retrieve the vCPU ID from hypervisor Tracked-On: #3832 Signed-off-by: wenwumax <wenwux.ma@intel.com> Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-10-22 10:09:16 +08:00
Mingqiang Chi	292d1a15f9	hv:Wrap some APIs related with guest pm -- change some APIs to static -- combine two APIs to init_guest_pm Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-21 10:13:02 +08:00
Jian Jun Chen	e1a2ed1727	hv: fix a bug that tpr threshold is not updated Consider the following case when TPR shadow is used with vlapic basic mode: 1) 2 interrupts are pending in vlapic. INTa's priority > TPR and INTb's priority <= TPR. 2) TPR threshold is set to zero and INTa is injected to guest. 3) Guest set TPR to the priority of INTa. 4) EOI of INTa. PPR is updated to TPR which equals INTa's priority. INTb cannot be injected because its priority <= PPR. 5) Guest set TPR to zero. Because TPR threshold is still zero, there is no TPR threshold vmexit. But since both TPR and ISRV are zero at this time, the PPR is zero as well. INTb still cannot be injected. This is a bug. By adding vcpu_make_request(vlapic->vcpu, ACRN_REQUEST_EVENT) in EOI, TPR threshold will be updated before vm_resume. Tracked-On: #3795 Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 16:40:29 +08:00
Shuo A Liu	de157ab96c	hv: sched: remove runqueue from current schedule logic Currently we are using a 1:1 mapping logic for pcpu:vcpu. So don't need a runqueue for it. Removing it as preparation work to abstract scheduler framework. Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 10:25:53 +08:00
Shuo A Liu	837e4d8788	hv: sched: rename schedule related structs and vars prepare_switch_out -> switch_out prepare_switch_in -> switch_in prepare_switch -> do_switch run_thread_t -> thread_entry_t sched_object -> thread_object sched_object.thread -> thread_object.thread_entry sched_obj -> thread_obj sched_context -> sched_control sched_ctx -> sched_ctl Tracked-On: #3813 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-10-16 10:25:53 +08:00
Binbin Wu	d19592a33e	hv: vmsr: disable prmrr related msrs in vm PRMRR related MSRs need to be configured by platform BIOS / bootloader. These settings are not allowed to be changed by guest. VMs currently have no requirement to access these MSRs even when vSGX is enabled. So, this patch disables PRMRR related MSRs in VM. Tracked-On: #3739 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-10-15 15:13:11 +08:00
Mingqiang Chi	de0a5a48d6	hv:remove some unnecessary includes --remove unnecessary includes --remove unnecessary forward-declaration for 'struct vhm_request' Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-10-15 14:40:39 +08:00
Peter Fang	28b50463c9	hv: vm: properly reset pCPUs with LAPIC PT enabled during VM shutdown/reset When a VM is configured with LAPIC PT mode and its vCPU is in x2APIC mode, the corresponding pCPU needs to be reset during VM shutdown/reset as its physical LAPIC was used by its guest. This commit fixes an issue where this reset never happens. is_lapic_pt_enabled() needs to be called before reset_vcpu() to be able to correctly reflect a vCPU's APIC mode. A vCPU with LAPIC PT mode but in xAPIC mode does not require such reset, since its physical LAPIC was not touched by its guest directly. v2 -> v3: - refine edge case detection logic v1 -> v2: - use a separate function to return the bitmap of LAPIC PT enabled pCPUs Tracked-On: #3708 Signed-off-by: Peter Fang <peter.fang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Jack Ren <jack.ren@intel.com>	2019-09-29 15:12:25 +08:00
Mingqiang Chi	187fa97e52	hv:fixed compilation error in Ubuntu it uses builtin function(__builtin_popcountl)in bitmap_weight(), it will use the 'popcnt' instruction, this patch enable 'popcnt' instruction support in Makefile Tracked-On: #3663 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-26 14:03:51 +08:00
Shiqing Gao	c8bcab9006	hv: pci: update function "bdf_is_equal" - update the function argument type to union Declaring argument as pointer is not necessary since it only does the comparison. Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-09-25 13:45:39 +08:00
Shiqing Gao	658fff27b4	hv: pci: update "union pci_bdf" - add one more field in "union pci_bdf" - remove following interfaces: * pci_bus * pci_slot * pci_func * pci_devfn Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>	2019-09-25 13:45:39 +08:00
Shuo A Liu	2096c43e5c	hv: create all VCPUs for guest when create VM To enable static configuration of different scenarios, we configure VMs in HV code and prepare all necessary resources for this VM in create VM hypercall. It means when we create one VM through hypercall, HV will read all its configuration and run it automatically. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	9a23ec6b5a	hv: remove unused pcpu assignment functions As we introduced vcpu_affinity[] to assign vcpus to different pcpus, the old policy and functions are not needed. Remove them. Tracked-On: #3663 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	1c526e6d16	hv: use vcpu_affinity[] in vm_config to support vcpu assignment Add this vcpu_affinity[] for each VM to indicate the assignment policy. With it, pcpu_bitmap is not needed, so remove it from vm_config. Instead, vcpu_affinity is a must for each VM. This patch also add some sanitize check of vcpu_affinity[]. Here are some rules: 1) only one bit can be set for each vcpu_affinity of vcpu. 2) two vcpus in same VM cannot be set with same vcpu_affinity. 3) vcpu_affinity cannot be set to the pcpu which used by pre-launched VM. v4: config SDC with CONFIG_MAX_KATA_VM_NUM v5: config SDC with CONFIG_MAX_PCPU_NUM Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	ca2540fe8c	hv: return pre-defined vcpu_num from HV to upper layer There is plan that define each VM configuration statically in HV and let DM just do VM creating and destroying. So DM need get vcpu_num information when VM creating. This patch return the vcpu_num via the API param. And also initial the VMs' cpu_num for existing scenarios. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	59e39c5fbb	hv: move MAX_PCPU_NUM from Kconfig to header file MAX_PCPU_NUM is different on various BOARDs. So we move the generic definition from Kconfig to each board's config header file. Tracked-On: #3663 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
Shuo A Liu	f4ce9cc4a2	hv: make hypercall HC_CREATE_VCPU empty Now, we create vcpus while VM being created in hypervisor. The create vcpu hypercall will not be used any more. For compatibility, keep the hypercall HC_CREATE_VCPU do nothing. v4: Don't remove HC_CREATE_VCPU hypercall, let it do nothing. Tracked-On: #3663 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-24 11:58:45 +08:00
Mingqiang Chi	489937f7b8	hv:check pcpu numbers during init_pcpu_pre it will panic if phys_cpu_num > CONFIG_MAX_PCPU_NUM during init_pcpu_pre,after that no need to check it again. Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-24 09:02:05 +08:00
Victor Sun	153a5992f5	Makefile: add build tag for acrn-config tool in version.h Add " with acrn-config" tag in build info when user build hypervisor with acrn-config xmls would be helpful to identify the hypervisor configuration in current build is from acrn-config xml or from source code. Tracked-On: #3602 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-09-20 19:39:22 +08:00
Qi Yadong	3ebeecf060	hv: save/restore TSC in host's suspend/resume path TSC would be reset to 0 when enter suspend state on some platform. This will fail the secure timer checking in secure world because secure world leverage the TSC as source of secure timer which should increase monotonically. This patch save/restore TSC in host suspend/resume path to guarantee the monotonically increasing TSC. Note: There should be no timer setup before TSC resumed. Tracked-On: #3697 Signed-off-by: Qi Yadong <yadong.qi@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-19 13:50:50 +08:00
Andy	04d5638745	Fix the second problem: The Extended Model ID needs to be examined only when the Family ID is 06H or 0FH Tracked-On:#3675 Signed-off-by: Andy <andyx.liu@intel.com>	2019-09-19 08:44:45 +08:00
Andy	857cdb0c4f	Fix the first problem: CPUID(EAX = 10H, ECX = ResID=1 or 2).EAX Bits 04 - 00: Length of the capacity bit mask for the corresponding ResID using minus-one notation Tracked-On:#3675 Signed-off-by: Andy <andyx.liu@intel.com>	2019-09-19 08:44:45 +08:00
Victor Sun	398137990e	HV: add memmap param for hvlog in sos cmdline Reserve memory for hv sbuf to avoid its possible overwriting on kernel memory. For apl-up2, move hv_log address to 0x5de00000 to avoid possible conflict with HV_RAM which start from 0x5e000000; For nuc6cayh, move HV_RAM_START to 0x20000000 to avoid possible conflict with hv_log which start from 0x1fe00000; Tracked-On: #3533 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com>	2019-09-17 09:12:03 +08:00
Mingqiang Chi	60adef33d3	hv:move down structures run_context and ext_context Now the structures(run_context & ext_context) are defined in vcpu.h,and they are used in the lower-layer modules(wakeup.S), this patch move down the structures from vcpu.h to cpu.h to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Mingqiang Chi	4f98cb03a7	hv:move down the structure intr_source Now the structures(union source & struct intr_source) are defined in ptdev.h,they are used in vtd.c and assign.c, vtd is the hardware layer and ptdev is the upper-layer module from the modularization perspective, this patch move down these structures to avoid reversed dependency. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 14:51:36 +08:00
Shuo A Liu	4742d1c747	hv: ptdev: move softirq_dev_entry_list from vm structure to per_cpu region Using per_cpu list to record ptdev interrupts is more reasonable than recording them per-vm. It makes dispatching such interrupts easier as we now do it in softirq which happens following interrupt context of each pcpu. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-16 09:36:52 +08:00
Shuo A Liu	2cc45534d6	hv: move pcpu offline request and vm shutdown request from schedule From modularization perspective, it's not suitable to put pcpu and vm related request operations in schedule. So move them to pcpu and vm module respectively. Also change need_offline return value to bool. Tracked-On: #3663 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Signed-off-by: Yu Wang <yu1.wang@intel.com> Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>	2019-09-16 09:36:52 +08:00
Yin Fengwei	6b6aa80600	hv: pm: fix coding style issue This patch fix the coding style issue introduced by previous two patches. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-09-11 17:30:24 +08:00
Yin Fengwei	f039d75998	hv: pm: enhancement platform S5 entering operation Now, we have assumption that SOS control whether the platform should enter S5 or not. So when SOS tries enter S5, we just forward the S5 request to native port which make sure platform S5 is totally aligned with SOS S5. With higher severity guest introduced, this assumption is not true any more. We need to extend the platform S5 process to handle higher severity guest: - For DM launched RTVM, we need to make sure these guests are off before putting the whole platform to S5. - For pre-launched VM, there are two cases: * if os running in it support S5, we wait for guests off. * if os running in it doesn't support S5, we expect it will invoke one hypercall to notify HV to shutdown it. NOTE: this case is not supported yet. Will add it in the future. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-09-11 17:30:24 +08:00
Yin Fengwei	ce9375874c	hv: pm: correct the function name do_acpi_s3 actually not limit to do s3 operation. It depends on the parameters pm1a_cnt_val and pm1b_cnt_val. It could be s3/s5. Update the function name from xx_s3 to xx_sx. Tracked-On: #3564 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-11 17:30:24 +08:00
Victor Sun	d188afbc59	HV: add acpi info header for nuc7i7dnb Currently nuc7i7dnb board is using default platform acpi info file so causes S3/S5 not working properly. This patch updates the correct ACPI info for nuc7i7dnb board. Tracked-On: #3609 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2019-09-11 14:00:53 +08:00
Li, Fei1	8b9aa11030	hv: mmu: remove strict check for deleting page table mapping When we support PCI MSI-X table BAR remapping, we may re-delete the MSI-X table BAR region. This patch removes strict check for deleting page table mapping. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-09-10 15:28:07 +08:00
Li, Fei1	127c73c3be	hv: mmu: add strict check for adding page table mapping The current implementation only does the "only add a page table mapping for a region when it's not mapped" check when this page table entry is a PTE entry. However, it needs to do this check for PDPTE entry and PDE entry too. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-09-10 15:28:07 +08:00
Mingqiang Chi	c691c5bd3c	hv:add volatile keyword for some variables pcpu_active_bitmap was read continuously in wait_pcpus_offline(), acrn_vcpu->running was read continuously in pause_vcpu(), add volatile keyword to ensure that such accesses are not optimised away by the compiler. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-10 11:26:35 +08:00
Yin Fengwei	81435f5504	vm reset: refine platform reset We did following to do platform reset: 1. Try ACPI reset first if it's available 2. Then try 0xcf9 reset method 3. if 2 fails, try keyboard reset method This introduces some timing concern which needs be handled carefully. We change it by following: assume the platforms which ACRN could be run on must support either ACPI reset or 0xcf9 reset. And simplify platform reset operation a little bit: If ACPI reset register is generated try ACPI reset else try 0xcf9 reset method Tracked-On: #3609 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>	2019-09-09 09:49:59 +08:00
Mingqiang Chi	cd40980d5f	hv:change function parameter for invept change the input parameter from vcpu to eptp in order to let this api more generic, no need to care normal world or secure world. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-09-05 16:32:30 +08:00
Binbin Wu	cd1ae7a89e	hv: cat: isolate hypervisor from rtvm Currently, the clos id of the cpu cores in vmx root mode is the same as non-root mode. For RTVM, if hypervisor share the same clos id with non-root mode, the cacheline may be polluted due to the hypervisor code execution when vmexit. The patch adds hv_clos in vm_configurations.c Hypervisor initializes clos setting according to hv_clos during physical cpu cores initialization. For RTVM, MSR auto load/store areas are used to switch different settings for VMX root/non-root mode for RTVM. Tracked-On: #2462 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:59:13 +08:00
Mingqiang Chi	38ca8db19f	hv:tiny cleanup -- remove some unnecessary includes -- fix a typo -- remove unnecessary void before launch_vms Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-09-05 09:58:47 +08:00
Yan, Like	f15a3600ec	hv: fix tsc_deadline correctness issue Fix tsc_deadline issue by trapping TSC_DEADLINE msr write if VMX_TSC_OFFSET is not 0. Because there is an assumption in the ACRN vART design that pTSC_Adjust and vTSC_Adjust are both 0. We can leave the TSC_DEADLINE write pass-through without correctness issue because there is no offset between the pTSC and vTSC, and there is no write to vTSC or vTSC_Adjust write observed in the RTOS so far. This commit fix the potential correctness issue, but the RT performance will be badly affected if vTSC or vTSC_Adjust was not zero, which we will address if such case happened. Tracked-On: #3636 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:58:16 +08:00
Yan, Like	3f84acda09	hv: add "invariant TSC" cap detection ACRN HV is designed/implemented with "invariant TSC" capability, which wasn't checked at boot time. This commit adds the "invariant TSC" detection, ACRN fails to boot if there wasn't "invariant TSC" capability. Tracked-On: #3636 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-09-05 09:58:16 +08:00
Shiqing Gao	f9945484a7	hv: vtd: fix MACRO typos ROOT_ENTRY_LOWER_CTP_MASK shall be (0xFFFFFFFFFFFFFUL << ROOT_ENTRY_LOWER_CTP_POS) rather than (0xFFFFFFFFFFFFFUL). Rationale: CTP is bits 63:12 in a root entry according to Chapter 9.1 Root Entry in VT-d spec. Similarly, update ROOT_ENTRY_LOWER_PRESENT_MASK to keep the coding style consistent. CTX_ENTRY_UPPER_DID_MASK shall be (0xFFFFUL << CTX_ENTRY_UPPER_DID_POS) rather than (0x3FUL << CTX_ENTRY_UPPER_DID_POS). Rationale: DID is bits 87:72 in a context entry according to Chapter 9.3 Context Entry in VT-d spec. It takes 16 bits rather than 6 bits. Tracked-On: #3626 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 12:41:53 +08:00
dongshen	295701cc55	hv: remove mptable code for pre-launched VMs Now that ACPI is enabled for pre-launched VMs, we can remove all mptable code. Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
dongshen	b447ce3d86	hv: add ACPI support for pre-launched VMs Statically define the per vm RSDP/XSDT/MADT ACPI template tables in vacpi.c, RSDP/XSDT tables are copied to guest physical memory after checksum is calculated. For MADT table, first fix up process id/lapic id in its lapic subtable, then the MADT table's checksum is calculated before it is copied to guest physical memory. Add 8-bit checksum function in util.h Tracked-On: #3601 Signed-off-by: dongshen <dongsheng.x.zhang@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-29 10:12:25 +08:00
Binbin Wu	4a71a16a13	hv: vtd: remove global cache invalidation per vm Cacheline is flushed on EPT entry change, no need to invalidate cache globally when VM created per VM. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Binbin Wu	5c81659713	hv: ept: flush cache for modified ept entries EPT tables are shared by MMU and IOMMU. Some IOMMUs don't support page-walk coherency, the cpu cache of EPT entries should be flushed to memory after modifications, so that the modifications are visible to the IOMMUs. This patch adds a new interface to flush the cache of modified EPT entries. There are different implementations for EPT/PPT entries: - For PPT, there is no need to flush the cpu cache after update. - For EPT, need to call iommu_flush_cache to make the modifications visible to IOMMUs. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Binbin Wu	2abd8b34ef	hv: vtd: export iommu_flush_cache VT-d shares the EPT tables as the second level translation tables. For the IOMMUs that don't support page-walk coherency, cpu cache should be flushed for the IOMMU EPT entries that are modified. For the current implementation, EPT tables for translating from GPA to HPA for EPT/IOMMU are not modified after VM is created, so cpu cache invalidation is done once per VM before starting execution of VM. However, this may be changed, runtime EPT modification is possible. If the cpu cache of EPT entries is invalidated on modification, there is no need to invalidate cpu cache globally per VM. This patch exports iommu_flush_cache for EPT entry cache invalidation operations. - IOMMUs share the same copy of EPT table, cpu cache should be flushed if any of the active IOMMUs doesn't support page-walk coherency. - In the context of ACRN, GPA to HPA mapping relationship is not changed after VM created, skip flushing iotlb to avoid potential performance penalty. Tracked-On: #3607 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com>	2019-08-26 10:47:17 +08:00
Mingqiang Chi	2310d99ebf	hv: cleanup vmcs.h -- move 'RFLAGS_AC' to cpu.h -- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST' -- move 'get_vcpu_mode' to vcpu.h -- remove deadcode 'vmx_eoi_exit()' Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-22 14:13:15 +08:00
Mingqiang Chi	bd09f471a6	hv:move some APIs related host reset to pm.c move some data structures and APIs related host reset from vm_reset.c to pm.c, these are not related with guest. Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-22 14:09:18 +08:00
Victor Sun	2736b6c4cd	HV: add vCOM2 setting for hybrid and industry scenario The vCOM2 of each VM is designed for VM communication, one VM could send command or request to another VM through this channel. The feature will be used for system S3/S5 implementation. On Hybrid scenario, vCOM2 of pre-launched VM will connect to vCOM2 of SOS_VM; On Industry scenario, vCOM2 of post-launched RTVM will connect to vCOM2 of SOS_VM. Tracked-On: #3602 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-22 13:12:54 +08:00
Victor Sun	c8cdc7e807	HV: move vCOM setting from Kconfig to board configs The settings of SOS VM COM1 which is used for console is board specific, and this result in SOS VM COM2 which used for VM communication is also board specific, so move the configure method from Kconfig to board configs folder. The MACRO definition will be handled by acrn-config tool in future. Tracked-On: #3602 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-08-22 13:12:54 +08:00
Victor Sun	5a1842afb8	HV: set sos root dev of apl-up2 to mmcblk0p3 Set sos root device of apl-up2 to mmcblk0p3 and let UP2 uefi variant and sbl variant share one config for now. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-08-22 09:10:38 +08:00
Victor Sun	6c99f76404	HV: prepare ve820 for apl up2 We need ve820 table to enable prelaunched VM for apl-up2 board; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-08-22 09:10:38 +08:00
Yin Fengwei	6beb34c3cb	vm_load: update init gdt preparation Now, we use native gdt saved in boot context for guest and assume it could be put to same address of guest. But it may not be true after the pre-launched VM is introduced. The gdt for guest could be overwritten by guest images. This patch makes 32bit protect mode boot not use saved boot context. Instead, we use predefined vcpu_regs value for protect guest to initialize the guest bsp registers and copy pre-defined gdt table to a safe place of guest memory to avoid gdt table overwritten by guest images. Tracked-On: #3532 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-20 09:22:20 +08:00
Yonghua Huang	700a37856f	hv: remove 'flags' field in struct vm_io_range Currently, 'flags' is defined and set but never be used in the flow of handling i/o request after then. Tracked-On: #861 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-19 10:19:54 +08:00
Yonghua Huang	f791574f0e	hv: refine the function pointer type of port I/O request handlers In the definition of port i/o handler, struct acrn_vm * pointer is redundant as input, as context of acrn_vm is already linked in struct acrn_vcpu * by vcpu->vm, 'vm' is not required as input. This patch removes argument 'vm' from 'io_read_fn_t' & 'io_write_fn_t', use 'vcpu' for them instead. Tracked-On: #861 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-16 11:44:27 +08:00
Jie Deng	866935a53f	hv: vcr: check guest cr3 before loading pdptrs Check whether the address area pointed by the guest cr3 is valid or not before loading pdptrs. Inject #GP(0) to guest if there are any invalid cases. Tracked-On: #3572 Signed-off-by: Jie Deng <jie.deng@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-16 11:43:17 +08:00
huihuang.shi	f147c388a5	hv: fix Violations touched ACRN Coding Guidelines fix violations touched below: 1.Cast operation on a constant value 2.signed/unsigned implicit conversion 3.return value unused. V1->V2: 1.bitmap api will return boolean type, not need to check "!= 0", deleted. 2.The behaviors of ~(uint32_t)X and (uint32_t)~X are not defined in ACRN hypervisor Coding Guidelines, removed the change of it. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Reviewed-by: Junjie Mao <junjie.mao@intel.com>	2019-08-15 09:47:11 +08:00
Shiqing Gao	062fe19800	hv: move vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c This patch moves vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c, so that these two functions would become internal functions inside vmsr.c. This approach improves the modularity. v1 -> v2: * remove 'vmx_rdmsr_pat' * rename 'vmx_wrmsr_pat' with 'write_pat_msr' Tracked-On: #1842 Signed-off-by: Shiqing Gao <shiqing.gao@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-08-14 10:51:35 +08:00
Li, Fei1	d82a00a128	hv: vpci: remove pBDF configure for emulated device Since now we use vBDF to search the device for PCI vdev. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-08-12 10:00:44 +08:00
Li, Fei1	4c8e60f1d0	hv: vpci: add each vdev_ops for each emulated PCI device Add a field (vdev_ops) in struct acrn_vm_pci_dev_config to configure a PCI CFG operation for an emulated PCI device. Use pci_pt_dev_ops for PCI_DEV_TYPE_PTDEV by default if there's no such configure. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-08-09 14:19:49 +08:00
Li, Fei1	ff54fa2325	hv: vpci: add emulated PCI device configure for SOS Add emulated PCI device configure for SOS to prepare for add support for customizing special pci operations for each emulated PCI device. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-08-09 14:19:49 +08:00
Li, Fei1	5471473f60	hv: vpci: create iommu domain in vpci_init for all guests Create an iommu domain for all guest in vpci_init no matter if there's a PTDev in it. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>	2019-08-06 11:51:02 +08:00
Li, Fei1	eb21f205e4	hv: vm_config: build pci device configure for SOS Align SOS pci device configure with pre-launched VM and filter pre-launched VM's PCI PT device from SOS pci device configure. Tracked-On: #3475 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-08-06 11:51:02 +08:00
Victor Sun	901a65cb53	HV: inject exception for invalid vmcall For non-trusty hypercalls, HV should inject #GP(0) to vCPU if they are from non-ring0 or inject #UD if they are from ring0 of non-SOS. Also we should not modify RAX of vCPU for these invalid vmcalls. Tracked-On: #3497 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-08-01 16:07:57 +08:00
Conghui Chen	c4f6681045	softirq: disable interrupt when modify timer_list In current code, the timer_list for per cpu can be accessed both in vmexit and softirq handler. There is a case that, the timer_list is being modified in vmexit, but an interrupt occurs, and the timer_list is also modified in softirq handler. So the timer_list may be in an unpredictable state. In some platforms, the hv console may hang as its timer handler is not invoked because of the corruption for timer_list. So, to fix the issue, disable the interrupt before modifying the timer_list. Tracked-On: #3512 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-08-01 15:45:02 +08:00
Victor Sun	363daf6aa2	HV: return extended info in vCPUID leaf 0x40000001 In some case, guest need to get more information under virtual environment, like guest capabilities. Basically this could be done by hypercalls, but hypercalls are designed for trusted VM/SOS VM, We need a mechanism to report these information for normal VMs. In this patch, vCPUID leaf 0x40000001 will be used to satisfy this need that report some extended information for guest by CPUID. Tracked-On: #3498 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Zhao Yakui <yakui.zhao@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-31 14:13:39 +08:00
Kaige Fu	accdadce98	HV: Enable vART support by intercepting TSC_ADJUST MSR The policy of vART is that software in native can run in VM too. And in native side, the relationship between the ART hardware and TSC is: pTSC = (pART * M) / N + pAdjust The vART solution is: - Present the ART capability to guest through CPUID leaf 15H for M/N which are identical to the physical values. - PT devices see the pART (vART = pART). - Guest expect: vTSC = vART * M / N + vAdjust. - VMCS.OFFSET = vTSC - pTSC = vAdjust - pAdjust. So to support vART, we should do the following: 1. if vAdjust and vTSC are changed by guest, we should change VMCS.OFFSET accordingly. 2. Make the assumption that the pAdjust is never touched by ACRN. For #1, commit "a958fea hv: emulate IA32_TSC_ADJUST MSR" has implemented it. And for #2, acrn never touch pAdjust. -- v2 -> v3: - Add comment when handle guest TSC_ADJUST and TSC accessing. - Initialize the VMCS.OFFSET = vAdjust - pAdjust. v1 -> v2 Refine commit message to describe the whole vART solution. Tracked-On: #3501 Signed-off-by: Kaige Fu <kaige.fu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-31 13:29:51 +08:00
Victor Sun	9139f94ec9	HV: correct CONFIG_BOARD string of apl up2 The CONFIG_BOARD value in defconfig should match with Makefile, otherwise the build might be failed in some condition. Tracked-On: #2291 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-07-30 09:50:10 +08:00
Victor Sun	555a03db99	HV: add board specific cpu state table to support Px Cx Currently the Px Cx supported SoCs which listed in cpu_state_tbl.c is limited, and it is not a wise option to build a huge state table data base to support Px/Cx for other SoCs. This patch give a alternative solution that build a board specific cpu state table in board.c which could be auto-generated by offline tool, then the CPU Px/Cx of customer board could be enabled; Hypervisor will search the cpu state table in cpu_state_tbl[] first, if not found then go check board_cpu_state_tbl. If no matched cpu state table is found then Px/Cx will not be supported; Tracked-On: #3477 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-07-29 20:25:16 +08:00
Victor Sun	cd3b8ed7f1	HV: fix MISRA violation of cpu state table Per MISRA C, the dimension of an array must be specified. Tracked-On: #3477 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-07-29 20:25:16 +08:00
Li, Fei1	4a27d08360	hv: schedule: schedule to idle after SOS resume from S3 After "commit `f0e1c5e` init vcpu host stack when reset vcpu", SOS resume from S3 wants to schedule to vcpu_thread not the point where SOS enter S3. So we should schedule to idle first then reschedule to execute vcpu_thread. Tracked-On: #3387 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-29 09:53:18 +08:00
Zhao Yakui	7b22456786	HV: Remove the mixed usage of inline assembly in wait_sync_change When monitor/mwait is not supported, it still uses the inline assembly in wait_sync_change. As it is not allowed based on MISRA-C, the asm wrapper is used for pause scenario in wait_sync_change. Tracked-On: #3442 Suggested-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>	2019-07-26 10:55:58 +08:00
Zhao Yakui	baf7d90fdf	HV: Refine the usage of monitor/mwait to avoid the possible lockup Based on SDM Vol2 the monitor uses the RAX register to setup the address monitored by HW. The mwait uses the rax/rcx as the hints that the process will enter. It is incorrect that the same value is used for monitor/mwait. The ecx in mwait specifies the optional extensions. At the same time it needs to check whether the value of monitored addr is already expected before entering mwait. Otherwise it will have possible lockup. V1->V2: Add the asm wrapper of monitor/mwait to avoid the mixed usage of inline assembly in wait_sync_change v2-v3: Remove the unnecessary line break in asm_monitor/asm_mwait. Follow Fei's comment to remove the mwait ecx hint setting that treats the interrupt as break event. It only needs to check whether the value of psync_change is already expected. Tracked-On: #3442 Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-26 10:55:58 +08:00
Li, Fei1	11cf9a4a8a	hv: mmu: add hpa2hva_early API for early boot When need hpa and hva translation before init_paging, we need hpa2hva_early and hva2hpa_early since init_paging may modify hva2hpa to not be identical mapping. Tracked-On: #2987 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-26 09:10:06 +08:00
Li, Fei1	40475e22b8	hv: debug: use printf to debug on early boot 1) Using printf to warn if platform ram size configuration is wrong. 2) Using printf to warn if the platform is not supported by ACRN hypervisor. Tracked-On: #2987 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-26 09:10:06 +08:00
Li, Fei1	cc47dbe769	hv: uart: enable early boot uart Enable uart as early as possible to make things easier for debugging. After this we could use printf to output information to the uart. As for pr_xxx APIs, they start to work when init_logmsg is called. Tracked-On: #2987 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-26 09:10:06 +08:00
Yonghua Huang	49e60ae151	hv: refine handler to 'rdpmc' vmexit PMC is hidden from guest and hypervisor should inject UD to guest when 'rdpmc' vmexit. Tracked-On: #3453 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-24 15:05:46 +08:00
Victor Sun	a7b6fc74e5	HV: allow write 0 to MSR_IA32_MCG_STATUS Per SDM, writing 0 to MSR_IA32_MCG_STATUS is allowed, HV should not return -EACCES on this case; Tracked-On: #3454 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-23 15:24:50 +08:00
Victor Sun	3cf1daa480	HV: move vbar info to board specific pci_devices.h The vbar info which hard-coded in scenarios/logical_partition/pt_dev.c is board specific actually, so move these information to arch/x86/configs/$(CONFIG_BOARD)/pci_devices.h. Please be aware that the memory range of vBAR should exactly match with the e820 layout of VM. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-07-23 09:12:50 +08:00
Victor Sun	a27ce27a2e	HV: rename nuc7i7bnh to nuc7i7dnb NUC7i7BNH is not a board name but a product name of KBL NUC, and it is outdated to support LOGICAL_PARTITION scenario and HYBRID scenario. NUC7i7DNH is the product name of KBL NUC that ACRN currently supported, but its official board name is NUC7i7DNB, so change the folder name from "nuc7i7bnh" to "nuc7i7dnb" under arch/x86/configs/. Please refer more details on below documentation: Intel® NUC Board/Kit NUC7i7DN Technical Product Specification Tracked-On: #3446 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Xiangyang Wu <xiangyang.wu@linux.intel.com>	2019-07-22 16:21:12 +08:00
Yonghua Huang	dde20bdb03	HV:refine the handler for 'invept' vmexit 'invept' is not expected in guest and hypervisor should inject UD when 'invept' VM exit happens. Tracked-On: #3444 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>	2019-07-22 13:23:47 +08:00
Yin Fengwei	f0e1c5e55f	vcpu: init vcpu host stack when reset vcpu Otherwise, the previous local variables in host stack is not reset. Tracked-On: #3387 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-22 09:55:06 +08:00
Yin, Fengwei	11e67f1c4a	softirq: move softirq from hv_main to interrupt context softirq shouldn't be bound to vcpu thread. One issue for this is shell (based on timer) can't work if we don't start any guest. This change also is trying best to make softirq handler running with irq enabled. Also update the irq disable/enable in vmexit handler to align with the usage in vcpu_thread. Tracked-On: #3387 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-07-22 09:55:06 +08:00
Yan, Like	a4abeaf980	hv: enforce no interrupt to RT VM via vlapic once lapic pt Because we depend on guest OS to switch x2apic mode to enable lapic pass-thru, vlapic is working at the early stage of booting, eg: in virtual boot loader. After lapic pass-thru enabled, no interrupt should be injected via vlapic any more. This commit resets the vlapic to clear the pending status and adds ptapic_ops to enforce that no more interrupt accepted/injected via vlapic. Tracked-On: #3227 Signed-off-by: Yan, Like <like.yan@intel.com>	2019-07-19 16:47:06 +08:00
Yan, Like	97f6097f04	hv: add ops to vlapic structure This commit adds ops to vlapic structure, and add an *ops parameter to vlapic_reset(). At vlapic reset, the ops is set to the global apicv_ops, and may be assigned to other ops later. Tracked-On: #3227 Signed-off-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-19 16:47:06 +08:00
fuzhongl	a90a6a1059	HV: add SDC2 config in hypervisor/arch/x86/Kconfig Per community requirement;up to three post-launched VM might be needed for some automotive SDC system, so add SDC2 scenario to satisfy this requirement. Tracked-On: #3429 Signed-off-by: fuzhongl <fuzhong.liu@intel.com> Reviewed-by: Victor Sun <victor.sun@intel.com>	2019-07-18 15:03:14 +08:00
Victor Sun	600aa8ea5a	HV: change param type of init_pcpu_pre When initialize secondary pcpu, pass INVALID_CPU_ID as param of init_pcpu_pre() looks weird, so change the param type to bool to represent whether the pcpu is a BSP or AP. Tracked-On: #3420 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-17 13:48:00 +08:00
Li, Fei1	b39526f759	hv: schedule: vCPU schedule state setting don't need to be atomic vCPU schedule state change is under schedule lock protection. So there's no need to be atomic. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-17 09:20:54 +08:00
Li, Fei1	8af334cbb2	hv: vcpu: operation in vcpu_create don't need to be atomic For pre-launched VMs and SOS, vCPUs are created on BSP one by one; For post-launched VMs, vCPUs are created under vmm_hypercall_lock protection. So vcpu_create is called sequentially. Operation in vcpu_create don't need to be atomic. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-17 09:20:54 +08:00
Li, Fei1	540841ac5d	hv: vlapic: EOI exit bitmap should set or clear atomically For per-vCPU, EOI exit bitmap is a global parameter which should set or clear atomically since there's no lock to protect this critical variable. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2019-07-17 09:20:54 +08:00
Li, Fei1	e69b3dcf67	hv: schedule: remove runqueue_lock in sched_context Now sched_object and sched_context are protected by scheduler_lock. There's no chance to use runqueue_lock to protect schedule runqueue if we have no plan to support schedule migration. Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>	2019-07-17 09:20:54 +08:00
Li, Fei1	b1dd3e26f5	hv: cpu: pcpu_active_bitmap should be set atomically It's a global parameter and could be set concurrently. So it should be set atomically. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Reviewed-by: Yin Fenwgei <fengwei.yin@intel.com>	2019-07-17 09:20:54 +08:00
Victor Sun	5b1852e482	HV: add kata support on sdc scenario In current design, devicemodel passes VM UUID to create VMs and hypervisor would check the UUID whether it is matched with the one in VM configurations. Kata container would maintain few UUIDs to let ACRN launch the VM, so hypervisor need to add these UUIDs in VM configurations for Kata running. In the hypercall of hcall_get_platform_info(), hypervisor will report the maximum Kata container number it will support. The patch will add a Kconfig to indicate the maximum Kata container number that SOS could support. In current stage, only one Kata container is supported by SOS on SDC scenario so add one UUID for Kata container in SDC VM configuration. If we want to support Kata on other scenarios in the future, we could follow the example of this patch; Tracked-On: #3402 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-12 16:34:31 +08:00
Tianhua Sun	2d4809e3b1	hv: fix some potential array overflow risk 'pcpu_id' should be less than CONFIG_MAX_PCPU_NUM, else 'per_cpu_data' will overflow. This commit fixes this potential overflow issue. Tracked-On: #3397 Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>	2019-07-12 09:41:15 +08:00
Huihuang Shi	304ae38161	HV: fix "use -- or ++ operations" ACRN coding guidelines banned -- or ++ operations. V1->V2: add comments to struct stack_frame Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-07-12 09:26:15 +08:00
Victor Sun	1884bb0551	HV: modify HV RAM and serial config for apl-nuc - To support grub multiboot for nuc6cayh, we should put hv ram start at a suitable address; - Enable HSUART controller at PCI 0:18.0 as HV serial port; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 14:48:26 +08:00
Victor Sun	f18dfcf522	HV: prepare ve820 for apl nuc Add ve820 table for apl nuc board to enable prelaunched VM on it; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 14:48:26 +08:00
Huihuang Shi	79d033027b	HV: fix vmptable "Casting operation to a pointer" ACRN Coding guidelines requires two different types pointer can't convert to each other, except void *. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-07-11 13:57:21 +08:00
Huihuang Shi	9063504bc8	HV: ve820 fix "Casting operation to a pointer" ACRN Coding guidelines requires two different types pointer can't convert to each other, except void *. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-07-11 13:57:21 +08:00
Huihuang Shi	714162fb8b	HV: fix violations touched type conversion ACRN Coding guidelines requires type conversion shall be explicit. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 09:16:09 +08:00
Li, Fei1	5d6c9c33ca	hv: vlapic: clear up where needs atomic operation in vLAPIC In almost case, vLAPIC will only be accessed by the related vCPU. There's no synchronization issue in this case. However, other vCPUs could deliver interrupts to the current vCPU, in this case, the IRR (for APICv base situation) or PIR (for APICv advanced situation) and TMR for both cases could be accessed by more than one vCPUS simultaneously. So operations on IRR or PIR should be atomical and visible to other vCPUs immediately. In another case, vLAPIC could be accessed by another vCPU when create vCPU or reset vCPU which could be supposed to be consequently. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-11 09:15:47 +08:00
Li, Fei1	05a4ee8074	hv: cpu: refine secondary cpu start up 1) add a write memory barrier after setting pcpu_sync to one to let this change visible to AP immediately. 2) there's only BSP will set pcpu_sync, so there's no memory order issue between CPUs. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-11 09:15:47 +08:00
Yonghua Huang	1ea3052f80	HV: check security mitigation support for SSBD Hypervisor exposes mitigation technique for Speculative Store Bypass(SSB) to guests and allows a guest to determine whether to enable SSBD mitigation by providing direct guest access to IA32_SPEC_CTRL. Before that, hypervisor should check the SSB mitigation support on underlying processor, this patch is to add this capability check. Tracked-On: #3385 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-07-10 10:55:34 +08:00
Huihuang Shi	4b6dc0255f	HV: fix vmptable misc violations Fix the violations list below: 1.Function should have one return entry. 2.Do not use -- or ++ operation. 3.For loop should be simple, shall not use comma operations. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-09 10:36:44 +08:00
Mingqiang Chi	e4d1c321ad	hv:fix "no prototype for non-static function" change some APIs to static or include header file Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>	2019-07-09 10:36:03 +08:00
Shuo A Liu	4129b72b2e	hv: remove unnecessary cancel_event_injection related stuff cancel_event_injection is not needed any more if we do 'schedule' prior to acrn_handle_pending_request. Commit "921288a6672: hv: fix interrupt lost when do acrn_handle_pending_request twice" brings 'schedule' forward, so remove cancel_event_injection related stuff. Tracked-On: #3374 Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-07-09 09:23:12 +08:00
Huihuang Shi	9a7043e83f	HV: remove instr_emul.c dead code ACRN Coding guidelines requires no dead code. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-07-09 09:22:53 +08:00
Yonghua Huang	3164f3976a	hv: Mitigation for CPU MDS vulnerabilities. Microarchitectural Data Sampling (MDS) is a hardware vulnerability which allows unprivileged speculative access to data which is available in various CPU internal buffers. 1. Mitigation on ACRN: 1) Microcode update is required. 2) Clear CPU internal buffers (store buffer, load buffer and load port) if current CPU is affected by MDS, when VM entry to avoid any information leakage to guest thru above buffers. 3) Mitigation is not needed if ARCH_CAP_MDS_NO bit (bit5) is set in IA32_ARCH_CAPABILITIES MSR (10AH), in this case, current processor is not affected by MDS vulnerability, in other cases mitigation for MDS is required. 2. Methods to clear CPU buffers (microcode update is required): 1) L1D cache flush 2) VERW instruction Either of above operations will trigger clearing all CPU internal buffers if this CPU is affected by MDS. Above mechanism is enumerated by: CPUID.(EAX=7H, ECX=0):EDX[MD_CLEAR=10]. 3. Mitigation details on ACRN: if (processor is affected by MDS) if (processor is not affected by L1TF OR L1D flush is not launched on VM Entry) execute VERW instruction when VM entry. endif endif 4. Reference: Deep Dive: Intel Analysis of Microarchitectural Data Sampling https://software.intel.com/security-software-guidance/insights/ deep-dive-intel-analysis-microarchitectural-data-sampling Deep Dive: CPUID Enumeration and Architectural MSRs https://software.intel.com/security-software-guidance/insights/ deep-dive-cpuid-enumeration-and-architectural-msrs Tracked-On: #3317 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Anthony Xu <anthony.xu@intel.com> Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>	2019-07-05 15:17:27 +08:00
Yonghua Huang	076a30b555	hv: refine security capability detection function. ACRN hypervisor always print CPU microcode update warning message on KBL NUC platform, even after BIOS was updated to the latest. 'check_cpu_security_cap()' returns false if no ARCH_CAPABILITIES MSR support on current platform, but this MSR may not be available on some platforms. This patch is to remove this pre-condition. Tracked-On: #3317 Signed-off-by: Yonghua Huang <yonghua.huang@intel.com> Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>	2019-07-05 15:17:27 +08:00
Cai Yulong	127c98f5db	hv: vioapic: fix interrupt lost and redundant interrupt 1. reset polarity of ptirq_remapping_info to zero. this helps to set correct initial pin state, and fixes the interrupt lost issue when assigning a ptirq to uos. 2. since vioapic_generate_intr relies on rte, we should build rte before generating an interrupt, this fixes the redundant interrupt. Tracked-On: #3362 Signed-off-by: Cai Yulong <yulongc@hwtc.com.cn>	2019-07-05 10:22:56 +08:00
Binbin Wu	f3ffce4be1	hv: vmexit: ecx should be checked instead of rcx when xsetbv According to SDM, xsetbv writes the contents of registers EDX:EAX into the 64-bit extended control register (XCR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) In current code, RCX is checked; it should ignore the high-order 32 bits. Tracked-On: #3360 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>	2019-07-05 09:48:53 +08:00
Li, Fei1	09a63560f4	hv: vm_manage: minor fix about triple_fault_shutdown_vm The current implement will trigger shutdown vm request on the BSP VCPU on the VM, not the VCPU will trap out because triple fault. However, if the BSP VCPU on the VM is handling another IO emulation, it may overwrite the triple fault IO request on the vhm_request_buffer in function acrn_insert_request. The atomic operation of get_vhm_req_state can't guarantee the vhm_request_buffer will not access by another IO request if it is not running on the corresponding VCPU. So it should trigger triple fault shutdown VM IO request on the VCPU which trap out because of triple fault exception. Besides, rt_vm_pm1a_io_write will do the right thing which we shouldn't do it in triple_fault_shutdown_vm. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-07-03 17:44:45 +08:00
Li, Fei1	ebf5c5eb5d	hv: cpu: remove CPU up count Since there's no one uses it. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-07-03 17:44:45 +08:00
Binbin Wu	4a22801dd1	hv: ept: mask EPT leaf entry bit 52 to bit 63 in gpa2hpa According to SDM, bit N (physical address width) to bit 63 should be masked when calculating the host page frame number. Currently, the hypervisor doesn't set any of these bits, so gpa2hpa works as expected. However, if any of these bits is set, gpa2hpa returns a wrong value. The hypervisor never sets bit N to bit 51 (reserved bits), so for simplicity, just mask bit 52 to bit 63. Tracked-On: #3352 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com>	2019-07-03 09:39:41 +08:00
Huihuang Shi	74b788988d	HV:fix vcpu more than one return entry ACRN coding guideline requires function shall have only one return entry. Fix it. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-28 13:28:26 +08:00
Huihuang Shi	198e01716a	HV:fix vcpu violations vcpu was never scanned because the scan tool would crash. After modularization, vcpu can be scanned by the scan tool. Clean up the violations in vcpu.c. Fix the violations: 1.No brackets to then/else. 2.Function return value not checked. 3.Signed/unsigned conversion without cast. V1->V2: change the type of "vcpu->arch.irq_window_enabled" to bool. V2->V3: add "void *" prefix on the 1st parameter of memset. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-28 13:28:26 +08:00
Conghui Chen	4d88b2bb65	hv: bugfix for sbuf reset The sbuf is allocated for each pcpu by hypercall from SOS. Before launch Guest OS, the script will offline cpus, which will trigger vcpu reset and then reset sbuf pointer. But sbuf only initiate once by SOS, so these cpus for Guest OS has no sbuf to use. Thus, when run 'acrntrace' on SOS, there is no trace data for Guest OS. To fix the issue, only reset the sbuf for SOS. Tracked-On: #3335 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Reviewed-by: Eddie Dong <eddie.dong@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com>	2019-06-27 15:40:19 +08:00
Li, Fei1	e793b5d091	hv: vlapic: remove ISR vector stack The current implementation caches each ISR vector in the ISR vector stack and checks the ISR vector stack when updating PPR. However, there is no need to do this because: 1) We never touch vlapic->isrvec_stk[0] except in vlapic_reset, so we don't need to check vlapic->isrvec_stk[0]. 2) We only deliver higher-priority interrupts from IRR to ISR, so we don't need to check whether the vlapic->isrvec_stk interrupts are always increasing. 3) There are only 15 different interrupt priorities, so more than 15 interrupts can never be delivered to ISR, and we don't need to check whether vlapic->isrvec_stk_top is larger than ISRVEC_STK_SIZE, which is 16. This patch removes the ISR vector stack and uses isrv to cache the vector number for the highest priority bit that is set in the ISR. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-06-27 15:27:37 +08:00
Huihuang Shi	3a61530d4e	HV:fix simple violations Fix the violations that do not touch the logic. 1.Function return value not checked. 2.Logical conjunctions need brackets. 3.No brackets to then/else. 4.Type conversion without cast. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-06-25 20:09:21 +08:00
Sainath Grandhi	0a8bf6cee4	hv: Avoid run-time buffer overflows with IOAPIC data structures Remove couple of run-time ASSERTs in ioapic module by checking for the number of interrupt pins per IO-APICs against the configured MAX_IOAPIC_LINES in the initialization flow. Also remove the need for two MACROs specifying the max. number of interrupt lines per IO-APIC and add a config item MAX_IOAPIC_LINES for the same. Tracked-On: #3299 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-24 11:41:10 +08:00
Mingqiang Chi	c1e23f1a4a	hv:Fix MISRA-C violations for static inline MISRA-C requires inline functions should be declared static, these APIs are external interfaces,remove inline Tracked-On: #861 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> modified: arch/x86/guest/vcpu.c	2019-06-24 08:31:32 +08:00
Huihuang Shi	e3ee9cf20e	HV: fix expression is not boolean MISRA-C standard requires the type of result of expression in if/while pattern shall be boolean. Tracked-On: #861 Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>	2019-06-21 09:04:44 +08:00
Kaige Fu	8740232ad6	HV: Allow pause RTVM when its state is VM_CREATED There is a lot of work to do between create_vm (HV marks the VM's state as VM_CREATED at this stage) and vm_run (HV marks the VM's state as VM_STARTED), such as building the mptable/ACPI table and initializing mevent and vdevs. If something goes wrong between create_vm and vm_run, the devicemodel jumps to the deinit process and tries to destroy the VM. For example, if vm_init_vdevs fails, the devicemodel jumps to dev_fail and then destroys the VM. For a normal VM in the above situation, it is fine to destroy the VM, and we can create and start it next time. But for an RTVM, we can't destroy the VM because its state is VM_CREATED, and we can only destroy an RTVM when its state is VM_POWERING_OFF. So the VM stays in the VM_CREATED state and we never have a chance to destroy it. Consequently, we can't create and start the VM next time. This patch fixes it by allowing an RTVM to be paused and then destroyed when its state is VM_CREATED. Tracked-On: #3069 Acked-by: Eddie Dong <eddie.dong@intel.com> Signed-off-by: Kaige Fu <kaige.fu@intel.com>	2019-06-20 22:24:38 +08:00
Yuan Liu	f8934df355	HV: implement wbinvd instruction emulation wbinvd is used to write back all modified cache lines in the processor's internal cache to main memory and invalidates(flushes) the internal caches. Using clflushopt instructions to emulate wbinvd to flush each guest vm memory, if CLFLUSHOPT is not supported, boot will fail. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	ea699af861	HV: Add has_rt_vm API The has_rt_vm walk through all VMs to check RT VM flag and if there is no any RT VM, then return false otherwise return true. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	7018a13cb6	HV: Add ept_flush_leaf_page API The ept_flush_leaf_page API is used to flush address space from a ept page entry, user can use it to match walk_ept_mr to flush VM address space. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	f320130d58	HV: Add walk_ept_table and get_ept_entry APIs The walk_ept_table API is used to walk through the EPT table to get all present pages; the user can get each page entry and its size from the walk_ept_table callback. The get_ept_entry API is used to get the EPT pointer of the VM: if the current context of the VM is secure world, it returns the secure world EPT pointer, otherwise it returns the normal world EPT pointer. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	f81585eb3d	HV: Add flush_address_space API. flush_address_space is used to flush address space by clflushopt instruction. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Yuan Liu	6fd397e82b	HV: Add CLFLUSHOPT instruction. CLFLUSHOPT is used to invalidate from every level of the cache hierarchy in the cache coherence domain the cache line that contains the linear address specified with memory operand. If that cache line contains modified data at any level of the cache hierarchy, that data is written back to memory. If the platform does not support CLFLUSHOPT instruction, boot will fail. Signed-off-by: Jack Ren <jack.ren@intel.com> Signed-off-by: Yuan Liu <yuan1.liu@intel.com> Reviewed-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-20 09:32:55 +08:00
Li, Fei1	0e046c7a0a	hv: vlapic: clear which access type we support for APIC-Access VM Exit The current implement doesn't clear which access type we support for APIC-Access VM Exit: 1) linear access for an instruction fetch -- APIC-access page is mapped as UC which doesn't support fetch 2) linear access (read or write) during event delivery -- Which is not happened in normal case except the guest went wrong, such as, set the IDT table in APIC-access page. In this case, we don't need to support. 3) guest-physical access during event delivery; guest-physical access for an instruction fetch or during instruction execution -- Do we plan to support enable APIC in real mode ? I don't think so. Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-06-20 08:53:25 +08:00
Li, Fei1	9960ff98c5	hv: ept: unify EPT API name to verb-object style Rename ept_mr_add to ept_add_mr Rename ept_mr_modify to ept_modify_mr Rename ept_mr_del to ept_del_mr Tracked-On: #1842 Signed-off-by: Li, Fei1 <fei1.li@intel.com>	2019-06-14 14:40:25 +08:00
Mingqiang Chi	8338cd463b	hv: move 3 files to lib & arch folder move stack_protector.c/retpoline-thunk.S into lib folder move vmptable.c into arch/x86/config Tracked-On: #1842 Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com> modified: Makefile renamed: arch/x86/retpoline-thunk.S -> arch/x86/lib/retpoline-thunk.S renamed: common/stack_protector.c -> lib/stack_protector.c renamed: dm/vmptable.c -> arch/x86/configs/vmptable.c	2019-06-14 14:22:51 +08:00
Sainath Grandhi	7d44cd5c28	hv: Introduce check_vm_vlapic_state API This patch introduces check_vm_vlapic_state API instead of is_lapic_pt_enabled to check if all the vCPUs of a VM are using x2APIC mode and LAPIC pass-through is enabled on all of them. When the VM is in VM_VLAPIC_TRANSITION or VM_VLAPIC_DISABLED state, following conditions apply. 1) For pass-thru MSI interrupts, interrupt source is not programmed. 2) For DM emulated device MSI interrupts, interrupt is not delivered. 3) For IPIs, it will work only if the sender and destination are both in x2APIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	f3627d4839	hv: Add update_vm_vlapic_state API to sync the VM vLAPIC state This patch introduces vLAPIC state for a VM. The VM vLAPIC state can be one of the following * VM_VLAPIC_X2APIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this VM use x2APIC mode * VM_VLAPIC_XAPIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this VM use xAPIC mode * VM_VLAPIC_DISABLED - All the vCPUs/vLAPICs of this VM are in Disabled mode * VM_VLAPIC_TRANSITION - Some of the vCPUs/vLAPICs of this VM (Except for those in Disabled mode) are in xAPIC and the others in x2APIC Upon a vCPU updating the IA32_APIC_BASE MSR to switch LAPIC mode, this API is called to sync the vLAPIC state of the VM. Upon VM creation and reset, vLAPIC state is set to VM_VLAPIC_XAPIC, as ACRN starts the vCPUs vLAPIC in XAPIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	a3fdc7a496	hv: Add is_xapic_enabled API to check vLAPIC mode is_xapic_enabled API returns true if vLAPIC is in xAPIC mode. In all other cases, it returns false. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	7cb71a317e	hv: Make is_x2apic_enabled API visible across source code Remove static and inline attributes to the API is_x2apic_enabled and declare a prototype in vlapic.h. Also fix the check performed on guest APICBASE_MSR value to query vLAPIC mode. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Sainath Grandhi	1026f1754c	hv: Shuffle logic in vlapic_set_apicbase API implementation This patch changes the code in vlapic_set_apicbase for the following reasons 1) Better readability as it first checks if the new value programmed into MSR is any different from the existing value cached in guest structures 2) Check if both bits 11:10 are set before enabling x2APIC mode for guest. Current code does not check if Bit 11 is set. 3) Add TODO in the comments, to detail about the current gaps in IA32_APIC_BASE MSR emulation. Tracked-On: #3253 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-14 13:55:26 +08:00
Arindam Roy	2321fcdf78	HV:Modularize vpic code to remove usage of acrn_vm V1:Initial Patch Modularize vpic. The current patch reduces the usage of acrn_vm inside the vpic.c file. Due to the global nature of register_pio_handler, where acrn_vm is being passed, some usage remains. These need to be a separate "interface" file. That will come in a smaller, newer patch provided this patch is accepted. V2: Incorporated comments from Jason. V3: Fixed some MISRA-C Violations. Tracked-On: #1842 Signed-off-by: Arindam Roy <arindam.roy@intel.com> Reviewed-by: Xu, Anthony <anthony.xu@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-13 09:54:52 +08:00
Yan, Like	32239cf55f	hv: reduce cyclomatic complexity of create_vm() This commit extract pm io handler registration code to register_pm_io_handler() to reduce the cyclomatic complexity of create_vm() in order to be complied with MISRA-C rules. Tracked-On: #3227 Signed-off-by: Yan, Like <like.yan@intel.com>	2019-06-12 14:29:50 +08:00
Conghui Chen	ac6c5dce81	HV: Clean vpic and vioapic logic when lapic is pt When the lapic is passthru, vpic and vioapic cannot be used anymore. In the current code, the user can still inject a vpic interrupt into the Guest OS, which is not allowed. This patch removes the vpic and vioapic init functions when creating a VM with lapic passthru. But the APIs in vpic and vioapic are called in many places; for these APIs, follow the principles below: 1. For the APIs which access uninitialized variables and may cause a hypervisor hang, add @pre to make sure the user calls them after vpic or vioapic is initialized. 2. For the APIs which only return some static value, do nothing with them. 3. For the APIs which the user calls to inject an interrupt, such as vioapic_set_irqline_lock or vpic_set_irqline, add a condition in these APIs to make sure they only inject an interrupt when vpic or vioapic is initialized. This change makes sure that vuart or hypercall code need not care whether lapic is passthru or whether vpic and vioapic are initialized. Tracked-On: #3227 Signed-off-by: Conghui Chen <conghui.chen@intel.com>	2019-06-12 14:29:50 +08:00
Victor Sun	f83ddd393f	HV: introduce relative vm id for hcall api On SDC scenario, SOS VM id is fixed to 0 so some hypercalls from guest are using hardcoded "0" to represent SOS VM, this would bring issues for HYBRID scenario which SOS VM id is non-zero. Now introducing a new VM id concept for DM/VHM hypercall APIs, that return a relative VM id which is from SOS view when create VM for post- launched VMs. DM/VHM could always treat their own vm id is "0". When they make hypercalls, hypervisor will convert the VM id to the absolute id when dispatch the hypercalls. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2019-06-12 11:00:40 +08:00
Victor Sun	3d3de6bd38	HV: specify dispatch hypercall for sos or trusty Changes: - In current design, the hypercall is only allowed calling from SOS or trusty VM, so separate the trusty hypercalls from dispatch_hypercall(). The vm parameter which referenced by hcall_xxx() should be SOS VM; - do not inject #UD for hypercalls from non-SOS, just return -ENODEV; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@Intel.com>	2019-06-12 11:00:40 +08:00
Yin Fengwei	6b7233446f	xsave: inject GP when guest tries to write 1 to XCR0 reserved bit According to SDM vol1 13.3: Write 1 to reserved bit of XCR0 will trigger GP. This patch make ACRN behavior align with SDM definition. Tracked-On: #3239 Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-12 08:28:53 +08:00
Tianhua Sun	8dd471b37d	hv: fix possible null pointer dereference This patch fix potential null pointer dereference 1, will access null pointer if 'context' is null. 2, if entry already been added to the VM when add intx entry for this vm, but parameter virt_pin is not equal to entry->virt_sid.intx_id.pin. So will saves this entry address to vpin_to_pt_entry[entry->virt_sid.intx_id.pin] and vpin_to_pt_entry[virt_pin]. In this case, this entry will be freed twice. Tracked-On: #3217 Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-11 16:03:04 +08:00
Zide Chen	e63d32ac02	hv: delay enabling SMEP/SMAP until the end of PCPU initialization Host ACPI parsing is needed during initialization only, not in run time. Hence we don't need to clear U flag for memory in reserved or ACPI type E820 entries. - move enable_smep() and enable_smap() to the end of init_pcpu_post(), so stac()/clac() can be removed from any init code before this point. - call init_seed() before init_pcpu_post(), and remove stac()/clac() from init_seed(). Tracked-On: #3194 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-06-10 11:35:15 +08:00
Zide Chen	9e91f14bec	hv: correctly grant DRHD register access rights to hypervisor Need to call hv_access_memory_region_update() explicitly for DRHD registers to correctly grant access rights for hypervisor. Currently, other hv_access_memory_region_update() calls happen to cover the DRHD addresses for currently supported platforms. Tracked-On: #3194 Signed-off-by: Zide Chen <zide.chen@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-06-10 11:35:15 +08:00
Zhao Yakui	c71cf753eb	ACRN/HV: Add one new board configuration for ACRN-hypervisor The memory size and IOMMU number are refined to meet with ICL board requirement. Otherwise the ACRN hypervisor can't be booted on the new ICL board. ICL(the abbreviation of Ice Lake) is the next generation platform based on 10nm. CPU is based on Sunny Cove microarchitecture and GPU is based on gen11. The new board is named as icl-rvp. Tracked-On: #3216 Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Binbin Wu <binbin.wu@intel.com>	2019-06-10 11:23:18 +08:00
Victor Sun	04d82e5c0f	HV: return virtual lapic id in vcpuid 0b leaf Currently the vlapic id of the SOS VM is virtualized; it is indexed by vcpuid in physical APIC id sequence, but the CPUID 0BH leaf still reports the physical APIC ID. In the SDC/INDUSTRY scenarios they are identically mapped, so no issue occurred. In hybrid mode this would be a problem because the vAPIC ID might differ from the pAPIC ID. We need to make the APIC ID returned from CPUID consistent with the one returned from the LAPIC register. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	0a748fedac	HV: add hybrid scenario Hybrid scenario will run 3 VMs: one pre-launched VM, one pre-launched SOS VM and one post-launched Standard VM. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Jason Chen CJ	a2c6b11614	HV: change nuc7i7bnh ram start to 0x60000000 to support grub multiboot for nuc7i7bnh, we should put hv ram start at a suitable address as SOS bzImage may need use 0x1000000 Tracked-On: #3214 Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com> Reviewed-by: Victor Sun <victor.sun@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	50e09c41b4	HV: remove cpu_num from vm configurations The vcpu num could be calculated based on pcpu_bitmap when prepare_vcpu() is done, so remove this redundant configuration item; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	f4e976ab38	HV: return -1 with invalid vcpuid in pt icr access vm_apicid2vcpu_id() might return invalid vcpu id, when this happens we should return -1 in vlapic_x2apic_pt_icr_access(); Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	ae7dcf443d	HV: fix wrong log when vlapic process init sipi The print message of source and target vcpu id is incorrect, fix it. Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 15:22:10 +08:00
Victor Sun	6940cabd22	HV: modify ve820 to enable low mem at 0x100000 Some OS like Zephyr need to run at 0x100000, so modify the ve820 table accordingly; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	ea7ca8595c	HV: use tag to specify multiboot module Previously multiboot mods[0] is designed for kernel module for all pre-launched VMs including SOS VM, and mods[0].mm_string is used to store kernel cmdline. This design could not satisfy the requirement of hybrid mode scenarios that each VM might use their own kernel image also ramdisk image. To resolve this problem, we will use a tag in mods mm_string field to specify the module type. If the tag could be matched with os_config of VM configurations, the corresponding module would be loaded; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	d0fa83b2cb	HV: move sos bootargs to vm configurations Previously the bootargs of SOS_VM is stored in a text file and stitched into multiboot mods[0].string whereas the bootargs of PRE_LAUNCHED_VM is stored in vm_configurations.c. Given the mods[].string will be used to store Kernel image signature under hybrid mode, move the bootargs of SOS_VM to vm configurations also to make it consistent with PRE_LAUNCHED_VM; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-06-06 09:40:52 +08:00
Victor Sun	8256ba2015	HV: add board specific config header Use a misc_cfg.h in each board configs folder so that VM configurations could include board specific MACROs; Tracked-On: #3214 Signed-off-by: Victor Sun <victor.sun@intel.com>	2019-06-06 09:40:52 +08:00
Conghui Chen	376fcddff8	HV: vuart: add vuart_deinit during vm shutdown Add vuart_deinit to vm shutdown so that the vuart resource can be reset, and when the Guest VM restart, it could have right state. Tracked-On: #2987 Signed-off-by: Conghui Chen <conghui.chen@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-06-03 09:15:16 +08:00
Zhao Yakui	cee2f8b288	ACRN/HV: Refine the function of init_vboot to initialize the depriv_boot env correctly Currently when get_rsdp is called, the EFI depriv_boot env is not initialized. In such case it will fallback to the legacy mechanism of ACPI table. If the ACPI table based on legacy mechanism is not found, it will fail to get the ACPI table and then the system will hang. On the old platform it still can parse the ACPI table from legacy mechanism. In fact when EFI RSDP exists, the EFI RSDP is preferred instead of legacy ACPI RSDP. In order to avoid multiple calling of depriv_init_boot, the init_boot_operations is renamed and called after X2apic is enabled(early_init_lapic). Tracked-On: #3184 Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Yin Fengwei <fengwei.yin@intel.com> Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>	2019-05-30 14:07:57 +08:00
Li, Fei1	c5d4365770	hv: vmcs: don't trap when setting reserved bit in cr0/cr4 According to Chap 23.8 RESTRICTIONS ON VMX OPERATION, Vol 3, SDM: "Any attempt to set one of these bits to an unsupported value while in VMX operation (including VMX root operation) using any of the CLTS, LMSW, or MOV CR instructions causes a general-protection exception." So we don't need to trap them out then inject the GP in hypervisor. Tracked-On: #2561 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-30 11:33:01 +08:00
Li, Fei1	f2c53a9891	hv: vmcs: trap CR4.SMAP/SMEP/PKE setting FuSa requires setting CR4.SMAP/SMEP/PKE will invalidate the TLB. However, setting CR4.SMAP will invalidate the TLB on native while not in non-root mode. To make sure this, we will trap CR4.SMAP/SMEP/PKE setting to invalidate the TLB in root mode. Tracked-On: #2561 Signed-off-by: Li, Fei1 <fei1.li@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-30 11:33:01 +08:00
Sainath Grandhi	a7389686a7	hv: Precondition checks for vcpu_from_vid for lapic passthrough ICR access Since the vapic_id is from VM, need to check for pre-condition before passing vcpu_id to vcpu_from_vid. This is in the path of LAPIC passthrough ICR access Tracked-On: #3170 Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Acked-by: Anthony Xu <anthony.xu@intel.com>	2019-05-30 10:10:21 +08:00
Binbin Wu	7a915dc397	hv: vmsr: present sgx related msr to guest Present SGX related MSRs to guest if SGX is supported. - MSR_IA32_SGXLEPUBKEYHASH0 ~ MSR_IA32_SGXLEPUBKEYHASH3: SGX Launch Control is not supported, so these MSRs are read only. - MSR_IA32_SGX_SVN_STATUS: read only - MSR_IA32_FEATURE_CONTROL: If SGX is supported in the VM, opt in to SGX in this MSR. - MSR_SGXOWNEREPOCH0 ~ MSR_SGXOWNEREPOCH1: The two MSRs' scope is package level, so the guest is not allowed to change them. Still leave them in unsupported_msrs array. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00
Binbin Wu	1724996bc5	hv: vcpuid: present sgx capabilities to guest If sgx is supported in guest, present SGX capabilities to guest. There will be only one EPC section presented to guest, even if EPC memory for a guest is from multiple physical EPC sections. Tracked-On: #3179 Signed-off-by: Binbin Wu <binbin.wu@intel.com> Reviewed-by: Yan, Like <like.yan@intel.com> Acked-by: Eddie Dong <eddie.dong@intel.com>	2019-05-29 11:24:13 +08:00


1924 Commits