Commit Graph

2472 Commits

Author SHA1 Message Date
Jian Jun Chen
5bf1f04ad7 hv: instr_emul: add emulation for 0xf6 test instruction
It is found that 0xf6 test instruction is used to access MMIO in
Windows. This patch added emulation for 0xf6 test instruction.

Tracked-On: #4310
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
2019-12-30 09:26:05 +08:00
Victor Sun
ee74737fd2 HV: search rsdp from e820 acpi reclaim region
Per ACPI 6.2 spec, chapter 5.2.5.2 "Finding the RSDP on UEFI Enabled Systems":

In Unified Extensible Firmware Interface (UEFI) enabled systems, a pointer to
the RSDP structure exists within the EFI System Table. The OS loader is provided
a pointer to the EFI System Table at invocation. The OS loader must retrieve the
pointer to the RSDP structure from the EFI System Table and convey the pointer
to OSPM, using an OS dependent data structure, as part of the hand off of
control from the OS loader to the OS.

So when ACRN boot from direct mode on a UEFI enabled system, hypervisor might
be failed to get rsdp by seaching rsdp in legacy EBDA or 0xe0000~0xfffff region,
but it still have chance to get rsdp by seaching it in e820 ACPI reclaimable
region with some edk2 based BIOS.

The patch will search rsdp from e820 ACPI reclaim region When failed to get
rsdp from legacy region.

Tracked-On: #4301

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-25 15:06:20 +08:00
Li Fei1
7d27c4bcb7 hv: vpci: restore PCI BARs when doing AF FLR
ACRN hypervisor should trap guest doing PCI AF FLR. Besides, it should save some states
before doing the FLR and restore them later, only BARs values for now.
This patch will trap guest Conventional PCI Advanced Features Control Register write
operation if the device supports Conventional PCI Advanced Features Capability and
check whether it wants to do device AF FLR. If it does, call pdev_do_flr to do the job.

Tracked-On: #3465
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-20 13:09:42 +08:00
Li Fei1
bb06f6f9bb hv: vpci: restore PCI BARs when doing PCIe FLR
ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some states
before doing the FLR and restore them later, only BARs values for now.
This patch will trap guest Device Capabilities Register write operation if the device
supports PCI Express Capability and check whether it wants to do device FLR. If it does,
call pdev_do_flr to do the job.

Tracked-On: #3465
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-20 13:09:42 +08:00
Conghui Chen
92ed860187 hv: hotfix for xsave
In current code, XCR0 and XSS are not in default value during vcpu
launch, it will result in a warning in Linux:

     WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:614
     fpu__init_system_xstate+0x43a/0x878

 For security reason, we set XCR0 and XSS with feature bitmap get from
 CPUID, and run XRSTORS in context switch in. This make sure the XSAVE
 area to be fully in initiate state.
 But, before enter guest for the first time, XCR0 and XSS should be set to
 default value, as the guest kernel assume it.

Tracked-On: #4278
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
2019-12-19 15:25:12 +08:00
Kaige Fu
686d776313 HV: Remove INIT signal notification related code
We don't use INIT signal notification method now. This patch
removes them.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
d7eb14c511 HV: Use NMI to replace INIT signal for lapic-pt VMs S5
We have implemented a new notification method using NMI.
So replace the INIT notification method with the NMI one.
Then we can remove INIT notification related code later.

Tracked-On: #3886
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
29b7aff59f HV: Use NMI-window exiting to address req missing issue
There is a window where we may miss the current request in the
notification period when the work flow is as the following:

      CPUx +                   + CPUr
           |                   |
           |                   +--+
           |                   |  | Handle pending req
           |                   <--+
           +--+                |
           |  | Set req flag   |
           <--+                |
           +------------------>---+
           |     Send NMI      |  | Handle NMI
           |                   <--+
           |                   |
           |                   |
           |                   +--> vCPU enter
           |                   |
           +                   +

So, this patch enables the NMI-window exiting to trigger the next vmexit
once there is no "virtual-NMI blocking" after vCPU enter into VMX non-root
mode. Then we can process the pending request on time.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
d26d8bec01 HV: Don't make NMI injection req when notifying vCPU
The NMI for notification should not be inject to guest. So,
this patch drops NMI injection request when we use NMI
to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well
and there is no well-designed way to check if the NMI is
for notification or for guest now. So, we take all the NMIs as
notificaton NMI for hard rtvm temporarily. It means that the
hard rtvm will never receive NMI with this patch applied.

TODO: vNMI support is not ready yet. we will add it later.

Tracked-On: #3886
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
24c2c0ecc0 HV: Use NMI to kick lapic-pt vCPU's thread
ACRN hypervisor needs to kick vCPU off VMX non-root mode to do some
operations in hypervisor, such as interrupt/exception injection, EPT
flush etc. For non lapic-pt vCPUs, we can use IPI to do so. But, it
doesn't work for lapic-pt vCPUs as the IPI will be injected to VMs
directly without vmexit.

Without the way to kick the vCPU off VMX non-root mode to handle pending
request on time, there may be fatal errors triggered.
1). Certain operation may not be carried out on time which may further
    lead to fatal errors. Taking the EPT flush request as an example, once we
    don't flush the EPT on time and the guest access the out-of-date EPT,
    fatal error happens.
2). ACRN now will send an IPI with vector 0xF0 to target vCPU to kick the vCPU
    off VMX non-root mode if it wants to do some operations on target vCPU.
    However, this way doesn't work for lapic-pt vCPUs. The IPI will be delivered
    to the guest directly without vmexit and the guest will receive a unexpected
    interrupt. Consequently, if the guest can't handle this interrupt properly,
    fatal error may happen.

The NMI can be used as the notification signal to kick the vCPU off VMX
non-root mode for lapic-pt vCPUs. So, this patch uses NMI as notification signal
to address the above issues for lapic-pt vCPUs.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Yonghua Huang
d52b45c1f5 hv:fix crash issue when handling HC_NOTIFY_REQUEST_FINISH
Input 'vcpu_id' and the state of target vCPU should be validated
properly:

  - 'vcpu_id' shall be less than 'vm->hw.created_vcpus' instead
     of 'MAX_VCPUS_PER_VM'.
  - The state of target vCPU should be "VCPU_PAUSED", and reject
    all other states.

Tracked-On: #4245
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-16 15:15:18 +08:00
Victor Sun
78139b955c HV: kconfig: add range check for memory setting
When user use make menuconfig to configure memory related kconfig items,
we need add range check to avoid compile error or other potential issues:

CONFIG_LOW_RAM_SIZE:(0 ~ 0x10000)
		the value should be less than 64KB;

CONFIG_HV_RAM_SIZE: (0x1000000 ~ 0x10000000)
		the hypervisor RAM size should be supposed between
		16MB to 256MB;

CONFIG_PLATFORM_RAM_SIZE: (0x100000000 ~ 0x4000000000)
		the platform RAM size should be larger than 4GB
		and less than 256GB;

CONFIG_SOS_RAM_SIZE: (0x100000000 ~ 0x4000000000)
		the SOS RAM size should be larger than 4GB
		and less than 256GB;

CONFIG_UOS_RAM_SIZE: (0 ~ 0x2000000000)
		the UOS RAM size should be less than 128GB;

Tracked-On: #4229

Signed-off-by: Victor Sun <victor.sun@intel.com>
2019-12-16 15:14:55 +08:00
Victor Sun
249947030b HV: Kconfig: set default Kata num to 1 in SDC
Set default CONFIG_KATA_VM_NUM to 1 in SDC scenario so that user could
have a try on Kata container without rebuilding hypervisor.

Please be aware that vcpu affinity of VM1 in CPU partition mode
would be impacted by this patch.

Tracked-On: #4232

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-16 15:14:42 +08:00
Jian Jun Chen
9d5e72e9c9 hv: add lock for ept add/modify/del
EPT table can be changed concurrently by more than one vcpus.
This patch add a lock to protect the add/modify/delete operations
from different vcpus concurrently.

Tracked-On: #4253
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
2019-12-16 14:41:21 +08:00
Yonghua Huang
05682b2bad hv:bugfix in write protect page hypercall
This patch fixes potential hypervisor crash when
 calling hcall_write_protect_page() with a crafted
 GPA in 'struct wp_data' instance, e.g. an invalid
 GPA that is not in the scope of the target VM's
 EPT address space.

 To check the validity for this GPA  before updating
 the 'write protect' page.

Tracked-On: #4240
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2019-12-13 10:42:31 +08:00
Kaige Fu
2777f23075 HV: Add helper function send_single_nmi
This patch adds a helper function send_single_nmi. The fisrt caller
will soon come with the following patch.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Kaige Fu
525d4d3cd0 HV: Install a NMI handler in acrn IDT
This patch installs a NMI handler in acrn IDT to handle
NMIs out of dispatch_exception.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Kaige Fu
fb346a6c11 HV: refine excp/external_interrupt_save_frame and excp_rsvd
There are lines of repeated codes in excp/external_interrupt_save_frame
and excp_rsvd. So, this patch defines two .macro, save_frame and restore_frame,
to reduce the repeated codes.

No functional change.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Mingqiang Chi
7f96465407 hv:remove need_cleanup flag in create_vm
remove this redundancy flag.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 16:34:13 +08:00
Victor Sun
67ec1b7708 HV: expose port 0x64 read for SOS VM
The port 0x64 is the status register of i8042 keyboard controller. When
i8042 is defined as ACPI PnP device in BIOS, enforce returning 0xff in
read handler would cause infinite loop when booting SOS VM, so expose
the physical port read in this case;

Tracked-On: #4228

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:51:24 +08:00
Victor Sun
a44c1c900c HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig
In current architecutre, the maximum vCPUs number per VM could not
exceed the pCPUs number. Given the MAX_PCPU_NUM macro is provided
in board configurations, so remove the MAX_VCPUS_PER_VM from Kconfig
and add a macro of MAX_VCPUS_PER_VM to reference MAX_PCPU_NUM directly.

Tracked-On: #4230

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:49:28 +08:00
Victor Sun
ea3476d22d HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM
rename the macro since MAX_PCPU_NUM could be parsed from board file and
it is not a configurable item anymore.

Tracked-On: #4230

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:49:28 +08:00
Mingqiang Chi
b6bffd01ff hv:remove 2 unused variables in vm_arch structure
remove 'guest_init_pml4' and 'tmp_pg_array' in vm_arch
since they are not used.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-12-12 10:13:11 +08:00
Shiqing Gao
e95b316dd0 hv: vtd: fix improper use of DMAR_GCMD_REG
The initialization of "dmar_unit->gcmd" shall be done via reading from
Global Status Register rather than Global Command Register.

Rationale:
According to Chapter 10.4.4 Global Command Register in VT-d spec, Global Command
Register is a write-only register to control remapping hardware.
Global Status Register is the corresponding read-only register to report remapping
hardware status.

Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
2019-12-12 09:11:04 +08:00
Vijay Dhanraj
c8a4ca6c78 HV: Extend non-contiguous HPA for hybrid scenario
This patch extends non-contiguous HPA allocations for
pre-launched VMs in hybrid scenario.

Tracked-On: #4217
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 10:12:46 +08:00
Shuo A Liu
b32ae229fb hv: sched: use hypervisor configuration to choose scheduler
For now, we set NOOP scheduler as default. User can choose IORR scheduler as needed.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 09:31:39 +08:00
Shuo A Liu
6a144e6e3e hv: sched: add yield support
Add yield support for schedule, which can give up pcpu proactively.

Tracked-On: #4178
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 09:31:39 +08:00
Shuo A Liu
6554437cc0 hv: sched_iorr: add some interfaces implementation of sched_iorr
Implement .sleep/.wake/.pick_next of sched_iorr.
In .pick_next, we count current object's timeslice and pick the next
avaiable one. The policy is
  1) get the first item in runqueue firstly
  2) if object picked has no time_cycles, replenish it pick this one
  3) At least take one idle sched object if we have no runnable object
     after step 1) and 2)
In .wake, we start the tick if we have more than one active
thread_object in runqueue. In .sleep, stop the tick timer if necessary.

Tracked-On: #4178
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-12-11 09:31:39 +08:00
Shuo A Liu
b39630a8e0 hv: sched_iorr: add tick handler and runqueue operations
sched_control is per-pcpu, each sched_control has a tick timer running
periodically. Every period called a tick. In tick handler, we do
  1) compute left timeslice of current thread_object if it's not the idle
  2) make a schedule request if current thread_object run out of timeslice

For runqueue maintaining, we will keep objects which has timeslice in
the front of runqueue and the ones get new replenished in tail.

Tracked-On: #4178
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-12-11 09:31:39 +08:00
Shuo A Liu
f44aa4e4c9 hv: sched_iorr: add init functions of sched_iorr
We set timeslice to 10ms as default, and set tick interval to 1ms.
When init sched_iorr scheduler, we init a periodic timer as the tick and
init the runqueue to maintain objects in the sched_control. Destroy
the timer in deinit.

Tracked-On: #4178
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 09:31:39 +08:00
Shuo A Liu
ed4008630d hv: sched_iorr: Add IO sensitive Round-robin scheduler
IO sensitive Round-robin scheduler aim to schedule threads with
round-robin policy. Meanwhile, we also enhance it with some fairness
configuration, such as thread will be scheduled out without properly
timeslice. IO request on thread will be handled in high priority.

This patch only add a skeleton for the sched_iorr scheduler.

Tracked-On: #4178
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 09:31:39 +08:00
Gary
3c8d465a11 acrnboot: correct the calculation of the end boundry of _DYNAMIC region
The calculation of the end boundry address is corrected by
    adding the size extracted from _DYNAMIC to start address
    in type of uint8_t while improving the code by calulating
    the end boundry address after scanning, also reducing type
    casts accordingly.

Tracked-On: projectacrn#4191
Signed-off-by: Gary <gordon.king@intel.com>
2019-12-11 09:31:24 +08:00
Li Fei1
c2c05a29da hv: vlapic: kick targeted vCPU off if interrupt trigger mode has changed
In APICv advanced mode, an targeted vCPU, running in non-root mode, may get outdated
TMR and EOI exit bitmap if another vCPU sends an interrupt to it if the trigger mode
of this interrupt has changed.
This patch try to kick vCPU off to let it get the latest TMR and EOI exit bitmap when
it enters non-root mode again if new coming interrupt trigger mode has changed. Then
fill the interrupt to PIR.

Tracked-On: #4200
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-10 09:07:54 +08:00
Vijay Dhanraj
ed65ae61c6 HV: Kconfig changes to support server platform.
This patch updates kconfig to support server platforms
for increased number of VCPUs per VM and PT IRQ number.

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Tracked-On: #4196
2019-12-09 11:29:34 +08:00
Vijay Dhanraj
6e8b413689 HV: Add support to assign non-contiguous HPA regions for pre-launched VM
On some platforms, HPA regions for Virtual Machine can not be
contiguous because of E820 reserved type or PCI hole. In such
cases, pre-launched VMs need to be assigned non-contiguous memory
regions and this patch addresses it.

To keep things simple, current design has the following assumptions,
	1. HPA2 always will be placed after HPA1
	2. HPA1 and HPA2 don’t share a single ve820 entry.
	(Create multiple entries if needed but not shared)
	3. Only support 2 non-contiguous HPA regions (can extend
	at a later point for multiple non-contiguous HPA)

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Tracked-On: #4195
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-12-09 11:28:38 +08:00
Zide Chen
03a1b2a717 hypervisor: handle reboot from non-privileged pre-launched guests
To handle reboot requests from pre-launched VMs that don't have
GUEST_FLAG_HIGHEST_SEVERITY, we shutdown the target VM explicitly
other than ignoring them.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-12-09 11:27:32 +08:00
Li Fei1
da3ba68cb6 hv: remove corner case in ptirq_prepare_msix_remap
ptirq_prepare_msix_remap was called no matter whether MSI/MSI-X was enabled or not
and it passed zero to input parameter virtual MSI/MSI-X data field to indicate
MSI/MSI-X was disabled. However, it barely did nothing on this case.
Now ptirq_prepare_msix_remap is called only when  MSI/MSI-X is enabled. It doesn't
need to check whether MSI/MSI-X is enabled or not by checking virtual MSI/MSI-X
data field.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-05 16:43:22 +08:00
Li Fei1
c05d9f8086 hv: vmsix: refine vmsix remap
Do vMSI-X remap only when Mask Bit in Vector Control Register for MSI-X Table Entry
is unmask.
The previous implementation also has two issues:
1. It only check whether Message Control Register for MSI-X has been modified when
guest writes MSI-X CFG space at Message Control Register offset.
2. It doesn't really disable MSI-X when guest wants to disable MSI-X.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-05 16:43:22 +08:00
Li Fei1
5f5ba1d647 hv: vmsi: refine write_vmsi_cfg implementation
1. disable physical MSI before writing the virtual MSI CFG space
2. do the remap_vmsi if the guest wants to enable MSI or update MSI address or data
3. disable INTx and enable MSI after step 2.

The previous Message Control check depends on the guest write MSI Message Control
Register at the offset of Message Control Register. However, the guest could access
this register at the offset of MSI Capability ID register. This patch remove this
constraint. Also, The previous implementation didn't really disable MSI when guest
wanted to disable MSI.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-05 16:43:22 +08:00
Shuo A Liu
72644ac2b2 hv: do not sleep a non-RUNNING vcpu
It's meaningless to sleep a non-running vcpu. Add a state check before
sleep the thread object of the vcpu.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Shuo A Liu
d624eb5e6c hv: io: do schedule in IO completion polling loop
Now, we support schedule inplace. And with cpu sharing, there might be
multi vcpu running on same pcpu. Reschedule request will happen when
switch the running vcpu. If the current vcpu is polling on the IO
completion, it need to be scheduled back to the polling point.

In the polling path, construct a loop for polling, and do schedule in the
loop if needed.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Conghui Chen
d48da2af3a hv: bugfix for debug commands with smp_call
With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the
smp_call handler should switch the vmcs to the target vcpu's vmcs. Then
get the info.

dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise,
there will be #GP error.

Renaming:
vcpu_dumpreg -> dump_vcpu_reg
switch_vmcs -> load_vmcs

Tracked-On: #4178
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Shuo A Liu
47139bd78c hv: print current sched_object in acrn logmsg
Add a header field in acrnlog message to indicate the current
running thread.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Kaige Fu
aae974b473 HV: trace leaf and subleaf of cpuid
We care more about leaf and subleaf of cpuid than vcpu_id.
So, this patch changes the cpuid trace-entry to trace the leaf
and subleaf of this cpuid vmexit.

Tracked-On: #4175
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-03 16:34:14 +08:00
Yonghua Huang
450d2cf2e9 hv: trap RDPMC instruction execution from any guest
PMU is hidden from any guest, UD is expected when guest
try to execute 'rdpmc' instruction.

this patch sets 'RDPMC exiting' in Processorbased
VM-execution control.

Tracked-On: #3453
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 14:14:27 +08:00
Binbin Wu
3d412266bc hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC
Deterministic is important for RTVM. The mitigation for MCE on
Page Size Change converts a large page to 4KB pages runtimely during
the vmexit triggered by the instruction fetch in the large page.
These vmexits increase nondeterminacy, which should be avoided for RTVM.
This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Binbin Wu
0570993b40 hv: config: add an option to disable mce on psc workaround
Add a option MCE_ON_PSC_WORKAROUND_DISABLED to disable the software
workaround for the issue Machine Check Error on Page Size Change.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Binbin Wu
192859ee02 hv: ept: apply MCE on page size change mitigation conditionally
Only apply the software workaround on the models that might be
affected by MCE on page size change. For these models that are
known immune to the issue, the mitigation is turned off.

Atom processors are not afftected by the issue.
Also check the CPUID & MSR to check whether the model is immune to the issue:
CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and
IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1.

Other cases not listed above, CPU may be vulnerable.

This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U
since the MSR is 64bit.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Shuo A Liu
3cb32bb6e3 hv: make init_vmcs as a event of VCPU
After changing init_vmcs to smp call approach and do it before
launch_vcpu, it could work with noop scheduler. On real sharing
scheudler, it has problem.

   pcpu0                  pcpu1            pcpu1
 vmBvcpu0                vmAvcpu1         vmBvcpu1
                         vmentry
init_vmcs(vmBvcpu1) vmexit->do_init_vmcs
                    corrupt current vmcs
                        vmentry fail
launch_vcpu(vmBvcpu1)

This patch mark a event flag when request vmcs init for specific vcpu. When
it is running and checking pending events, will do init_vmcs firstly.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:20:43 +08:00
Victor Sun
15da33d8af HV: parse default pci mmcfg base
The default PCI mmcfg base is stored in ACPI MCFG table, when
CONFIG_ACPI_PARSE_ENABLED is set, acpi_fixup() function will
parse and fix up the platform mmcfg base in ACRN boot stage;
when it is not set, platform mmcfg base will be initialized to
DEFAULT_PCI_MMCFG_BASE which generated by acrn-config tool;

Please note we will not support platform which has multiple PCI
segment groups.

Tracked-On: #4157

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:20:24 +08:00