Commit Graph

1722 Commits

Author SHA1 Message Date
Junming Liu
9dd6ef600d Always disable the iommu_snoop when enabling the IOMMU for GPU
The IOMMU for GPU doesn't support the snoop_control.
So the iommu_snoop is disabled.

This can fix the issue of consolefb in SOS.

Signed-off-by: Junming Liu <junming.liu@intel.com>

Tracked-On: #4360
2020-01-23 15:15:38 +08:00
Shuo A Liu
e15bb5f391 hv: Disable HLT and PAUSE-loop exiting emulation in lapic passthrough
In lapic passthrough mode, it should passthrough HLT/PAUSE execution
too. This patch disable their emulation when switch to lapic passthrough mode.

Tracked-On: #4329
Tested-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-01-23 13:37:53 +08:00
Shuo A Liu
655032e020 hv: HLT emulation in hypervisor
HLT emulation is import to CPU resource maximum utilization. vcpu
doing HLT means it is idle and can give up CPU proactively. Thus, we
pause the vcpu thread in HLT emulation and resume it while event happens.

When vcpu enter HLT, its vcpu thread will sleep, but the vcpu state is
still 'Running'.

VM ID    PCPU ID    VCPU ID    VCPU ROLE    VCPU STATE
=====    =======    =======    =========    ==========
  0         0          0       PRIMARY      Running
  0         1          1       SECONDARY    Running

Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-01-23 13:37:53 +08:00
Shuo A Liu
55b148ab01 hv: Add vlapic_has_pending_intr of apicv to check pending interrupts
Sometimes HV wants to know if there are pending interrupts of one vcpu.
Add .has_pending_intr interface in acrn_apicv_ops and return the pending
interrupts status by check IRRs of apicv.

Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-01-23 13:37:53 +08:00
Shuo A Liu
8e74c2cfd5 hv: vcpu: wait and signal vcpu event support
Introduce two kinds of events for each vcpu,
  VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion
  VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events
vcpu can wait for such events, and resume to run when the
event get signalled.

This patch also change IO request waiting/notifying to this way.

Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-01-23 13:37:53 +08:00
Shuo A Liu
5352463ceb hv: PAUSE-loop exiting support in hypervisor
As we enabled cpu sharing, PAUSE-loop exiting can help vcpu
to release its pcpu proactively. It's good for performance.

VMX_PLE_GAP: upper bound on the amount of time between two successive
executions of PAUSE in a loop.
VMX_PLE_WINDOW: upper bound on the amount of time a guest is allowed to
execute in a PAUSE loop

Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-01-23 13:37:53 +08:00
Jian Jun Chen
5bf1f04ad7 hv: instr_emul: add emulation for 0xf6 test instruction
It is found that 0xf6 test instruction is used to access MMIO in
Windows. This patch added emulation for 0xf6 test instruction.

Tracked-On: #4310
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
2019-12-30 09:26:05 +08:00
Li Fei1
bb06f6f9bb hv: vpci: restore PCI BARs when doing PCIe FLR
ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some states
before doing the FLR and restore them later, only BARs values for now.
This patch will trap guest Device Capabilities Register write operation if the device
supports PCI Express Capability and check whether it wants to do device FLR. If it does,
call pdev_do_flr to do the job.

Tracked-On: #3465
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-20 13:09:42 +08:00
Conghui Chen
92ed860187 hv: hotfix for xsave
In current code, XCR0 and XSS are not in default value during vcpu
launch, it will result in a warning in Linux:

     WARNING: CPU: 0 PID: 0 at arch/x86/kernel/fpu/xstate.c:614
     fpu__init_system_xstate+0x43a/0x878

 For security reason, we set XCR0 and XSS with feature bitmap get from
 CPUID, and run XRSTORS in context switch in. This make sure the XSAVE
 area to be fully in initiate state.
 But, before enter guest for the first time, XCR0 and XSS should be set to
 default value, as the guest kernel assume it.

Tracked-On: #4278
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
2019-12-19 15:25:12 +08:00
Kaige Fu
686d776313 HV: Remove INIT signal notification related code
We don't use INIT signal notification method now. This patch
removes them.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
d7eb14c511 HV: Use NMI to replace INIT signal for lapic-pt VMs S5
We have implemented a new notification method using NMI.
So replace the INIT notification method with the NMI one.
Then we can remove INIT notification related code later.

Tracked-On: #3886
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
29b7aff59f HV: Use NMI-window exiting to address req missing issue
There is a window where we may miss the current request in the
notification period when the work flow is as the following:

      CPUx +                   + CPUr
           |                   |
           |                   +--+
           |                   |  | Handle pending req
           |                   <--+
           +--+                |
           |  | Set req flag   |
           <--+                |
           +------------------>---+
           |     Send NMI      |  | Handle NMI
           |                   <--+
           |                   |
           |                   |
           |                   +--> vCPU enter
           |                   |
           +                   +

So, this patch enables the NMI-window exiting to trigger the next vmexit
once there is no "virtual-NMI blocking" after vCPU enter into VMX non-root
mode. Then we can process the pending request on time.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
d26d8bec01 HV: Don't make NMI injection req when notifying vCPU
The NMI for notification should not be inject to guest. So,
this patch drops NMI injection request when we use NMI
to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well
and there is no well-designed way to check if the NMI is
for notification or for guest now. So, we take all the NMIs as
notificaton NMI for hard rtvm temporarily. It means that the
hard rtvm will never receive NMI with this patch applied.

TODO: vNMI support is not ready yet. we will add it later.

Tracked-On: #3886
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Kaige Fu
24c2c0ecc0 HV: Use NMI to kick lapic-pt vCPU's thread
ACRN hypervisor needs to kick vCPU off VMX non-root mode to do some
operations in hypervisor, such as interrupt/exception injection, EPT
flush etc. For non lapic-pt vCPUs, we can use IPI to do so. But, it
doesn't work for lapic-pt vCPUs as the IPI will be injected to VMs
directly without vmexit.

Without the way to kick the vCPU off VMX non-root mode to handle pending
request on time, there may be fatal errors triggered.
1). Certain operation may not be carried out on time which may further
    lead to fatal errors. Taking the EPT flush request as an example, once we
    don't flush the EPT on time and the guest access the out-of-date EPT,
    fatal error happens.
2). ACRN now will send an IPI with vector 0xF0 to target vCPU to kick the vCPU
    off VMX non-root mode if it wants to do some operations on target vCPU.
    However, this way doesn't work for lapic-pt vCPUs. The IPI will be delivered
    to the guest directly without vmexit and the guest will receive a unexpected
    interrupt. Consequently, if the guest can't handle this interrupt properly,
    fatal error may happen.

The NMI can be used as the notification signal to kick the vCPU off VMX
non-root mode for lapic-pt vCPUs. So, this patch uses NMI as notification signal
to address the above issues for lapic-pt vCPUs.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-16 16:54:30 +08:00
Victor Sun
78139b955c HV: kconfig: add range check for memory setting
When user use make menuconfig to configure memory related kconfig items,
we need add range check to avoid compile error or other potential issues:

CONFIG_LOW_RAM_SIZE:(0 ~ 0x10000)
		the value should be less than 64KB;

CONFIG_HV_RAM_SIZE: (0x1000000 ~ 0x10000000)
		the hypervisor RAM size should be supposed between
		16MB to 256MB;

CONFIG_PLATFORM_RAM_SIZE: (0x100000000 ~ 0x4000000000)
		the platform RAM size should be larger than 4GB
		and less than 256GB;

CONFIG_SOS_RAM_SIZE: (0x100000000 ~ 0x4000000000)
		the SOS RAM size should be larger than 4GB
		and less than 256GB;

CONFIG_UOS_RAM_SIZE: (0 ~ 0x2000000000)
		the UOS RAM size should be less than 128GB;

Tracked-On: #4229

Signed-off-by: Victor Sun <victor.sun@intel.com>
2019-12-16 15:14:55 +08:00
Victor Sun
249947030b HV: Kconfig: set default Kata num to 1 in SDC
Set default CONFIG_KATA_VM_NUM to 1 in SDC scenario so that user could
have a try on Kata container without rebuilding hypervisor.

Please be aware that vcpu affinity of VM1 in CPU partition mode
would be impacted by this patch.

Tracked-On: #4232

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-16 15:14:42 +08:00
Jian Jun Chen
9d5e72e9c9 hv: add lock for ept add/modify/del
EPT table can be changed concurrently by more than one vcpus.
This patch add a lock to protect the add/modify/delete operations
from different vcpus concurrently.

Tracked-On: #4253
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
2019-12-16 14:41:21 +08:00
Kaige Fu
2777f23075 HV: Add helper function send_single_nmi
This patch adds a helper function send_single_nmi. The fisrt caller
will soon come with the following patch.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Kaige Fu
525d4d3cd0 HV: Install a NMI handler in acrn IDT
This patch installs a NMI handler in acrn IDT to handle
NMIs out of dispatch_exception.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Kaige Fu
fb346a6c11 HV: refine excp/external_interrupt_save_frame and excp_rsvd
There are lines of repeated codes in excp/external_interrupt_save_frame
and excp_rsvd. So, this patch defines two .macro, save_frame and restore_frame,
to reduce the repeated codes.

No functional change.

Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-13 10:13:09 +08:00
Mingqiang Chi
7f96465407 hv:remove need_cleanup flag in create_vm
remove this redundancy flag.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 16:34:13 +08:00
Victor Sun
67ec1b7708 HV: expose port 0x64 read for SOS VM
The port 0x64 is the status register of i8042 keyboard controller. When
i8042 is defined as ACPI PnP device in BIOS, enforce returning 0xff in
read handler would cause infinite loop when booting SOS VM, so expose
the physical port read in this case;

Tracked-On: #4228

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:51:24 +08:00
Victor Sun
a44c1c900c HV: Kconfig: remove MAX_VCPUS_PER_VM in Kconfig
In current architecutre, the maximum vCPUs number per VM could not
exceed the pCPUs number. Given the MAX_PCPU_NUM macro is provided
in board configurations, so remove the MAX_VCPUS_PER_VM from Kconfig
and add a macro of MAX_VCPUS_PER_VM to reference MAX_PCPU_NUM directly.

Tracked-On: #4230

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:49:28 +08:00
Victor Sun
ea3476d22d HV: rename CONFIG_MAX_PCPU_NUM to MAX_PCPU_NUM
rename the macro since MAX_PCPU_NUM could be parsed from board file and
it is not a configurable item anymore.

Tracked-On: #4230

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-12 13:49:28 +08:00
Shiqing Gao
e95b316dd0 hv: vtd: fix improper use of DMAR_GCMD_REG
The initialization of "dmar_unit->gcmd" shall be done via reading from
Global Status Register rather than Global Command Register.

Rationale:
According to Chapter 10.4.4 Global Command Register in VT-d spec, Global Command
Register is a write-only register to control remapping hardware.
Global Status Register is the corresponding read-only register to report remapping
hardware status.

Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
2019-12-12 09:11:04 +08:00
Vijay Dhanraj
c8a4ca6c78 HV: Extend non-contiguous HPA for hybrid scenario
This patch extends non-contiguous HPA allocations for
pre-launched VMs in hybrid scenario.

Tracked-On: #4217
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 10:12:46 +08:00
Shuo A Liu
b32ae229fb hv: sched: use hypervisor configuration to choose scheduler
For now, we set NOOP scheduler as default. User can choose IORR scheduler as needed.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-11 09:31:39 +08:00
Li Fei1
c2c05a29da hv: vlapic: kick targeted vCPU off if interrupt trigger mode has changed
In APICv advanced mode, an targeted vCPU, running in non-root mode, may get outdated
TMR and EOI exit bitmap if another vCPU sends an interrupt to it if the trigger mode
of this interrupt has changed.
This patch try to kick vCPU off to let it get the latest TMR and EOI exit bitmap when
it enters non-root mode again if new coming interrupt trigger mode has changed. Then
fill the interrupt to PIR.

Tracked-On: #4200
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-10 09:07:54 +08:00
Vijay Dhanraj
ed65ae61c6 HV: Kconfig changes to support server platform.
This patch updates kconfig to support server platforms
for increased number of VCPUs per VM and PT IRQ number.

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Tracked-On: #4196
2019-12-09 11:29:34 +08:00
Vijay Dhanraj
6e8b413689 HV: Add support to assign non-contiguous HPA regions for pre-launched VM
On some platforms, HPA regions for Virtual Machine can not be
contiguous because of E820 reserved type or PCI hole. In such
cases, pre-launched VMs need to be assigned non-contiguous memory
regions and this patch addresses it.

To keep things simple, current design has the following assumptions,
	1. HPA2 always will be placed after HPA1
	2. HPA1 and HPA2 don’t share a single ve820 entry.
	(Create multiple entries if needed but not shared)
	3. Only support 2 non-contiguous HPA regions (can extend
	at a later point for multiple non-contiguous HPA)

Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Tracked-On: #4195
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-12-09 11:28:38 +08:00
Zide Chen
03a1b2a717 hypervisor: handle reboot from non-privileged pre-launched guests
To handle reboot requests from pre-launched VMs that don't have
GUEST_FLAG_HIGHEST_SEVERITY, we shutdown the target VM explicitly
other than ignoring them.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-12-09 11:27:32 +08:00
Li Fei1
da3ba68cb6 hv: remove corner case in ptirq_prepare_msix_remap
ptirq_prepare_msix_remap was called no matter whether MSI/MSI-X was enabled or not
and it passed zero to input parameter virtual MSI/MSI-X data field to indicate
MSI/MSI-X was disabled. However, it barely did nothing on this case.
Now ptirq_prepare_msix_remap is called only when  MSI/MSI-X is enabled. It doesn't
need to check whether MSI/MSI-X is enabled or not by checking virtual MSI/MSI-X
data field.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-12-05 16:43:22 +08:00
Shuo A Liu
72644ac2b2 hv: do not sleep a non-RUNNING vcpu
It's meaningless to sleep a non-running vcpu. Add a state check before
sleep the thread object of the vcpu.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Conghui Chen
d48da2af3a hv: bugfix for debug commands with smp_call
With cpu-sharing enabled, there are more than 1 vcpu on 1 pcpu, so the
smp_call handler should switch the vmcs to the target vcpu's vmcs. Then
get the info.

dump_vcpu_reg and dump_guest_mem should run on certain vmcs, otherwise,
there will be #GP error.

Renaming:
vcpu_dumpreg -> dump_vcpu_reg
switch_vmcs -> load_vmcs

Tracked-On: #4178
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-05 11:19:35 +08:00
Kaige Fu
aae974b473 HV: trace leaf and subleaf of cpuid
We care more about leaf and subleaf of cpuid than vcpu_id.
So, this patch changes the cpuid trace-entry to trace the leaf
and subleaf of this cpuid vmexit.

Tracked-On: #4175
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-12-03 16:34:14 +08:00
Yonghua Huang
450d2cf2e9 hv: trap RDPMC instruction execution from any guest
PMU is hidden from any guest, UD is expected when guest
try to execute 'rdpmc' instruction.

this patch sets 'RDPMC exiting' in Processorbased
VM-execution control.

Tracked-On: #3453
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 14:14:27 +08:00
Binbin Wu
3d412266bc hv: ept: build 4KB page mapping in EPT for RTVM for MCE on PSC
Deterministic is important for RTVM. The mitigation for MCE on
Page Size Change converts a large page to 4KB pages runtimely during
the vmexit triggered by the instruction fetch in the large page.
These vmexits increase nondeterminacy, which should be avoided for RTVM.
This patch builds 4KB page mapping in EPT for RTVM to avoid these vmexits.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Binbin Wu
0570993b40 hv: config: add an option to disable mce on psc workaround
Add a option MCE_ON_PSC_WORKAROUND_DISABLED to disable the software
workaround for the issue Machine Check Error on Page Size Change.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Binbin Wu
192859ee02 hv: ept: apply MCE on page size change mitigation conditionally
Only apply the software workaround on the models that might be
affected by MCE on page size change. For these models that are
known immune to the issue, the mitigation is turned off.

Atom processors are not afftected by the issue.
Also check the CPUID & MSR to check whether the model is immune to the issue:
CPU is not vulnerable when both CPUID.(EAX=07H,ECX=0H).EDX[29] and
IA32_ARCH_CAPABILITIES[IF_PSCHANGE_MC_NO] are 1.

Other cases not listed above, CPU may be vulnerable.

This patch also changes MACROs for MSR IA32_ARCH_CAPABILITIES bits to UL instead of U
since the MSR is 64bit.

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-03 09:17:04 +08:00
Shuo A Liu
3cb32bb6e3 hv: make init_vmcs as a event of VCPU
After changing init_vmcs to smp call approach and do it before
launch_vcpu, it could work with noop scheduler. On real sharing
scheudler, it has problem.

   pcpu0                  pcpu1            pcpu1
 vmBvcpu0                vmAvcpu1         vmBvcpu1
                         vmentry
init_vmcs(vmBvcpu1) vmexit->do_init_vmcs
                    corrupt current vmcs
                        vmentry fail
launch_vcpu(vmBvcpu1)

This patch mark a event flag when request vmcs init for specific vcpu. When
it is running and checking pending events, will do init_vmcs firstly.

Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:20:43 +08:00
Victor Sun
15da33d8af HV: parse default pci mmcfg base
The default PCI mmcfg base is stored in ACPI MCFG table, when
CONFIG_ACPI_PARSE_ENABLED is set, acpi_fixup() function will
parse and fix up the platform mmcfg base in ACRN boot stage;
when it is not set, platform mmcfg base will be initialized to
DEFAULT_PCI_MMCFG_BASE which generated by acrn-config tool;

Please note we will not support platform which has multiple PCI
segment groups.

Tracked-On: #4157

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:20:24 +08:00
Yan, Like
0d998d6ac6 hv: sync physical and virtual TSC_DEADLINE when msr interception enabled/disabled
Starting with TSC_DEADLINE msr interception disabled, the virtual TSC_DEADLINE msr is always 0.
When the interception is enabled, need to sync the physical TSC_DEADLINE value to virtual TSC_DEADLINE.

When the interception is disabled, there are 2 cases:
 - if the timer hasn't expired, sync virtual TSC_DEADLINE to physical TSC_DEADLINE, to make the guest read the same tsc_deadline
   as it writes. This may change when the timer actually trigger.
 - if the timer has expired, write 0 to the virtual TSC_DEADLINE.

Tracked-On: #4162
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:10:50 +08:00
Yan, Like
97916364fc hv: fix virtual TSC_DEADLINE msr read/write issues
When write to virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero:
 - when guest intends to disarm the tsc_deadline timer, should not arm the timer falsely;
 - when guest intends to arm the tsc_deadline timer, should not disarm the timer falsely.

When read from virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero:
 - if physical TSC_DEADLINE is not zero, return the virtual TSC_DEADLINE value;
 - if physical TSC_DEADLINE is zero which means it's not armed (automatically disarmed after
   timer triggered), return 0 and reset the virtual TSC_DEADLINE.

Tracked-On: #4162
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 16:10:50 +08:00
Conghui Chen
e61412981d hv: support xsave in context switch
xsave area:
    legacy region: 512 bytes
    xsave header: 64 bytes
    extended region: < 3k bytes

So, pre-allocate 4k area for xsave. Use certain instruction to save or
restore the area according to hardware xsave feature set.

Tracked-On: #4166
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 09:31:12 +08:00
Conghui Chen
8ba203a165 hv: change xsave init function name
change pcpu_xsave_init to init_pcpu_xsave.

Tracked-On: #4166
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 09:31:12 +08:00
Li Fei1
6ee076f7df hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap
ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry
does. ptirq_msix_remap only did the preparation.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-11-29 08:53:07 +08:00
Geoffroy Van Cutsem
51a43dab79 hv: add Kconfig parameter to define the Service VM EFI bootloader
Add a Kconfig parameter called UEFI_OS_LOADER_NAME to hold the Service VM EFI
bootloader to be run by the ACRN hypervisor. A new string manipulation function
to convert from (char *) to (CHAR16 *) has been added to facilitate the
implementation.

The default value is set to systemd-boot (bootloaderx64.efi)

Tracked-On: #2793
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
2019-11-27 10:38:49 +08:00
Alexander Merritt
94a456ae24 HV: refactor device_to_dmaru
On server platforms, DMAR DRHD device scope entries may contain PCI
bridges.

Bridges in the DRHD device scope indicate this IOMMU translates for all
devices on the hierarchy below that bridge.

ACRN is unaware of bridge types in the device scope, and adds these
directly to its internal representation of a DRHD. When looking up a BDF
within these DRHD entries, device_to_dmaru assumes all entries are
Endpoints, comparing BDF to BDF. Thus device to DMAR unit fails, because
it treats a bridge as an Endpoint type.

This change leverages prior patches by converting a BDF to the
associated device DRHD index, and uses that index to obtain the correct
DRHD state.

Handling a bridge in other ways may require maintaining a bus list for
each, or replacing each bridge in the dev scope with a set of all device
BDFs underneath it. Server platforms can have hundreds of PCI devices,
thus making the device scope artificially large is unwieldy.

Tracked-On: #4134
Signed-off-by: Alexander Merritt <alex.merritt@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-11-27 09:49:32 +08:00
Sainath Grandhi
c5a87d41df HV: Cleanup PCI segment usage from VT-d interfaces
ACRN does not support multiple PCI segments in its current form.
But VT-d module uses segment info in its interfaces and
hardcodes it to 0.
This patch cleans up everything related to segment to avoid
ambiguity.

Tracked-On: #4134
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-11-27 09:49:32 +08:00
Alexander Merritt
810169ad20 HV: initialize IOMMU before PCI device discovery
In later patches we use information from DMAR tables to guide discovery
and initialization of PCI devices.

Tracked-On: #4134
Signed-off-by: Alexander Merritt <alex.merritt@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-11-27 09:49:32 +08:00