Commit Graph

2027 Commits

Author SHA1 Message Date
Shuo A Liu
684766f008 hv: Save/restore MSR_IA32_CSTAR during context switch
Both Windows guest and Linux guest use the MSR MSR_IA32_CSTAR, while
Linux uses it rarely. Now vcpu context switch doesn't save/restore it.
Windows detects the change of the MSR and rises a exception.

Do the save/resotre MSR_IA32_CSTAR during context switch.

Tracked-On: #5899
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2021-04-01 16:51:15 +08:00
Shuo A Liu
d9cd7f964b hv: WA: Disable advanced feature of apicv to WA ACRN-6886
Tracked-On: #5866
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2021-03-24 13:32:51 +08:00
Victor Sun
8483c8c8f7 HV: Revert "HV: remove efi support for SOS"
This reverts commit ce9d5e8779.

Before readiness of better solution to fix SOS VM e820 and efi memmap mismatch
issue, revert this patch.

Tracked-On: #5626

Signed-off-by: Victor Sun <victor.sun@intel.com>
2021-03-15 11:14:30 +08:00
Li Fei1
b4a23e6c13 hv: ept: build 4KB page mapping in EPT for code pages of rtvm
RTVM is enforced to use 4KB pages to mitigate CVE-2018-12207 and performance jitter,
which may be introduced by splitting large page into 4KB pages on demand. It works
fine in previous hardware platform where the size of address space for the RTVM is
relatively small. However, this is a problem when the platforms support 64 bits
high MMIO space, which could be super large and therefore consumes large # of
EPT page table pages.

This patch optimize it by using large page for purely data pages, such as MMIO spaces,
even for the RTVM.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Li Fei1
afe6a67a9f hv: ept: only treak execution right for large pages
To mitigate the page size change MCE vulnerability (CVE-2018-12207), ACRN would
clear the execution permission in the EPT paging-structure entries for large pages
and then intercept an EPT execution-permission violation caused by an attempt to
execution an instruction in the guest.

However, the current code would clear the execution permission in the EPT paging-
structure entries for small pages too when we clearing the the execution permission
for large pages. This would trigger extra EPT violation VM exits.

This patch fix this issue.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Li Fei1
8afcd608d8 kv: kconfig: remove some unused ram size kconfig
SOS_RAM_SIZE/UOS_RAM_SIZE Kconfig are only used to calculate how many pages we
should reserve for the VM EPT mapping.

Now we reserve pages for each VM EPT pagetable mapping by the PLATFORM_RAM_SIZE
not the VM RAM SIZE. This could simplify the reserve logic for us: not need to
take care variable corner cases. We could make assume we reserve enough pages
base on the VM could not use the resources beyond the platform hardware resources.

So remove these two unused VM ram size kconfig.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Li Fei1
38be61e374 hv: page: add free_page
Add free_page to free page when unmap pagetable.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Li Fei1
04a104e856 hv: page: use dynamic page allocation for pagetable mapping
For FuSa's case, we remove all dynamic memory allocation use in ACRN HV. Instead,
we use static memory allocation or embedded data structure. For pagetable page,
we prefer to use an index (hva for MMU, gpa for EPT) to get a page from a special
page pool. The special page pool should be big enougn for each possible index.
This is not a big problem when we don't support 64 bits MMIO. Without 64 bits MMIO
support, we could use the index to search addrss not larger than DRAM_SIZE + 4G.

However, if ACRN plan to support 64 bits MMIO in SOS, we could not use the static
memory alocation any more. This is because there's a very huge hole between the
top DRAM address and the bottom 64 bits MMIO address. We could not reserve such
many pages for pagetable mapping as the CPU physical address bits may very large.

This patch will use dynamic page allocation for pagetable mapping. We also need
reserve a big enough page pool at first. For HV MMU, we don't use 4K granularity
page table mapping, we need reserve PML4, PDPT and PD pages according the maximum
physical address space (PPT va and pa are identical mapping); For each VM EPT,
we reserve PML4, PDPT and PD pages according to the maximum physical address space
too, (the EPT address sapce can't beyond the physical address space), and we reserve
PT pages by real use cases of DRAM, low MMIO and high MMIO.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Li Fei1
312702f2ec hv: memory: remove get_sworld_memory_base API
memory_ops structure will be changed to store page table related fields.
However, secure world memory base address is not one of them, it's VM
related. So save sworld_memory_base_hva in vm_arch structure directly.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Tracked-On: #5788
2021-03-11 12:36:17 +08:00
Yonghua Huang
0a7bb7340f hv: disable guest MONITOR-WAIT support when SW SRAM is configured
Per-core software SRAM L2 cache may be flushed by 'mwait'
extension instruction, which guest VM may execute to enter
core deep sleep. Such kind of flushing is not expected when
software SRAM is enabled for RTVM.

Hypervisor disables MONITOR-WAIT support on both hypervisor
and VMs sides to protect above software SRAM from being flushed.

This patch disable ACRN guest MONITOR-WAIT support if software
SRAM is configured.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-10 11:42:02 +08:00
Yonghua Huang
92bbd3ff48 hv: disable host MONITOR-WAIT support when SW SRAM is enabled
Per-core software SRAM L2 cache may be flushed by 'mwait'
extension instruction, which guest VM may execute to enter
core deep sleep. Such kind of flushing is not expected when
software SRAM is enabled for RTVM.

Hypervisor disables MONITOR-WAIT support on both hypervisor
and VMs sides to protect above software SRAM from being flushed.

This patch disable hypervisor(host) MONITOR-WAIT support.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-10 11:42:02 +08:00
Yonghua Huang
fdc2903d5e hv: wrap function to check software SRAM support
Below boolean function are defined in this patch:
 - is_software_sram_enabled() to check if SW SRAM
   feature is enabled or not.
 - set global variable 'is_sw_sram_initialized'
   to file static.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-10 11:42:02 +08:00
Li Fei1
23d8202c77 hv: cpu: remove cache_flush_invalidate_all in cpu_dead
Before the ACRN HV entered the S3, we would call cache_flush_invalidate_all to
flush all the caches into memory and invalidate the caches on each logical cpu
before we halt the cpu.

This was not a problem before we support pSRAM. Once pSRAM binary code has been
executed on the logical cpu, we could not flush the pSRAM cache into memory then.
Otherwise, the pSRAM cache can't been locked.

This patch removes cache_flush_invalidate_all in cpu_dead since we would not
support to put the ACRN HV into S3. Once we want to support put the ACRN HV into
S3, we would try other ways to flush the data caches in this cpu into memory and
valid whether that way is practical or not.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5806
2021-03-03 16:55:46 +08:00
Yonghua Huang
30497bdb85 hv: unmap software region of pre-RTVM from Service VM EPT
Accessing to software SRAM region is not allowed when
 software SRAM is pass-thru to prelaunch RTVM.

 This patch removes software SRAM region from service VM
 EPT if it is enabled for prelaunch RTVM.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-25 13:21:41 +08:00
Victor Sun
ce9d5e8779 HV: remove efi support for SOS
Provide EFI support for SOS could cause weird issues. For example, hypervisor
works based on E820 table whereras it's possible that the memory map from EFI
table is not aligned with E820 table. The SOS kernel kaslr will try to find the
random address for extracted kernel image in EFI table first. So it's possible
that none-RAM in E820 is picked for extracted kernel image. This will make
kernel boot fail.

This patch removes EFI support for SOS by not passing struct boot_efi_info to
SOS kernel zeropage, and reserve a memory to store RSDP table for SOS and pass
the RSDP address to SOS kernel zeropage for SOS to locate ACPI table.

The patch requires SOS kernel version be high than 4.20, otherwise the kernel
might fail to find the RSDP.

Tracked-On: #5626

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-02-25 09:34:29 +08:00
Victor Sun
ae7eb25a7f HV: panic on 0 address when do e820_alloc_memory
Current memory allocation algorithm is to find the available address from
the highest possible address below max_address. If the function returns 0,
means all memory is used up and we have to put the resource at address 0,
this is dangerous for a running hypervisor.

Also returns 0 would make code logic very complicated, since memcpy_s()
doesn't support address 0 copy.

Tracked-On: #5626

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-02-25 09:34:29 +08:00
Victor Sun
ff1f6bc975 HV: refine acpi rsdp initialize interface
In previous code, the rsdp initialization is done in get_rsdp() api implicitly.
The function is called multiple times in following acpi table parsing functions
and the condition (rsdp == NULL) need to be added in each parsing function.
This is not needed since the panic would occur if rsdp is NULL when do acpi
initialization.

Tracked-On: #5626

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-02-25 09:34:29 +08:00
Tao Yuhong
50d8525618 HV: deny HV owned PCI bar access from SOS
This patch denies Service VM the access permission to device resources
owned by hypervisor.
HV may own these devices: (1) debug uart pci device for debug version
(2) type 1 pci device if have pre-launched VMs.
Current implementation exposes the mmio/pio resource of HV owned devices
to SOS, should remove them from SOS.

Tracked-On: #5615
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
2021-02-03 14:01:23 +08:00
Tao Yuhong
6e7ce4a73f HV: deny pre-launched VM ptdev bar access from SOS
This patch denies Service VM the access permission to device
resources owned by pre-launched VMs.
Rationale:
 * Pre-launched VMs in ACRN are independent of service VM,
   and should be immune to attacks from service VM. However,
   current implementation exposes the bar resource of passthru
   devices to service VM for some reason. This makes it possible
   for service VM to crash or attack pre-launched VMs.
 * It is same for hypervisor owned devices.

NOTE:
 * The MMIO spaces pre-allocated to VFs are still presented to
  Service VM. The SR-IOV capable devices assigned to pre-launched
  VMs doesn't have the SR-IOV capability. So the MMIO address spaces
  pre-allocated by BIOS for VFs are not decoded by hardware and
  couldn't be enabled by guest. SOS may live with seeing the address
  space or not. We will revisit later.

Tracked-On: #5615
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 14:01:23 +08:00
Shuo A Liu
d4aaf99d86 hv: keylocker: Support keylocker backup MSRs for Guest VM
The logical processor scoped IWKey can be copied to or from a
platform-scope storage copy called IWKeyBackup. Copying IWKey to
IWKeyBackup is called ‘backing up IWKey’ and copying from IWKeyBackup to
IWKey is called ‘restoring IWKey’.

IWKeyBackup and the path between it and IWKey are protected against
software and simple hardware attacks. This means that IWKeyBackup can be
used to distribute an IWKey within the logical processors in a platform
in a protected manner.

Linux keylocker implementation uses this feature, so they are
introduced by this patch.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
38cd5b481d hv: keylocker: host keylocker iwkey context switch
Different vCPU may have different IWKeys. Hypervisor need do the iwkey
context switch.

This patch introduce a load_iwkey() function to do that. Switches the
host iwkey when the switch_in vCPU satisfies:
  1) keylocker feature enabled
  2) Different from the current loaded one.

Two opportunities to do the load_iwkey():
  1) Guest enables CR4.KL bit.
  2) vCPU thread context switch.

load_iwkey() costs ~600 cycles when do the load IWKey action.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
c11c07e0fe hv: keylocker: Support Key Locker feature for guest VM
KeyLocker is a new security feature available in new Intel CPUs that
protects data-encryption keys for the Advanced Encryption Standard (AES)
algorithm. These keys are more valuable than what they guard. If stolen
once, the key can be repeatedly used even on another system and even
after vulnerability closed.

It also introduces a CPU-internal wrapping key (IWKey), which is a key-
encryption key to wrap AES keys into handles. While the IWKey is
inaccessible to software, randomizing the value during the boot-time
helps its value unpredictable.

Keylocker usage:
 - New “ENCODEKEY” instructions take original key input and returns HANDLE
   crypted by an internal wrap key (IWKey, init by “LOADIWKEY” instruction)
 - Software can then delete the original key from memory
 - Early in boot/software, less likely to have vulnerability that allows
   stealing original key
 - Later encrypt/decrypt can use the HANDLE through new AES KeyLocker
   instructions
 - Note:
      * Software can use original key without knowing it (use HANDLE)
      * HANDLE cannot be used on other systems or after warm/cold reset
      * IWKey cannot be read from CPU after it's loaded (this is the
        nature of this feature) and only 1 copy of IWKey inside CPU.

The virtualization implementation of Key Locker on ACRN is:
 - Each vCPU has a 'struct iwkey' to store its IWKey in struct
   acrn_vcpu_arch.
 - At initilization, every vCPU is created with a random IWKey.
 - Hypervisor traps the execution of LOADIWKEY (by 'LOADIWKEY exiting'
   VM-exectuion control) of vCPU to capture and save the IWKey if guest
   set a new IWKey. Don't support randomization (emulate CPUID to
   disable) of the LOADIWKEY as hypervisor cannot capture and save the
   random IWKey. From keylocker spec:
   "Note that a VMM may wish to enumerate no support for HW random IWKeys
   to the guest (i.e. enumerate CPUID.19H:ECX[1] as 0) as such IWKeys
   cannot be easily context switched. A guest ENCODEKEY will return the
   type of IWKey used (IWKey.KeySource) and thus will notice if a VMM
   virtualized a HW random IWKey with a SW specified IWKey."
 - In context_switch_in() of each vCPU, hypervisor loads that vCPU's
   IWKey into pCPU by LOADIWKEY instruction.
 - There is an assumption that ACRN hypervisor will never use the
   KeyLocker feature itself.

This patch implements the vCPU's IWKey management and the next patch
implements host context save/restore IWKey logic.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
4483e93bd1 hv: keylocker: Enable the tertiary VM-execution controls
In order for a VMM to capture the IWKey values of guests, processors
that support Key Locker also support a new "LOADIWKEY exiting"
VM-execution control in bit 0 of the tertiary processor-based
VM-execution controls.

This patch enables the tertiary VM-execution controls.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
e9247dbca0 hv: keylocker: Simulate CPUID of keylocker caps for guest VM
KeyLocker is a new security feature available in new Intel CPUs that
protects data-encryption keys for the Advanced Encryption Standard (AES)
algorithm.

This patch emulates Keylocker CPUID leaf 19H to support Keylocker
feature for guest VM.

To make the hypervisor being able to manage the IWKey correctly, this
patch doesn't expose hardware random IWKey capability
(CPUID.0x19.ECX[1]) to guest VM.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
15c967ad34 hv: keylocker: Add CR4 bit CR4_KL as CR4_TRAP_AND_PASSTHRU_BITS
Bit19 (CR4_KL) of CR4 is CPU KeyLocker feature enable bit. Hypervisor
traps the bit's writing to track the keylocker feature on/off of guest.
While the bit is set by guest,
 - set cr4_kl_enabled to indicate the vcpu's keylocker feature enabled status
 - load vcpu's IWKey in host (will add in later patch)
While the bit is clear by guest,
 - clear cr4_kl_enabled

This patch trap and passthru the CR4_KL bit to guest for operation.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Li Fei1
94a980c923 hv: hypercall: prevent sos can touch hv/pre-launched VM resource
Current implementation, SOS may allocate the memory region belonging to
hypervisor/pre-launched VM to a post-launched VM. Because it only verifies
the start address rather than the entire memory region.

This patch verifies the validity of the entire memory region before
allocating to a post-launched VM so that the specified memory can only
be allocated to a post-launched VM if the entire memory region is mapped
in SOS’s EPT.

Tracked-On: #5555
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Reviewed-by: Yonghua Huang  <yonghua.huang@intel.com>
2021-02-02 16:55:40 +08:00
Yonghua Huang
8bec63a6ea hv: remove the hardcoding of Software SRAM GPA base
Currently, we hardcode the GPA base of Software SRAM
 to an address that is derived from TGL platform,
 as this GPA is identical with HPA for Pre-launch VM,
 This hardcoded address may not work on other platforms
 if the HPA bases of Software SRAM are different.

 Now, Offline tool configures above GPA based on the
 detection of Software SRAM on specific platform.

 This patch removes the hardcoding GPA of Software SRAM,
 and also renames MACRO 'SOFTWARE_SRAM_BASE_GPA' to
 'PRE_RTVM_SW_SRAM_BASE_GPA' to avoid confusing, as it
 is for Prelaunch VM only.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-30 13:41:02 +08:00
Yonghua Huang
c9ca23d268 hv: refine RTCM initialization code
- RTCM is initialized in hypervisor only
   if RTCM binaries are detected.
 - Remove address space of RTCM binary from
   Software SRAM region.
 - Refine parse_rtct() function, validity of
   ACPI RTCT table shall be checked by caller.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-28 11:29:25 +08:00
Yonghua Huang
a6e666dbe7 hv: remove hardcoding of SW SRAM HPA base
Physical address to SW SRAM region maybe different
 on different platforms, this hardcoded address may
 result in address mismatch for SW SRAM operations.

 This patch removes above hardcoded address and uses
 the physical address parsed from native RTCT.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-28 11:29:25 +08:00
Yonghua Huang
a6420e8cfa hv: cleanup legacy terminologies in RTCM module
This patch updates below terminologies according
 to the latest TCC Spec:
  PTCT -> RTCT
  PTCM -> RTCM
  pSRAM -> Software SRAM

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2021-01-28 11:29:25 +08:00
Yonghua Huang
806f479108 hv: rename RTCM source files
'ptcm' and 'ptct' are legacy name according
   to the latest TCC spec, hence rename below files
   to avoid confusing:

  ptcm.c -> rtcm.c
  ptcm.h -> rtcm.h
  ptct.h -> rtct.h

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2021-01-28 11:29:25 +08:00
Liang Yi
e8a76868c9 hv: modularization: remove global variable efiloader_sig.
Simplify multiboot API by removing the global variable efiloader_sig.
Replaced by constant at the use site.

Tracked-On: #5661
Signed-off-by: Yi Liang <yi.liang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
67926cee81 hv: modularization: remove include/boot.h.
Remove include/boot.h since it contains only assembly variables that
should only be accessed in arch/x86/init.c.

Tracked-On: #5661
Signed-off-by: Yi Liang <yi.liang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
1de396363f hv: modularization: avoid dependency of multiboot on zeropage.h.
Split off definition of "struct efi_info" into a separate header
file lib/efi.h.

Tracked-On: #5661
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
681688fbe4 hv: modularization: change of multiboot API.
The init_multiboot_info() and sanitize_multiboot_ifno() APIs now
require parameters instead of implicitly relying on global boot
variables.

Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
66599e0aa7 hv: modularization: multiboot
Calling sanitize_multiboot() from init.c instead of cpu.c.

Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
c23e557a18 hv: modularization: make parse_hv_cmdline() an internal function.
This way, we void exposing acrn_mbi as a global variable.

Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
8f9ec59a53 hv: modularization: cleanup boot.h
Move multiboot specific declarations from boot.h to multiboot.h.

Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Jie Deng
5c5d272358 hv: remove bitmap_clear_lock of split-lock after completing emulation
When "signal_event" is called, "wait_event" will actually not block.
So it is ok to remove this line.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-13 15:32:27 +08:00
Yin Fengwei
ef411d4ac3 hv: ptirq: Shouldn't change sid if intx irq mapping was added
Now, we use hash table to maintain intx irq mapping by using
the key generated from sid. So once the entry is added,we can
not update source ide any more. Otherwise, we can't locate the
entry with the key generated from new source ide.

For source id change, remove_remapping/add_remapping is used
instead of update source id directly if entry was added already.

Tracked-On: #5640
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-12 15:23:44 +08:00
Jie Deng
8aebf5526f hv: move split-lock logic into dedicated file
This patch move the split-lock logic into dedicated file
to reduce LOC. This may make the logic more clear.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Jie Deng
27d5711b62 hv: add a cache register for VMX_PROC_VM_EXEC_CONTROLS
This patch adds a cache register for VMX_PROC_VM_EXEC_CONTROLS
to avoid the frequent VMCS access.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Jie Deng
f291997811 hv: split-lock: using MTF instead of TF(#DB)
The TF is visible to guest which may be modified by
the guest, so it is not a safe method to emulate the
split-lock. While MTF is specifically designed for
single-stepping in x86/Intel hardware virtualization
VT-x technology which is invisible to the guest. Use MTF
to single step the VCPU during the emulation of split lock.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Jie Deng
6852438e3a hv: Support concurrent split-lock emulation on SMP.
For a SMP guest, split-lock check may happen on
multiple vCPUs simultaneously. In this case, one
vCPU at most can be allowed running in the
split-lock emulation window. And if the vCPU is
doing the emulation, it should never be blocked
in the hypervisor, it should go back to the guest
to execute the lock instruction immediately and
trap back to the hypervisor with #DB to complete the
split-lock emulation.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Li Fei1
0b18389d95 hv: vcpuid: expose mce feature to guest
Windows64 seems only support processor which has MCE (Machine Check Error)
feature.

Tracked-On: #5638
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2021-01-08 17:22:34 +08:00
Jie Deng
b14c32a110 hv: Retain RIP only for fault exception.
We have trapped the #DB for split-lock emulation.
Only fault exception need RIP being retained.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-31 11:12:33 +08:00
Jie Deng
977e862192 hv: Add split-lock emulation for xchg
xchg may also cause the #AC for split-lock check.
This patch adds this emulation.

 1. Kick other vcpus of the guest to stop execution
    if the guest has more than one vcpu.

 2. Emulate the xchg instruction.

 3. Notify other vcpus (if any) to restart execution.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-31 11:12:33 +08:00
Jie Deng
47e193a7bb hv: Add split-lock emulation for LOCK prefix instruction
This patch adds the split-lock emulation.
If a #AC is caused by instruction with LOCK prefix then
emulate it, otherwise, inject it back as it used to be.

 1. Kick other vcpus of the guest to stop execution
    and set the TF flag to have #DB if the guest has more
    than one vcpu.

 2. Skip over the LOCK prefix and resume the current
    vcpu back to guest for execution.

 3. Notify other vcpus to restart exception at the end
    of handling the #DB since we have completed
    the LOCK prefix instruction emulation.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-31 11:12:33 +08:00
Yonghua Huang
643bbcfe34 hv: check the availability of guest CR4 features
Check hardware support for all features in CR4,
 and hide bits from guest by vcpuid if they're not supported
 for guests OS.

Tracked-On: #5586
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-18 11:21:22 +08:00
Yonghua Huang
442fc30117 hv: refine virtualization flow for cr0 and cr4
- The current code to virtualize CR0/CR4 is not
   well designed, and hard to read.
   This patch reshuffle the logic to make it clear
   and classify those bits into PASSTHRU,
   TRAP_AND_PASSTHRU, TRAP_AND_EMULATE & reserved bits.

Tracked-On: #5586
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2020-12-18 11:21:22 +08:00