Commit Graph

1633 Commits

Author SHA1 Message Date
Liang Yi
3a50f949e1 hv/mod_irq: split irq.c into arch/x86/irq.c and common/irq.c
The common irq file is responsible for managing the central
irq_desc data structure and provides the following APIs for
host interrupt handling.
- init_interrupt()
- reserve_irq_num()
- request_irq()
- free_irq()
- set_irq_trigger_mode()
- do_irq()

API prototypes, constant and data structures belonging to common
interrupt handling are all moved into include/common/irq.h.

Conversely, the following arch specific APIs are added which are
called from the common code at various points:
- init_irq_descs_arch()
- setup_irqs_arch()
- init_interrupt_arch()
- free_irq_arch()
- request_irq_arch()
- pre_irq_arch()
- post_irq_arch()

Tracked-On: #5825
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-03-24 11:38:14 +08:00
Liang Yi
c46e3c71ac hv/mod_irq: decouple irq number reservation from ioapic
This is done be adding irq_rsvd_bitmap as an auxiliary bitmap
besides irq_alloc_bitmap.

Tracked-On: #5825
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-03-24 11:38:14 +08:00
Liang Yi
f3cae9e258 hv/mod_irq: hide arch specific data in irq_desc
Arch specific IRQ data is now an opaque pointer in irq_desc.

This is a preparation step for spliting IRQ handling into common
and architecture specific parts.

Tracked-On: #5825
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-03-24 11:38:14 +08:00
Li Fei1
9000381f34 hv: pgtable: move pgtable definition to pgtable.h
This patch moves pgtable definition to pgtable.h and include the proper
header file for page module.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-11 13:48:52 +08:00
Li Fei1
0278a3f46e hv: pgatble: move the EPT page table related APIs to ept.c
Move the EPT page table related APIs to ept.c. page module only provides APIs to
allocate/free page for page table page. pagetabl module only provides APIs to
add/modify/delete/lookup page table entry. The page pool and the page table
related APIs for EPT should defined in EPT module.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-03-11 13:48:52 +08:00
Li Fei1
5c71ca456a hv: pgatble: move the MMU page table related APIs to mmu.c
Move the MMU page table related APIs to mmu.c. page module only provides APIs to
allocate/free page for page table page. pagetabl module only provides APIs to
add/modify/delete/lookup page table entry. The page pool and the page table
related APIs for MMU should defined in MMU module.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-03-11 13:48:52 +08:00
Li Fei1
15d68675e9 hv: pgtable: separate common APIs for MMU/EPT
We would move the MMU page table related APIs to mmu.c and move the EPT related
APIs to EPT.c. The page table module only provides APIs to add/modify/delete/lookup
page table entry.

This patch separates common APIs and adds separate APIs of page table module
for MMU/EPT.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2021-03-11 13:48:52 +08:00
Li Fei1
80bd3ac02a hv: trusty: move post_uos_sworld_memory into vm.c
post_uos_sworld_memory are used for post-launched VM which support trusty.
It's more VM related. So move it definition into vm.c

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-11 13:48:52 +08:00
Yonghua Huang
1a011bd91b hv: disable guest MONITOR-WAIT support when SW SRAM is configured
Per-core software SRAM L2 cache may be flushed by 'mwait'
extension instruction, which guest VM may execute to enter
core deep sleep. Such kind of flushing is not expected when
software SRAM is enabled for RTVM.

Hypervisor disables MONITOR-WAIT support on both hypervisor
and VMs sides to protect above software SRAM from being flushed.

This patch disable ACRN guest MONITOR-WAIT support if software
SRAM is configured.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-11 09:42:44 +08:00
Yonghua Huang
ae43b2a847 hv: disable host MONITOR-WAIT support when SW SRAM is enabled
Per-core software SRAM L2 cache may be flushed by 'mwait'
extension instruction, which guest VM may execute to enter
core deep sleep. Such kind of flushing is not expected when
software SRAM is enabled for RTVM.

Hypervisor disables MONITOR-WAIT support on both hypervisor
and VMs sides to protect above software SRAM from being flushed.

This patch disable hypervisor(host) MONITOR-WAIT support and refine
software sram initializaion flow.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-11 09:42:44 +08:00
Yonghua Huang
ea44bb6c4d hv: wrap function to check software SRAM support
Below boolean function are defined in this patch:
 - is_software_sram_enabled() to check if SW SRAM
   feature is enabled or not.
 - set global variable 'is_sw_sram_initialized'
   to file static.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-11 09:42:44 +08:00
Li Fei1
768e483cd2 hv: pgtable: rename 'struct memory_ops' to 'struct pgtable'
The fields and APIs in old 'struct memory_ops' are used to add/modify/delete
page table (page or entry). So rename 'struct memory_ops' to 'struct pgtable'.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-10 11:42:13 +08:00
Li Fei1
ef98fa69ce hv: pgtable: remove get_default_access_right API
Use default_access_right field to replace get_default_access_right API.

Tracked-On: #5830
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-03-10 11:42:13 +08:00
Li Fei1
1db32f4d03 hv: ept: build 4KB page mapping in EPT for code pages of rtvm
RTVM is enforced to use 4KB pages to mitigate CVE-2018-12207 and performance jitter,
which may be introduced by splitting large page into 4KB pages on demand. It works
fine in previous hardware platform where the size of address space for the RTVM is
relatively small. However, this is a problem when the platforms support 64 bits
high MMIO space, which could be super large and therefore consumes large # of
EPT page table pages.

This patch optimize it by using large page for purely data pages, such as MMIO spaces,
even for the RTVM.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
2021-03-03 13:46:49 +08:00
Li Fei1
0579e2ee24 hv: page: add free_page
Add free_page to free page when unmap pagetable.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Tracked-On: #5788
2021-03-01 13:10:04 +08:00
Li Fei1
8d9f12f3b7 hv: page: use dynamic page allocation for pagetable mapping
For FuSa's case, we remove all dynamic memory allocation use in ACRN HV. Instead,
we use static memory allocation or embedded data structure. For pagetable page,
we prefer to use an index (hva for MMU, gpa for EPT) to get a page from a special
page pool. The special page pool should be big enougn for each possible index.
This is not a big problem when we don't support 64 bits MMIO. Without 64 bits MMIO
support, we could use the index to search addrss not larger than DRAM_SIZE + 4G.

However, if ACRN plan to support 64 bits MMIO in SOS, we could not use the static
memory alocation any more. This is because there's a very huge hole between the
top DRAM address and the bottom 64 bits MMIO address. We could not reserve such
many pages for pagetable mapping as the CPU physical address bits may very large.

This patch will use dynamic page allocation for pagetable mapping. We also need
reserve a big enough page pool at first. For HV MMU, we don't use 4K granularity
page table mapping, we need reserve PML4, PDPT and PD pages according the maximum
physical address space (PPT va and pa are identical mapping); For each VM EPT,
we reserve PML4, PDPT and PD pages according to the maximum physical address space
too, (the EPT address sapce can't beyond the physical address space), and we reserve
PT pages by real use cases of DRAM, low MMIO and high MMIO.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Tracked-On: #5788
2021-03-01 13:10:04 +08:00
Li Fei1
5621fabbcb hv: memory: remove get_sworld_memory_base API
memory_ops structure will be changed to store page table related fields.
However, secure world memory base address is not one of them, it's VM
related. So save sworld_memory_base_hva in vm_arch structure directly.

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Tracked-On: #5788
2021-03-01 13:10:04 +08:00
Yonghua Huang
fdfd28b140 hv: unmap software region of pre-RTVM from Service VM EPT
Accessing to software SRAM region is not allowed when
 software SRAM is pass-thru to prelaunch RTVM.

 This patch removes software SRAM region from service VM
 EPT if it is enabled for prelaunch RTVM.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2021-02-25 09:35:31 +08:00
Sainath Grandhi
80a91987f4 hv: Fix incorrect struct definition for ir_bits
Fixing an incorrect struct definition for ir_bits in ioapic_rte. Since bits after
the delivery status in the lower 32 bits are not touched by code,
this has never showed up as an issue. And the higher 32 bits in the RTE
are aligned by the compiler.

Tracked-On: #5773
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
2021-02-25 09:34:49 +08:00
Tao Yuhong
50d8525618 HV: deny HV owned PCI bar access from SOS
This patch denies Service VM the access permission to device resources
owned by hypervisor.
HV may own these devices: (1) debug uart pci device for debug version
(2) type 1 pci device if have pre-launched VMs.
Current implementation exposes the mmio/pio resource of HV owned devices
to SOS, should remove them from SOS.

Tracked-On: #5615
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
2021-02-03 14:01:23 +08:00
Tao Yuhong
6e7ce4a73f HV: deny pre-launched VM ptdev bar access from SOS
This patch denies Service VM the access permission to device
resources owned by pre-launched VMs.
Rationale:
 * Pre-launched VMs in ACRN are independent of service VM,
   and should be immune to attacks from service VM. However,
   current implementation exposes the bar resource of passthru
   devices to service VM for some reason. This makes it possible
   for service VM to crash or attack pre-launched VMs.
 * It is same for hypervisor owned devices.

NOTE:
 * The MMIO spaces pre-allocated to VFs are still presented to
  Service VM. The SR-IOV capable devices assigned to pre-launched
  VMs doesn't have the SR-IOV capability. So the MMIO address spaces
  pre-allocated by BIOS for VFs are not decoded by hardware and
  couldn't be enabled by guest. SOS may live with seeing the address
  space or not. We will revisit later.

Tracked-On: #5615
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 14:01:23 +08:00
Shuo A Liu
d4aaf99d86 hv: keylocker: Support keylocker backup MSRs for Guest VM
The logical processor scoped IWKey can be copied to or from a
platform-scope storage copy called IWKeyBackup. Copying IWKey to
IWKeyBackup is called ‘backing up IWKey’ and copying from IWKeyBackup to
IWKey is called ‘restoring IWKey’.

IWKeyBackup and the path between it and IWKey are protected against
software and simple hardware attacks. This means that IWKeyBackup can be
used to distribute an IWKey within the logical processors in a platform
in a protected manner.

Linux keylocker implementation uses this feature, so they are
introduced by this patch.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
38cd5b481d hv: keylocker: host keylocker iwkey context switch
Different vCPU may have different IWKeys. Hypervisor need do the iwkey
context switch.

This patch introduce a load_iwkey() function to do that. Switches the
host iwkey when the switch_in vCPU satisfies:
  1) keylocker feature enabled
  2) Different from the current loaded one.

Two opportunities to do the load_iwkey():
  1) Guest enables CR4.KL bit.
  2) vCPU thread context switch.

load_iwkey() costs ~600 cycles when do the load IWKey action.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
c11c07e0fe hv: keylocker: Support Key Locker feature for guest VM
KeyLocker is a new security feature available in new Intel CPUs that
protects data-encryption keys for the Advanced Encryption Standard (AES)
algorithm. These keys are more valuable than what they guard. If stolen
once, the key can be repeatedly used even on another system and even
after vulnerability closed.

It also introduces a CPU-internal wrapping key (IWKey), which is a key-
encryption key to wrap AES keys into handles. While the IWKey is
inaccessible to software, randomizing the value during the boot-time
helps its value unpredictable.

Keylocker usage:
 - New “ENCODEKEY” instructions take original key input and returns HANDLE
   crypted by an internal wrap key (IWKey, init by “LOADIWKEY” instruction)
 - Software can then delete the original key from memory
 - Early in boot/software, less likely to have vulnerability that allows
   stealing original key
 - Later encrypt/decrypt can use the HANDLE through new AES KeyLocker
   instructions
 - Note:
      * Software can use original key without knowing it (use HANDLE)
      * HANDLE cannot be used on other systems or after warm/cold reset
      * IWKey cannot be read from CPU after it's loaded (this is the
        nature of this feature) and only 1 copy of IWKey inside CPU.

The virtualization implementation of Key Locker on ACRN is:
 - Each vCPU has a 'struct iwkey' to store its IWKey in struct
   acrn_vcpu_arch.
 - At initilization, every vCPU is created with a random IWKey.
 - Hypervisor traps the execution of LOADIWKEY (by 'LOADIWKEY exiting'
   VM-exectuion control) of vCPU to capture and save the IWKey if guest
   set a new IWKey. Don't support randomization (emulate CPUID to
   disable) of the LOADIWKEY as hypervisor cannot capture and save the
   random IWKey. From keylocker spec:
   "Note that a VMM may wish to enumerate no support for HW random IWKeys
   to the guest (i.e. enumerate CPUID.19H:ECX[1] as 0) as such IWKeys
   cannot be easily context switched. A guest ENCODEKEY will return the
   type of IWKey used (IWKey.KeySource) and thus will notice if a VMM
   virtualized a HW random IWKey with a SW specified IWKey."
 - In context_switch_in() of each vCPU, hypervisor loads that vCPU's
   IWKey into pCPU by LOADIWKEY instruction.
 - There is an assumption that ACRN hypervisor will never use the
   KeyLocker feature itself.

This patch implements the vCPU's IWKey management and the next patch
implements host context save/restore IWKey logic.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
4483e93bd1 hv: keylocker: Enable the tertiary VM-execution controls
In order for a VMM to capture the IWKey values of guests, processors
that support Key Locker also support a new "LOADIWKEY exiting"
VM-execution control in bit 0 of the tertiary processor-based
VM-execution controls.

This patch enables the tertiary VM-execution controls.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
e9247dbca0 hv: keylocker: Simulate CPUID of keylocker caps for guest VM
KeyLocker is a new security feature available in new Intel CPUs that
protects data-encryption keys for the Advanced Encryption Standard (AES)
algorithm.

This patch emulates Keylocker CPUID leaf 19H to support Keylocker
feature for guest VM.

To make the hypervisor being able to manage the IWKey correctly, this
patch doesn't expose hardware random IWKey capability
(CPUID.0x19.ECX[1]) to guest VM.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2021-02-03 13:54:45 +08:00
Shuo A Liu
15c967ad34 hv: keylocker: Add CR4 bit CR4_KL as CR4_TRAP_AND_PASSTHRU_BITS
Bit19 (CR4_KL) of CR4 is CPU KeyLocker feature enable bit. Hypervisor
traps the bit's writing to track the keylocker feature on/off of guest.
While the bit is set by guest,
 - set cr4_kl_enabled to indicate the vcpu's keylocker feature enabled status
 - load vcpu's IWKey in host (will add in later patch)
While the bit is clear by guest,
 - clear cr4_kl_enabled

This patch trap and passthru the CR4_KL bit to guest for operation.

Tracked-On: #5695
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-02-03 13:54:45 +08:00
Li Fei1
94a980c923 hv: hypercall: prevent sos can touch hv/pre-launched VM resource
Current implementation, SOS may allocate the memory region belonging to
hypervisor/pre-launched VM to a post-launched VM. Because it only verifies
the start address rather than the entire memory region.

This patch verifies the validity of the entire memory region before
allocating to a post-launched VM so that the specified memory can only
be allocated to a post-launched VM if the entire memory region is mapped
in SOS’s EPT.

Tracked-On: #5555
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Reviewed-by: Yonghua Huang  <yonghua.huang@intel.com>
2021-02-02 16:55:40 +08:00
Yonghua Huang
8bec63a6ea hv: remove the hardcoding of Software SRAM GPA base
Currently, we hardcode the GPA base of Software SRAM
 to an address that is derived from TGL platform,
 as this GPA is identical with HPA for Pre-launch VM,
 This hardcoded address may not work on other platforms
 if the HPA bases of Software SRAM are different.

 Now, Offline tool configures above GPA based on the
 detection of Software SRAM on specific platform.

 This patch removes the hardcoding GPA of Software SRAM,
 and also renames MACRO 'SOFTWARE_SRAM_BASE_GPA' to
 'PRE_RTVM_SW_SRAM_BASE_GPA' to avoid confusing, as it
 is for Prelaunch VM only.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-30 13:41:02 +08:00
Yonghua Huang
a6e666dbe7 hv: remove hardcoding of SW SRAM HPA base
Physical address to SW SRAM region maybe different
 on different platforms, this hardcoded address may
 result in address mismatch for SW SRAM operations.

 This patch removes above hardcoded address and uses
 the physical address parsed from native RTCT.

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2021-01-28 11:29:25 +08:00
Yonghua Huang
a6420e8cfa hv: cleanup legacy terminologies in RTCM module
This patch updates below terminologies according
 to the latest TCC Spec:
  PTCT -> RTCT
  PTCM -> RTCM
  pSRAM -> Software SRAM

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2021-01-28 11:29:25 +08:00
Yonghua Huang
806f479108 hv: rename RTCM source files
'ptcm' and 'ptct' are legacy name according
   to the latest TCC spec, hence rename below files
   to avoid confusing:

  ptcm.c -> rtcm.c
  ptcm.h -> rtcm.h
  ptct.h -> rtct.h

Tracked-On: #5649
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2021-01-28 11:29:25 +08:00
Liang Yi
1de396363f hv: modularization: avoid dependency of multiboot on zeropage.h.
Split off definition of "struct efi_info" into a separate header
file lib/efi.h.

Tracked-On: #5661
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Liang Yi
8f9ec59a53 hv: modularization: cleanup boot.h
Move multiboot specific declarations from boot.h to multiboot.h.

Tracked-On: #5661
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2021-01-27 15:59:47 +08:00
Tao Yuhong
6c6fa5f340 Fix: HV: keep reshuffling on VBARs
The commit 'Fix: HV: VM OS failed to assign new address to pci-vuart
BARs' need more reshuffle.

Tracked-On: #5491
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
2021-01-15 15:00:01 +08:00
Jie Deng
8aebf5526f hv: move split-lock logic into dedicated file
This patch move the split-lock logic into dedicated file
to reduce LOC. This may make the logic more clear.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Jie Deng
27d5711b62 hv: add a cache register for VMX_PROC_VM_EXEC_CONTROLS
This patch adds a cache register for VMX_PROC_VM_EXEC_CONTROLS
to avoid the frequent VMCS access.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
2021-01-08 17:37:20 +08:00
Tao Yuhong
84752ab229 Fix: HV: VM OS failed to assign new address to pci-vuart BARs
When wrong BAR address is set for pci-vuart, OS may assign a
new BAR address to it. Pci-vuart BAR can't be reprogrammed,
for its wrong fixed value. That can may because pci_vbar.fixed and
pci_vbar.type has overlap in abstraction, pci_vbar.fixed
has a confusing name, pci_vbar.type has PCIBAR_MEM64HI which is not
really a type of pci BARs.
So replace pci_vbar.type with pci_vbar.is_mem64hi, and change
pci_vbar.fixed to an union type with new name pci_vbar.bar_type.

Tracked-On: #5491
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
2021-01-08 17:20:56 +08:00
Jie Deng
977e862192 hv: Add split-lock emulation for xchg
xchg may also cause the #AC for split-lock check.
This patch adds this emulation.

 1. Kick other vcpus of the guest to stop execution
    if the guest has more than one vcpu.

 2. Emulate the xchg instruction.

 3. Notify other vcpus (if any) to restart execution.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-31 11:12:33 +08:00
Jie Deng
47e193a7bb hv: Add split-lock emulation for LOCK prefix instruction
This patch adds the split-lock emulation.
If a #AC is caused by instruction with LOCK prefix then
emulate it, otherwise, inject it back as it used to be.

 1. Kick other vcpus of the guest to stop execution
    and set the TF flag to have #DB if the guest has more
    than one vcpu.

 2. Skip over the LOCK prefix and resume the current
    vcpu back to guest for execution.

 3. Notify other vcpus to restart exception at the end
    of handling the #DB since we have completed
    the LOCK prefix instruction emulation.

Tracked-On: #5605
Signed-off-by: Jie Deng <jie.deng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-31 11:12:33 +08:00
Yonghua Huang
643bbcfe34 hv: check the availability of guest CR4 features
Check hardware support for all features in CR4,
 and hide bits from guest by vcpuid if they're not supported
 for guests OS.

Tracked-On: #5586
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-12-18 11:21:22 +08:00
Yonghua Huang
442fc30117 hv: refine virtualization flow for cr0 and cr4
- The current code to virtualize CR0/CR4 is not
   well designed, and hard to read.
   This patch reshuffle the logic to make it clear
   and classify those bits into PASSTHRU,
   TRAP_AND_PASSTHRU, TRAP_AND_EMULATE & reserved bits.

Tracked-On: #5586
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2020-12-18 11:21:22 +08:00
Yonghua Huang
08c42f91c9 hv: rename hypercall for hv-emulated device management
Coding style cleanup, use add/remove instead of create/destroy.

Tracked-On: #5586
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2020-12-07 16:25:17 +08:00
Peter Fang
68dc8d9f8f hv: pm: avoid duplicate shutdowns on RTVM
It is possible for more than one vCPUs to trigger shutdown on an RTVM.
We need to avoid entering VM_READY_TO_POWEROFF state again after the
RTVM has been paused or shut down.

Also, make sure an RTVM enters VM_READY_TO_POWEROFF state before it can
be paused.

v1 -> v2:
- rename to poweroff_if_rt_vm for better clarity

Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
2020-11-11 14:05:39 +08:00
dongshen
ca5683f78d hv: add support for shutdown for pre-launched VMs
Currently, ACRN only support shutdown when triple fault happens, because ACRN
doesn't present/emulate a virtual HW, i.e. port IO, to support shutdown. This
patch emulate a virtual shutdown component, and the vACPI method for guest OS
to use.

Pre-launched VM uses ACPI reduced HW mode, intercept the virtual sleep control/status
registers for pre-launched VMs shutdown

Tracked-On: #5411
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-11-04 10:33:31 +08:00
Peter Fang
70b1218952 hv: pm: support shutting down multiple VMs when pCPUs are shared
More than one VM may request shutdown on the same pCPU before
shutdown_vm_from_idle() is called in the idle thread when pCPUs are
shared among VMs.

Use a per-pCPU bitmap to store all the VMIDs requesting shutdown.

v1 -> v2:
- use vm_lock to avoid a race on shutdown

Tracked-On: #5411
Signed-off-by: Peter Fang <peter.fang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-11-04 10:33:31 +08:00
Li Fei1
f3067f5385 hv: mmu: rename hv_access_memory_region_update to ppt_clear_user_bit
Rename hv_access_memory_region_update to ppt_clear_user_bit to
verb + object style.

Tracked-On: #5330
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-11-02 10:29:43 +08:00
Li Fei1
35abee60d6 hv: pSRAM: temporarily remove NX bit of PTCM binary
Temporarily remove NX bit of PTCM binary in pagetable during pSRAM
initialization:
1.added a function ppt_set_nx_bit to temporarily remove/restore the NX bit of
a given area in pagetable.
2.Temporarily remove NX bit of PTCM binary during pSRAM initialization to make
PTCM codes executable.
3. TODO: We may use SMP call to flush TLB and do pSRAM initilization on APs.

Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-11-02 10:29:43 +08:00
Li Fei1
5fa816f921 hv: pSRAM: add PTCT parsing code
The added parse_ptct function will parse native ACPI PTCT table to
acquire information like pSRAM location/size/level and PTCM location,
and save them.

Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
2020-11-02 10:29:43 +08:00
Li Fei1
80121b8347 hv: pSRAM: add pSRAM initialization codes
1.We added a function init_psram to initialize pSRAM as well as some definitions.
Both AP and BSP shall call init_psram to make sure pSRAM is initialized, which is
required by PTCM.

BSP:
  To parse PTCT and find the entry of PTCM command function, then call PTCM ABI.
AP:
  Wait until BSP has done the parsing work, then call the PTCM ABI.

Synchronization of AP and BSP is ensured, both inside and outside PTCM.

2. Added calls of init_psram in init_pcpu_post to initialize pSRAM in HV booting phase

Tracked-On: #5330
Signed-off-by: Qian Wang <qian1.wang@intel.com>
2020-11-02 10:29:43 +08:00
Tao Yuhong
996e8f680c HV: pci-vuart support create vdev hcall
Add cteate method for vmcs9900 vdev in hypercalls.

The destroy method of ivshmem is also suitable for other emulated vdev,
move it into hcall_destroy_vdev() for all emulated vdevs

Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2020-10-30 20:41:34 +08:00
Tao Yuhong
691abe90ff HV: vuart: send msi for pci vuart type
if vuart type is pci-vuart, then use MSI interrupt

split vuart_toggle_intr() control flow into vuart_trigger_level_intr() &
trigger_vmcs9900_msix(), because MSI is edge triggered, no deassertion
operation. Only trigger MSI for pci-vuart when assert interrupt.

Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2020-10-30 20:41:34 +08:00
Tao Yuhong
55b7fae67a HV: pci-vuart: pci based vuart emulation
Add emulation for pci based vuart device mcs9900 at hv land.
add struct pci_vdev_ops vuart_pci_ops, the vdev callbalks for vuart.

How to use
In misc/vm_configs/scenarios/<SCENARIO>/<BOARD>/pci_dev.c, add pci
vuart config to vm_pci_devs[] array. For example:

struct acrn_vm_pci_dev_config vm0_pci_devs[] = {
       /* console vuart setting*/
       {
               .emu_type = PCI_DEV_TYPE_HVEMUL,
               .vbdf.bits = {.b = 0x00U, .d = 0x04U, .f = 0x00U},
               .vdev_ops = &vmcs_ops,
               .vbar_base[0] = 0x80001000,	/* mmio bar */
               .vbar_base[1] = 0x80002000,	/* msix bar */
               .vuart_idx = 0,
       },
       /* communication vuart setting */
       {
               .emu_type = PCI_DEV_TYPE_HVEMUL,
               .vbdf.bits = {.b = 0x00U, .d = 0x05U, .f = 0x00U},
               .vdev_ops = &vmcs_ops,
               .vbar_base[0] = 0x80003000,
               .vbar_base[1] = 0x80004000,
               .vuart_idx = 1,
               .t_vuart.vm_id = 1U,
               .t_vuart.vuart_id = 1U,
       },
}

Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2020-10-30 20:41:34 +08:00
Tao Yuhong
4120bd391a HV: decouple legacy vuart interface from acrn_vuart layer
support pci-vuart type, and refine:
1.Rename init_vuart() to init_legacy_vuarts(), only init PIO type.
2.Rename deinit_vuart() to deinit_legacy_vuarts(), only deinit PIO type.
3.Move io handler code out of setup_vuart(), into init_legacy_vuarts()
4.add init_pci_vuart(), deinit_pci_vuart, for one pci vuart vdev.

and some change from requirement:
1.Increase MAX_VUART_NUM_PER_VM to 8.

Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
2020-10-30 20:41:34 +08:00
Tao Yuhong
6ed7b8767c HV: vuart: refine vuart read/write
The vuart_read()/vuart_write() are coupled with PIO vuart type. Move
the non-type related code into vuart_read_reg()/vuart_write_reg(), so
that we can re-use them to handle MMIO request of pci-vuart type.

Tracked-On: #5394
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-30 20:41:34 +08:00
Yang, Yu-chu
8c78590da7 acrn-config: refactor pci_dev_c.py and insert vuart device information
- Refactor pci_dev_c.py to insert devices information per VMs
- Add function to get unused vbdf form bus:dev.func 00:00.0 to 00:1F.7

Add pci devices variables to vm_configurations.c
- To pass the pci vuart information form tool, add pci_dev_num and
pci_devs initialization by tool
- Change CONFIG_SOS_VM in hypervisor/include/arch/x86/vm_config.h to
compromise vm_configurations.c

Tracked-On: #5426
Signed-off-by: Yang, Yu-chu <yu-chu.yang@intel.com>
2020-10-30 20:24:28 +08:00
David B. Kinder
bb6b226c86 doc: fix doxygen 1.8.17 issues
The new (1.8.17) release of doxygen is complaining about errors in the
doxygen comments that were's reported by our current 1.8.13 release.
Let's fix these now. In a separate PR we'll also update some
configuration settings that will be obsolete, in preparation for moving
to this newer version.

[External_System_ID]ACRN-6774

Tracked-On: #5385

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
2020-10-29 08:25:01 -07:00
Zide Chen
a776ccca94 hv: don't need to save boot context
- Since de-privilege boot is removed, we no longer need to save boot
  context in boot time.
- cpu_primary_start_64 is not an entry for ACRN hypervisor any more,
  and can be removed.

Tracked-On: #5197
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2020-10-29 10:05:05 +08:00
Yonghua Huang
9b4ba19753 hv: enable doorbell for hv-land ivshmem device
This patch enables doorbell feature for hv-land
ivshmem device to support interrupt notification
between VMs that use inter-VM(ivshmem) devices.

Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-26 08:44:13 +08:00
Yonghua Huang
3ea1ae1e11 hv: refine msi interrupt injection functions
1. refine the prototype of 'inject_msi_lapic_pt()'
 2. rename below function:
    - rename 'vlapic_intr_msi()' to 'vlapic_inject_msi()'
    - rename 'inject_msi_lapic_pt()' to
      'inject_msi_for_lapic_pt()'
    - rename 'inject_msi_lapic_virt()' to
      'inject_msi_for_non_lapic_pt()'

Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-26 08:44:13 +08:00
Yonghua Huang
012927d0bd hv: move function 'inject_msi_lapic_pt()' to vlapic.c
This function can be used by other modules instead of hypercall
 handling only, hence move it to vlapic.c

Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-26 08:44:13 +08:00
Yonghua Huang
f511a71c4e hv: add vmsix capability for hv-land ivshmem device
This patch exposes vmsix capability for ivshmem
  device:
  - Initialize vmsix capability in ivshmem PCI
    config space.
  - Expose BAR1 as vmsix entry table bar.

Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-26 08:44:13 +08:00
Yonghua Huang
8137e49e17 hv: add functions to initialize vmsix capability
- add 'vpci_add_capability()' to initialize one
   PCI capability in config space.
 - add 'add_vmsix_capability()' to add vmsix capability.

Tracked-On: #5407
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-10-26 08:44:13 +08:00
Zide Chen
bebffb29fc hv: remove de-privilege boot mode support and remove vboot wrappers
Now ACRN supports direct boot mode, which could be SBL/ABL, or GRUB boot.
Thus the vboot wrapper layer can be removed and the direct boot functions
don't need to be wrapped in direct_boot.c:

- remove call to init_vboot(), and call e820_alloc_memory() directly at the
  time when the trampoline buffer is actually needed.
- Similarly, call CPU_IRQ_ENABLE() instead of the wrapper init_vboot_irq().
- remove get_ap_trampoline_buf(), since the existing function
  get_trampoline_start16_paddr() returns the exact same value.
- merge init_general_vm_boot_info() into init_vm_boot_info().
- remove vm_sw_loader pointer, and call direct_boot_sw_loader() directly.
- move get_rsdp_ptr() from vboot_wrapper.c to multiboot.c, and remove the
  wrapper over two boot modes.

Tracked-On: #5197
Signed-off-by: Zide Chen <zide.chen@intel.com>
2020-10-21 15:09:26 +08:00
Li Fei1
a2fd8c5a9d pci: mcfg: limit device bus numbers which could access by ECAM
Per PCI Firmware Specification Revision 3.0, 4.1.2. MCFG Table Description:
Memory Mapped Enhanced Configuration Space Base Address Allocation Structure
assign the Start Bus Number and the End Bus Number which could decoded by the
Host Bridge. We should not access the PCI device which bus number outside of
the range of [Start Bus Number, End Bus Number).
For ACRN,  we should:
1. Don't detect PCI device which bus number outside the range of
[Start Bus Number, End Bus Number) of MCFG ACPI Table.
2. Only trap the ECAM MMIO size: [MMCFG_BASE_ADDRESS, MMCFG_BASE_ADDRESS +
(End Bus Number - Start Bus Number + 1) * 0x100000) for SOS.

Tracked-On: #5233

Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-09 09:31:56 +08:00
Victor Sun
2c0bc146ce HV: remove deprecated vacpi build method
The old method of build pre-launched VM vacpi by HV source code is deprecated,
so remove related source code;

Tracked-On: #5266

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-08 19:52:25 +08:00
Victor Sun
34547e1e19 HV: add acpi module support for pre-launched VM
Previously we use a pre-defined structure as vACPI table for pre-launched
VM, the structure is initialized by HV code. Now change the method to use a
pre-loaded multiboot module instead. The module file will be generated by
acrn-config tool and loaded to GPA 0x7ff00000, a hardcoded RSDP table at
GPA 0x000f2400 will point to the XSDT table which at GPA 0x7ff00080;

Tracked-On: #5266

Signed-off-by: Victor Sun <victor.sun@intel.com>
Signed-off-by: Shuang Zheng <shuang.zheng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-08 19:52:25 +08:00
Victor Sun
da81a0041d HV: add e820 ACPI entry for pre-launched VM
Previously the ACPI table was stored in F segment which might not be big
enough for a customized ACPI table, hence reserve 1MB space in pre-launched
VM e820 table to store the ACPI related data:

	0x7ff00000 ~ 0x7ffeffff : ACPI Reclaim memory
	0x7fff0000 ~ 0x7fffffff : ACPI NVS memory

Tracked-On: #5266

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-08 19:52:25 +08:00
Nishioka, Toshiki
77fb21e98c hv: add vgpio device model support
When HV pass through the P2SB MMIO device to pre-launched VM, vgpio
device model traps MMIO access to the GPIO registers within P2SB so
that it can expose virtual IOAPIC pins to the VM in accordance with
the programmed mappings between gsi and vgsi.

Tracked-On: #5246

Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-07 14:52:02 +08:00
Nishioka, Toshiki
ba99984f69 hv: add INTx mapping for pre-launched VMs
Add the capability of forwarding specified physical IOAPIC interrupt
lines to pre-launched VMs as virtual IOAPIC interrupts. This is for the
sake of the certain MMIO pass-thru devices on EHL CRB which can support
only INTx interrupts.

Tracked-On: #5245

Signed-off-by: Toshiki Nishioka <toshiki.nishioka@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-09-07 14:52:02 +08:00
dongshen
3880e6186e hv: add pt_intx related members to struct acrn_vm_config
On EHL platform, we need to expose GPIO chassis interrupt to pre-launched VM
as INTx. Add related data structures so that they can be used in subsequent
commits.

Tracked-On: #5241
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-09-01 09:35:50 +08:00
dongshen
10d4773f1d hv: add a new field pt_p2sb_bar to struct acrn_vm_config
On EHL platform, we need to pass through P2SB bridge to pre-launched VM.
Use pt_p2sb_bar to indicate whether to passthru p2sb bridge to pre-launched VM
or not.

Tracked-On: #5221
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-09-01 09:35:50 +08:00
Yonghua Huang
c03623f3fb hv[v2]: Remove deprecated term in vPIC submodule
This patch cleanup below deprecated terms:
  'master' -> 'primary'
  'slave'  -> 'secondary'

v2 update:
      Refine comments.

Tracked-On: #5249
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2020-09-01 09:30:08 +08:00
Yuan Liu
6d0f0ebd8a hv: implement ivshmem device creation and destruction
For ivshmem vdev creation, the vdev vBDF, vBARs, shared memory region
name and size are set by device model. The shared memory name and size
must be same as the corresponding device configuration which is configured
by offline tool.

v3: add a comment to the vbar_base member of the acrn_vm_pci_dev_config
    structure that vbar_base is power-on default value

Tracked-On: #4853

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-28 16:53:12 +08:00
Yuan Liu
8a34cf03ca hv: add new hypercalls to create and destroy an emulated device in hypervisor
Add HC_CREATE_VDEV and HC_DESTROY_VDEV two hypercalls that are used to
create and destroy an emulated device(PCI device or legacy device) in hypervisor

v3: 1) change HC_CREATE_DEVICE and HC_DESTROY_DEVICE to HC_CREATE_VDEV
       and HC_DESTROY_VDEV
    2) refine code style

v4: 1) remove unnecessary parameter
    2) add VM state check for HC_CREATE_VDEV and HC_DESTROY hypercalls

Tracked-On: #4853

Reviewed-by: Wang, Yu1 <yu1.wang@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-28 16:53:12 +08:00
Wei Liu
29ac258134 acrn-config: code refactoring for CAT/MBA
1.Modify clos_mask and mba_delay as a member of the union type.
2.Move HV_SUPPORTED_MAX_CLOS ,MAX_CACHE_CLOS_NUM_ENTRIES and
MAX_MBA_CLOS_NUM_ENTRIES to misc_cfg.h file.

Tracked-On: #5229
Signed-off-by: Wei Liu <weix.w.liu@intel.com>
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-08-28 16:44:06 +08:00
dongshen
a425730f64 acrn-config: rename MAX_PLATFORM_CLOS_NUM to HV_SUPPORTED_MAX_CLOS
HV_SUPPORTED_MAX_CLOS:
 This value represents the maximum CLOS that is allowed by ACRN hypervisor.
 This value is set to be least common Max CLOS (CPUID.(EAX=0x10,ECX=ResID):EDX[15:0])
 among all supported RDT resources in the platform. In other words, it is
 min(maximum CLOS of L2, L3 and MBA). This is done in order to have consistent
 CLOS allocations between all the RDT resources.

Tracked-On: #5229
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
2020-08-28 16:44:06 +08:00
Yin Fengwei
d0e06c4f80 hv: debug: Enable MMIO UART support
New board, EHL CRB, does not have legacy port IO UART. Even the PCI UART
are not work due to BIOS's bug workaround(the BARs on LPSS PCI are reset
after BIOS hand over control to OS). For ACRN console usage, expose the
debug UART via ACPI PnP device (access by MMIO) and add support in
hypervisor debug code.

Another special thing is that register width of UART of EHL CRB is
1byte. Introduce reg_width for each struct console_uart.

Tracked-On: #4937
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2020-08-27 13:31:17 +08:00
Mingqiang Chi
53b11d1048 refine hypercall
-- use an array to fast locate the hypercall handler
   to replace switch case.
-- uniform hypercall handler as below:
   int32_t (*handler)(sos_vm, target_vm, param1, param2)

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2020-08-26 14:55:24 +08:00
Yuan Liu
43683c7fc9 hv: implement ivshmem device-specific registers emulation
Ivshmem device defines four registers including Interrupt Mask, Interrupt
Status, IVPostion and Doorbell. The first two are useless and no emulation
is required. The latter two are used for interrupts and will be implemented
in the future.

This patch also introduces a new priv_data member for structure pci_vdev,
it can be used to find an ivshmem device through pci_vdev.

v2: refine code style

v3: 1) add @pre for ivshmem_mmio_handler function
    2) refine code style

v4: 1) set ivshmem registers default value when vBAR mapping
    2) change find_ivshmem_device to set_ivshmem_device

v5: 1) change set_ivshmem_device to find_and_set_ivshmem_device
    2) add a ASSERT to check if the vdev->priv_data is set successfully

v6: change find_and_set_ivshmem_device to create_ivshmem_device

Tracked-On: #4853

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-19 15:06:15 +08:00
Shuang Zheng
c26ae8c420 hv: Inter-VM communication config for hybrid_rt on whl-ipc-i5
add an IVSHMEM regoin and the related configuration parameters in
hybrid_rt scenario on whl-ipc-i5. The size of the shared memory is
2M, and it is used for the communication between VM0 and VM2.

v6: rename shm name; remove unnecessary MACROs.

v7: rename MACRO for shm name; add unassigned vbdf for post-launched
    VMs.

Tracked-On: #4853

Signed-off-by: Shuang Zheng <shuang.zheng@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-19 15:06:15 +08:00
Yuan Liu
92f9f5a4f3 hv: add ivshmem device
Ivshmem device is used for shared memory based communication between
pre-launched/post-launched VMs.

this patch implements ivshmem device configuration space initialization
and ivshmem device operation methods.

v2: introduce init_one_pcibar interface to simplify BAR initialization
    operation of HV emulated PCI device.

v3: 1) due to init_one_pcibar API is only used for pre-launched VM vdevs
       it can't be applied to all vdevs, so remove it.
    2) move ivshmem BARs initialization to subsequent patch, this patch
       only introduce ivshmem configuration space initialization.

Tracked-On: #4853

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-19 15:06:15 +08:00
Yuan Liu
d6f563c4eb hv: implement ivshmem memory regions initialization
The ivshmem memory regions use the memory of the hypervisor and
they are continuous and page aligned.

this patch is used to initialize each memory region hpa.

v2: 1) if CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED is not defined, the
       entire code of ivshmem will not be compiled.
    2) change ivshmem shared memory unit from byte to page to avoid
       misconfiguration.
    3) add ivshmem configuration and vm configuration references

v3: 1) change CONFIG_IVSHMEM_SHARED_MEMORY_ENABLED to CONFIG_IVSHMEM_ENABLED
    2) remove the ivshmem configuration sample, offline tool provides default
       ivshmem configuration.
    3) refine code style.

v4: 1) make ivshmem_base 2M aligned.

Tracked-On: #4853

Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-19 15:06:15 +08:00
Wei Liu
088cd62d8b HV: sync hv reference code that generated by config tool
Sync hv reference code that generated by acrn-config tool.

Tracked-On: #5092
Signed-off-by: Wei Liu <weix.w.liu@intel.com>
2020-08-17 14:34:30 +08:00
Junming Liu
23d9c13c41 hv:cpuid:refine cpuid_subleaf interface
There's a corner case:
When want to get CPUID.01H:EDX value,
may have the following code snippet:

uint32_t unused,edx;
cpuid_subleaf(0x1U, 0x0U, &unused, &unused, &unused, &edx);

while in cpuid_subleaf:
*eax = leaf;
*ecx = subleaf;
eax and ecx point to the same location,
When deep into asm_cpuid, it's input value will be 0x0U and 0x0U.
but the expected input value is 0x1U and 0x0U.

This case will return CPUID.00H:EDX, which is the wrong answer.

Tracked-On: #4526

Signed-off-by: Junming Liu <junming.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-17 10:14:00 +08:00
Junming Liu
3631a85c3c hv:cpu-caps:refine is_apl_platform func and clean up duplicated code
Fix the bug for "is_apl_platform" func.
"monitor_cap_buggy" is identical to "is_apl_platform", so remove it.

On apl platform:
1) ACRN doesn't use monitor/mwait instructions
2) ACRN disable GPU IOMMU

Tracked-On:#3675

Signed-off-by: Junming Liu <junming.liu@intel.com>
2020-08-14 10:08:50 +08:00
liujunming
538e7cf74d hv:cpu-caps:refine processor family and model info
v3 -> v4:
Refine commit message and code stype

1.
SDM Vol. 2A 3-211 states DisplayFamily = Extended_Family_ID + Family_ID
when Family_ID == 0FH.
So it should be family += ((eax >> 20U) & 0xffU) when Family_ID == 0FH.

2.
IF (Family_ID = 06H or Family_ID = 0FH)
	THEN DisplayModel = (Extended_Model_ID « 4) + Model_ID;
While previous code this logic:
IF (DisplayFamily = 06H or DisplayFamily = 0FH)

Fix the bug about calculation of display family and
display model according to SDM definition.

3. use variable name to distinguish Family ID/Display Family/Model ID/Display Model,
then the code is more clear to avoid some mistake

Tracked-On:#3675

Signed-off-by: liujunming <junming.liu@intel.com>
Reviewed-by: Wu Xiangyang <xiangyang.wu@linux.intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-14 10:08:50 +08:00
Victor Sun
8245145317 HV: remove sanitize_vm_config function
Remove function of sanitize_vm_config() since the processing of sanitizing
will be moved to pre-build process.

When hypervisor has booted, we assume all VM configurations is sanitized;

Tracked-On: #5077

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-12 10:21:17 +08:00
Mingqiang Chi
a67a85c70d hv:refine vm & vcpu lock
-- move vm_state_lock to other place in vm structure
   to avoid the memory waste because of the page-aligned.
-- remove the memset from create_vm
-- explicitly set max_emul_mmio_regions and vcpuid_entry_nr to 0
   inside create_vm to avoid use without initialization.
-- rename max_emul_mmio_regions to nr_emul_mmio_regions
v1->v2:
   add deinit_emul_io in shutdown_vm

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-08-05 13:39:28 +08:00
Victor Sun
a57a4fd7fb HV: Make: enable build for new configs layout
The make command is same as old configs layout:

under acrn-hypervisor folder:
	make hypervisor BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x]

under hypervisor folder:
	make BOARD=xxx SCENARIO=xxx [TARGET_DIR]=xxx [RELEASE=x]

if BOARD/SCENARIO parameter is not specified, the default will be:
	BOARD=nuc7i7dnb SCENARIO=industry

Tracked-On: #5077

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-24 16:16:06 +08:00
Victor Sun
e792fa3d3c HV: nuc7i7dnb example of new VM configuratons layout
There are 3 kinds of configurations in ACRN hypervisor source code: hypervisor
overall setting, per-board setting and scenario specific per-VM setting.
Currently Kconfig act as hypervisor overall setting and its souce is located at
"hypervisor/arch/x86/configs/$(BOARD).config"; Per-board configs are located at
"hypervisor/arch/x86/configs/$(BOARD)" folder; scenario specific per-VM configs
are located at "hypervisor/scenarios/$(SCENARIO)" folder.

This layout brings issues that board configs and VM configs are coupled tightly.
The board specific Kconfig file and misc_cfg.h are shared by all scenarios, and
scenario specific pci_dev.c is shared by all boards. So the user have no way to
build hypervisor binary for different scenario on different board with one
source code repo.

The patch will setup a new VM configurations layout as below:

  misc/vm_configs
  ├── boards                         --> folder of supported boards
  │   ├── <board_1>                  --> scenario-irrelevant board configs
  │   │   ├── board.c                --> C file of board configs
  │   │   ├── board_info.h           --> H file of board info
  │   │   ├── pci_devices.h          --> pBDF of PCI devices
  │   │   └── platform_acpi_info.h   --> native ACPI info
  │   ├── <board_2>
  │   ├── <board_3>
  │   └── <board...>
  └── scenarios                      --> folder of supported scenarios
      ├── <scenario_1>               --> scenario specific VM configs
      │   ├── <board_1>              --> board specific VM configs for <scenario_1>
      │   │   ├── <board_1>.config   --> Kconfig for specific scenario on specific board
      │   │   ├── misc_cfg.h         --> H file of board specific VM configs
      │   │   ├── pci_dev.c          --> board specific VM pci devices list
      │   │   └── vbar_base.h        --> vBAR base info of VM PT pci devices
      │   ├── <board_2>
      │   ├── <board_3>
      │   ├── <board...>
      │   ├── vm_configurations.c    --> C file of scenario specific VM configs
      │   └── vm_configurations.h    --> H file of scenario specific VM configs
      ├── <scenario_2>
      ├── <scenario_3>
      └── <scenario...>

The new layout would decouple board configs and VM configs completely:

The boards folder stores kinds of supported boards info, each board folder
stores scenario-irrelevant board configs only, which could be totally got from
a physical platform and works for all scenarios;

The scenarios folder stores VM configs of kinds of working scenario. In each
scenario folder, besides the generic scenario specific VM configs, the board
specific VM configs would be put in a embedded board folder.

In new layout, all configs files will be removed out of hypervisor folder and
moved to a separate folder. This would make hypervisor LoC calculation more
precisely with below fomula:
	typical LoC = Loc(hypervisor) + Loc(one vm_configs)
which
	Loc(one vm_configs) = Loc(misc/vm_configs/boards/<board>)
		+ LoC(misc/vm_configs/scenarios/<scenario>/<board>)
		+ Loc(misc/vm_configs/scenarios/<scenario>/vm_configurations.c
		+ Loc(misc/vm_configs/scenarios/<scenario>/vm_configurations.h

Tracked-On: #5077

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-24 16:16:06 +08:00
Victor Sun
8bcab8e294 HV: add VM uuid and type for pre-launched RTVM
add VM UUID and CONFIG_XX_VM() api for pre-launched RTVM;

Tracked-On: #5081
Signed-off-by: Victor Sun <victor.sun@intel.com>
2020-07-23 21:58:32 +08:00
Shuo A Liu
112f02851c hv: Disable XSAVE-managed CET state of guest VM
To hide CET feature from guest VM completely, the MSR IA32_MSR_XSS also
need to be intercepted because it comprises CET_U and CET_S feature bits
of xsave/xstors operations. Mask these two bits in IA32_MSR_XSS writing.

With IA32_MSR_XSS interception, member 'xss' of 'struct ext_context' can
be removed because it is duplicated with the MSR store array
'vcpu->arch.guest_msrs[]'.

Tracked-On: #5074
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2020-07-23 20:15:57 +08:00
Shuo A Liu
ac598b0856 hv: Hide CET feature from guest VM
Return-oriented programming (ROP), and similarly CALL/JMP-oriented
programming (COP/JOP), have been the prevalent attack methodologies for
stealth exploit writers targeting vulnerabilities in programs.

CET (Control-flow Enforcement Technology) provides the following
capabilities to defend against ROP/COP/JOP style control-flow subversion
attacks:
 * Shadow stack: Return address protection to defend against ROP.
 * Indirect branch tracking: Free branch protection to defend against
   COP/JOP

The full support of CET for Linux kernel has not been merged yet. As the
first stage, hide CET from guest VM.

Tracked-On: #5074
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2020-07-23 20:15:57 +08:00
Li Fei1
5e605e0daf hv: vmcall: check vm id in dispatch_sos_hypercall
Check whether vm_id is valid in dispatch_sos_hypercall

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-07-23 20:13:20 +08:00
Li Fei1
1859727abc hv: vapci: add tpm2 support for pre-launched vm
On WHL platform, we need to pass through TPM to Secure pre-launched VM. In order
to do this, we need to add TPM2 ACPI Table and add TPM DSDT ACPI table to include
the _CRS.

Now we only support the TPM 2.0 device (TPM 1.2 device is not support). Besides,
the TPM must use Start Method 7 (Uses the Command Response Buffer Interface)
to notify the TPM 2.0 device that a command is available for processing.

Tracked-On: #5053
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-07-23 20:13:20 +08:00
Li Fei1
acc69007e2 hv: mmio_dev: add mmio device pass through support
Add mmio device pass through support for pre-launched VM.
When we pass through a MMIO device to pre-launched VM, we would remove its
resource from the SOS. Now these resources only include the MMIO regions.

Tracked-On: #5053
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-07-23 20:13:20 +08:00
Li Fei1
baf77a79ad hv: mmio_dev: add hypercall to support mmio device pass through
Add two hypercalls to support MMIO device pass through for post-launched VM.
And when we support MMIO pass through for pre-launched VM, we could re-use
the code in mmio_dev.c

Tracked-On: #5053
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-07-23 20:13:20 +08:00
Conghui Chen
821c65b40c hv: fix possible SSE region mismatch issue
During context switch in hypervisor, xsave/xrstore are used to
save/resotre the XSAVE area according to the XCR0 and XSS. The legacy
region in XSAVE area include FPU and SSE, we should make sure the
legacy region be saved during contex switch. FPU in XCR0 is always
enabled according to SDM.
For SSE, we enable it in XCR0 during context switch.

Tracked-On: #5062
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-22 14:19:21 +08:00
Conghui Chen
53d4a7169b hv: remove kick_thread from scheduler module
kick_thread function is only used by kick_vcpu to kick vcpu out of
non-root mode, the implementation in it is sending IPI to target CPU if
target obj is running and target PCPU is not current one; while for
runnable obj, it will just make reschedule request. So the kick_thread
is not actually belong to scheduler module, we can drop it and just do
the cpu notification in kick_vcpu.

Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-22 13:38:41 +08:00
Conghui Chen
b6422f8985 hv: remove 'running' from vcpu structure
vcpu->running is duplicated with THREAD_STS_RUNNING status of thread
object. Introduce an API sleep_thread_sync(), which can utilize the
inner status of thread object, to do the sync sleep for zombie_vcpu().

Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-22 13:38:41 +08:00
Conghui Chen
2abbb99f6a hv: make thread status more accurate
1. Update thread status after switch_in/switch_out.
2. Add 'be_blocking' to represent the intermediate state during
sleep_thread and switch_out. After switch_out, the thread status
update to THREAD_STS_BLOCKED.

Tracked-On: #5057
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-22 13:38:41 +08:00
Mingqiang Chi
aa89eb3541 hv:add per-vm lock for vm & vcpu state change
-- replace global hypercall lock with per-vm lock
-- add spinlock protection for vm & vcpu state change

v1-->v2:
   change get_vm_lock/put_vm_lock parameter from vm_id to vm
   move lock obtain before vm state check
   move all lock from vmcall.c to hypercall.c

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-20 11:22:17 +08:00
yuhong.tao@intel.com
eb36337622 HV: vdev passthough hidding SRIOV
Support hide SRIOV extend capability for passthough device

Tracked-On: #5041
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
2020-07-16 17:27:18 +08:00
Li Fei1
80c7da8f1c hv: vioapic: expose ioapic to guest unconditionally
Some OSes assume the platform must have the IOAPIC. For example:
Linux Kernel allocates IRQ force from GSI (0 if there's no PIC and IOAPIC) on x86.
And it thinks IRQ 0 is an architecture special IRQ, not for device driver. As a
result, the device driver may goes wrong if the allocated IRQ is 0 for RTVM.

This patch expose vIOAPIC to RTVM with LAPIC passthru even though the RTVM can't
use IOAPIC, it servers as a place holder to fullfil the guest assumption.

After vIOAPIC has exposed to guest unconditionally, the 'ready' field could be
removed since we do vIOAPIC initialization for each guest.

Tracked-On: #4691
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-07-10 19:33:46 +08:00
Mingqiang Chi
3b120807c9 hv:rename vioapic.mtx to vioapic.lock
rename vioapic.mtx to vioapic.lock

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-07-02 09:40:29 +08:00
Li Fei1
82f9233d4a hv: vpci: a minor fix about is_zombie_vf
Now we check whether a device is zombie by the ->user != NULL.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-06-21 12:07:15 +08:00
Mingqiang Chi
1b84741a56 rename vm_lock/vlapic_state in VM structure
rename:
   vlapic_state-->vlapic_mode
   vm_lock -->  vlapic_mode_lock
   check_vm_vlapic_state --> check_vm_vlapic_mode

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-06-19 16:13:20 +08:00
Mingqiang Chi
0bd6555cab remove pci_device_lock in pci.c
-- remove unnecessary lock in pci_mmcfg_read_cfg and
   pci_mmcfg_write_cfg since the mmio operation is atomic
   if the offest is aligned with 1/2/4 bytes.
-- move pci_is_valid_access to pci.h

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-06-19 16:13:20 +08:00
Mingqiang Chi
d0a4052518 remove dead code in io.h
remove thess APIs:
   set64
   set32
   set16
   set8

Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-06-19 16:13:20 +08:00
Conghui Chen
2a4c59db74 hv: add check for BASIC VMX INFORMATION
Check bit 48 in IA32_VMX_BASIC MSR, if it is 1, return error, as we only
support Intel 64 architecture.

SDM:
Appendix A.1 BASIC VMX INFORMATION

Bit 48 indicates the width of the physical addresses that may be used for the
VMXON region, each VMCS, anddata structures referenced by pointers in a
VMCS (I/O bitmaps, virtual-APIC page, MSR areas for VMX transitions). If
the bit is 0, these addresses are limited to the processor’s
physical-address width.2 If the bit is 1, these addresses are limited to
32 bits. This bit is always 0 for processors that support Intel 64
architecture.

Tracked-On: #4956
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
2020-06-18 14:05:56 +08:00
Qian Wang
882c9d5d76 HV: refine pci_find_vdev with hash
hv: pci: refine pci_find_vdev with hash

1. Refined pci_find_vdev with BDF-hashing for better performance

Tracked-On: #4857
Signed-off-by: Wang Qian <qian1.wang@intel.com>
Reviewed-by: Li Fei <Fei1.Li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-18 12:58:40 +08:00
Qian Wang
8fb8d81935 HV: refine pci_lookup_drhd_for_pbdf with hash
hv: pci: refine pci_lookup_drhd_for_pbdf with hash

1. Added an auxiliary function pci_find_pdev using hash to find pdev
with pbdf, thus pci_lookup_drhd_for_pbdf will have a better performance

Tracked-On: #4857
Signed-off-by: Wang Qian <qian1.wang@intel.com>
Reviewed-by: Li Fei <Fei1.Li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-18 12:58:40 +08:00
Qian Wang
f58bf1f03f HV: rename pci_pdev_array to pci_pdevs
hv: pci: rename pci_pdev_array to pci_pdevs to make it clearer

Tracked-On: #4857
Signed-off-by: Wang Qian <qian1.wang@intel.com>
Reviewed-by: Li Fei <Fei1.Li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-18 12:58:40 +08:00
Binbin Wu
6be27cdcab hv: vmsi: add vmsix on msi emulation support
Some passthrough devices require multiple MSI vectors, but don't
support MSI-X. In meanwhile, Linux kernel doesn't support continuous
vector allocation.
On native platform, this issue can be mitigated by IOMMU via interrupt
remapping. However, on ACRN, there is no vIOMMU.
vMSI-X on MSI emulation is one solution to mitigate this problem on ACRN.

This patch adds MSI-X emulation on MSI capability.
For the device needs to do MSI-X emulation, HV will hide MSI capability
and present MSI-X capability to guest.

The guest driver may need to modify to reqeust MSI-X vector.
For example:
        ret = pci_alloc_irq_vectors(pdev, 1, STMMAC_MSI_VEC_MAX,
-                                   PCI_IRQ_MSI);
+                                   PCI_IRQ_MSI | PCI_IRQ_MSIX);

To enable MSI-X emulation, the device should:
- 1. The device should be in vmsix_on_msi_devs array.
- 2. Support MSI, but don't support MSI-X.
- 3. MSI capability should support per-vector mask.
- 4. The device should have an unused BAR.
- 5. The device driver should not rely on PBA for functionality.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Binbin Wu
da1788c9a3 hv: vtd: add an API to reserve continuous irtes
dmar_reserve_irte is added to reserve N coutinuous IRTEs.
N could be 1, 2, 4, 8, 16, or 32.

The reserved IRTEs will not be freed.

Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Binbin Wu
7bfcc673a6 hv: ptirq: associate an irte with ptirq_remapping_info entry
For a ptirq_remapping_info entry, when build IRTE:
- If the caller provides a valid IRTE, use the IRET
- If the caller doesn't provide a valid IRTE, allocate a IRET when the
entry doesn't have a valid IRTE, in this case, the IRET will be freed
when free the entry.

Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Binbin Wu
2fe4280cfa hv: vtd: add two paramters for dmar_assign_irte
idx_in:
- If the caller of dmar_assign_irte passes a valid IRTE index, it will
be resued;
- If the caller of dmar_assign_irte passes INVALID_IRTE_ID as IRTE index,
the function will allocate a new IRTE.

idx_out:
This paramter return the actual index of IRTE used. The caller need to
check whether the return value is valid or not.

Also this patch adds an internal function alloc_irte.
The function takes count as input paramter to allocate continuous IRTEs.
The count can only be 1, 2, 4, 8, 16 or 32.
This is prepared for multiple MSI vector support.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-16 08:52:56 +08:00
Li Fei1
65e4a16e6a hv: mmu: release 1GB cpu side support constrain
There're some platforms still doesn't support 1GB large page on CPU side.
Such as lakefield, TNT and EHL platforms on which have some silicon bug and
this case CPU don't support 1GB large page.

This patch tries to release this constrain to support more hardware platform.

Note this patch doesn't release the constrain on IOMMU side.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-06-15 15:16:34 +08:00
Binbin Wu
c907a820df hv: config: add msix emulation support
The information needed to enable MSI-x emulation.
Only enable MSI-x emuation for the devices in msix_emul_devs array.
Currently, only EHL has the need to enable MSI-x emulation for TSN
devices.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-10 14:32:15 +08:00
Victor Sun
db690e0967 HV: enable multiboot module string as kernel bootargs
Previously the VM kernel bootargs for pre-launched VMs and direct boot mode
of SOS VM are built-in hypervisor binary so end users have no way to change
it. Now we provide another option that the multiboot module string could be
used as bootargs also. This would bring convenience to end users when they
use GRUB as bootloader because the bootargs could be configurable in GRUB
menu.

The usage is if there is any string follows configured kernel_mod_tag in
module string, the string will be used as new kernel bootargs instead of
built-in kernel bootargs. If there is no string follows kernel_mod_tag,
then the built-in bootargs will be the default kernel bootargs.

Please note kernel_mod_tag must be the first word in module string in any
case, it is used to specify the module for which VM.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Victor Sun
80262f0602 HV: rename append_seed_arg to fill_seed_arg
Previously append_seed_arg() just do fill in seed arg to dest cmd buffer,
so rename the api name to fill_seed_arg().

Since fill_seed_arg() will be called in SOS VM path only, the param of
bool vm_is_sos is not needed and will be replaced by dest buffer size.

The seed_args[] which used by fill_seed_arg() is pre-defined as all-zero,
so memset() is not needed in fill_seed_arg(), buffer pointer check
and strncpy_s() are not needed also.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Victor Sun
47d20f37e1 HV: replace merge_cmdline api with strncat_s
Add a standard string api strncat_s() to replace merge_cmdline() to make code
more readable.

Another change is that the multiboot cmdline will be appended to the end of
configured SOS bootargs instead of the beginning, this would enable a feature
that some kernel cmdline paramter items could be overriden by multiboot cmdline
since the later one would win if same parameters configured in kernel cmdline.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Victor Sun
bad12039c6 HV: rewrite strncpy_s to be iso c11 compliant
Per C11 standard (ISO/IEC 9899:2011): K.3.7.1.4

1. Copying shall not take place between objects that overlap;
2. If there is a runtime-constraint violation, the strncpy_s function sets
   s1[0] to '\0\;
3. The strncpy_s function returns zero if there was no runtime-constraint
   violation. Otherwise, a nonzero value is returned.
4. The function is implemented with memcpy_s() because the runtime-constraint
   detection is almost same.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Victor Sun
e254be150a HV: rewrite memcpy_s to be iso c11 compliant
Per C11 standard (ISO/IEC 9899:2011): K.3.7.1.1

1. Copying shall not take place between objects that overlap;
2. If there is a runtime-constraint violation, the memcpy_s function stores
   zeros in the first s1max characters of the object;
3. The memcpy_s function returns zero if there was no runtime-constraint
   violation. Otherwise, a nonzero value is returned.

Tracked-On: #4885

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-08 13:30:04 +08:00
Li Fei1
c4b5af9663 hv: pci: remove some unnecessary functions
We define some functions to read some fields of the CFG header registers. We
could remove them since they're not necessary since calling pci_pdev_read_cfg
is simple.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-06-05 16:53:33 +08:00
Li Fei1
ae4fa40adc hv: vpci: hv: vpci: refine pci device assignment logic
Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config.
So For UOS, we always emualte Host Bridge and PCI Bridge for it and assign PCI device
to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI
Bridge to it directly, otherwise, we will emulate them same as UOS.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-03 22:00:43 +08:00
Li Fei1
b8f151a55f hv: pci: check whether a PCI device is host bridge or not by class
According PCI Code and ID Assignment Specification Revision 1.11, a PCI device
whose Base Class is 06h and Sub-Class is 00h is a Host bridge.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-03 22:00:43 +08:00
Li Fei1
0bd2daf1c5 hv: pci: remove host bridge BDF definition
We should check whether a PCI device is host bridge or not by Base Class (06h)
and Sub-Class (00h).

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-06-03 22:00:43 +08:00
Vijay Dhanraj
d03df0c7e2 HV: Fix MP Init sequence hang by adding a delay
As per the BWG a delay should be provided between the
INIT IPI and Startup IPI. Without the delay observe hangs
on certain platforms during MP Init sequence. So Setting
a delay of 10us between assert INIT IPI and Startup IPI.

Also, as per SDM section 10.7 the the de-assert INIT IPI is
only used for Pentium and P6 processors. This is not applicable
for Pentium4 and Xeon processors so removing this sequence.

Tracked-On: #4835
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-27 13:34:59 +08:00
Binbin Wu
3009d9399f hv: vtd: cleanup snoop control related code
Snoop control will not be turned on by hypervisor, delete snoop control
related code.

Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-27 11:27:42 +08:00
Shuo A Liu
9a15ea82ee hv: pause all other vCPUs in same VM when do wbinvd emulation
Invalidate cache by scanning and flushing the whole guest memory is
inefficient which might cause long execution time for WBINVD emulation.
A long execution in hypervisor might cause a vCPU stuck phenomenon what
impact Windows Guest booting.

This patch introduce a workaround method that pausing all other vCPUs in
the same VM when do wbinvd emulation.

Tracked-On: #4703
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-21 15:21:29 +08:00
Mingqiang Chi
f994b5ffaf hv:cleanup vcpu state
-- remove VCPU_PAUSED and resume_vcpu
-- remove vcpu->prev_state in vcpu structure
-- rename pause_vcpu to zombie_vcpu

Tracked-On: #4320
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-05-21 15:08:49 +08:00
Li Fei1
53af096726 hv: ptirq: refine find_ptirq_entry by hashing
Refine find_ptirq_entry by hashing instead of walk each of the PTIRQ entries one by one.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-20 16:04:16 +08:00
Yonghua Huang
63c019c6d2 hv: Add 64 bits hash function
This patch adds hash function to hash 64bit value.

Tracked-On: #4550
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-20 16:04:16 +08:00
Yonghua Huang
3391bffb27 hv:fix rtvm hang with maxcpus=0/1 in bootargs
RTVM (with lapic PT) boots hang when maxcpus is
 assigned a value less than the CPU number configured
 in hypervisor.

 In this case, vlapic_state(per VM) is left in TRANSITION
 state after BSP boot, which blocks interupts to be injected
 to this UOS.

Tracked-On: #4803
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Li, Fei <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-15 10:09:13 +08:00
Li Fei1
27a66acd0e hv: ptdev: refine look up MSI ptirq entry
There's no need to look up MSI ptirq entry by virtual SID any more since the MSI
ptirq entry would be removed before the device is assigned to a VM.

Now the logic of MSI interrupt remap could simplify as:
1. Add the MSI interrupt remap first;
2. If step is already done, just do the remap part.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
2020-05-13 14:31:01 +08:00
Li Fei1
f9d26a80ed hv: vpci: refine vpci deinit
The existing code do separately for each VM when we deinit vpci of a VM. This is
not necessary. This patch use the common handling for all VMs: we first deassign
it from the (current) user, then give it back to its parent user.

When we deassign the vdev from the (current) user, we would de-initialize the
vMSI/VMSI-X remapping, so does the vMSI/vMSI-X data structure.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-13 14:31:01 +08:00
Li Fei1
15e3062631 hv: vpci: remove is_own_device()
Now we could know a device status by 'user' filed, like

---------------------------------------------------------------------------
           | NULL              | == vdev           | != NULL && != vdev
vdev->user | device is de-init | used by itself VM | assigned to another VM
---------------------------------------------------------------------------

So we don't need to modify 'vpci' field accordingly.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-13 14:31:01 +08:00
Li Fei1
af8329394b hv: vpci: minor refine the vdev ownership data structure
Add a new field 'parent_user' to record the parent user of the vdev.  And refine
'new_owner' to 'user' to record who is the current user of the vdev. Like

-----------------------------------------------------------------------------------------------
vdev in    |   HV       |   pre-VM       |               SOS                   | post-VM
           |            |                |vdev used by SOS|vdev used by post-VM|
-----------------------------------------------------------------------------------------------
parent_user| NULL(HV)   |   NULL(HV)     |   NULL(HV)     |   NULL(HV)         | vdev in SOS
-----------------------------------------------------------------------------------------------
user       | vdev in HV | vdev in pre-VM |   vdev in SOS  |   vdev in post-VM  | vdev in post-VM
-----------------------------------------------------------------------------------------------

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
2020-05-13 14:31:01 +08:00
Zide Chen
0a956c34c7 hv: add a new field cpu_affinity in struct acrn_vm
For post-launched VMs, the configured CPU affinity could be different
from the actual running CPU affinity. This new field acrn_vm->cpu_affinity
recognizes this difference so that it's possible that CREATE_VM
hypercall won't overwrite the configured CPU afifnity.

Change name cpu_affinity_bitmap in acrn_vm_config to cpu_affinity.
This is read-only in run time, never overwritten by acrn-dm.

Remove vm_config->vcpu_num, which means the number of vCPUs of the
configured CPU affinity. This is not to be confused with the actual
running vCPU number: vm->hw.created_vcpus.

Changed get_vm_bsp_pcpu_id() to get_configured_bsp_pcpu_id() for less
confusion.

Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-08 11:04:31 +08:00
Yan, Like
869ccb7ba8 HV: RDT: add CDP support in ACRN
CDP is an extension of CAT. It enables isolation and separate prioritization of
code and data fetches to the L2 or L3 cache in a software configurable manner,
depending on hardware support.

This commit adds a Kconfig switch "CDP_ENABLED" which depends on "RDT_ENABLED".
CDP will be enabled if the capability available and "CDP_ENABLED" is selected.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-05-08 08:50:13 +08:00
Yan, Like
277c668b04 HV: RDT: clean up RDT code
This commit makes some RDT code cleanup, mainling including:
- remove the clos_mask and mba_delay validation check in setup_res_clos_msr(), the check will be done in pre-build;
- rename platform_clos_num to valid_clos_num, which is set as the minimal clos_mas of all enabled RDT resouces;
- init the platform_clos_array in the res_cap_info[] definition;
- remove the unnecessary return values and return value check.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
2020-05-08 08:50:13 +08:00
Yan, Like
f774ee1fba HV: RDT: merge struct rdt_cache and rdt_membw in to a union
A RDT resource could be CAT or MBA, so only one of struct rdt_cache and struct rdt_membw
would be used at a time. They should be a union.
This commit merge struct rdt_cache and struct rdt_membw in to a union res.

Tracked-On: #4604
Signed-off-by: Yan, Like <like.yan@intel.com>
Reviewed-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com
2020-05-08 08:50:13 +08:00
Li Fei1
0c6b3e57d6 hv: ptdev: minor refine about ptirq_build_physical_msi
The virtual MSI information could be included in ptirq_remapping_info structrue,
there's no need to pass another input paramater for this puepose. So we could
remove the ptirq_msi_info input.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-05-06 11:51:11 +08:00
Li Fei1
067b439e69 hv: irq: minor refine about structure idt_64_descriptor
The 'value' field in structure idt_64_descriptor is no one used. We could remove it.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-04-26 10:48:49 +08:00
Li Fei1
907a0f7c04 hv: vioapic: minor refine about vioapic_init
Most code in the if ... else is duplicated. We could put it out of the
conditional statement.

Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2020-04-24 15:35:38 +08:00
Zide Chen
3691e305c0 hv: dynamically configure CPU affinity through hypercall
- add a new member cpu_affinity to struct acrn_create_vm, so that acrn-dm
  is able to assign CPU affinity through HC_CREATE_VM hypercall.

- if vm_create.cpu_affinity is zero, hypervisor launches the VM with the
  statically configured CPU affinity.

Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-23 09:38:54 +08:00
Zide Chen
9150284ca7 hv: replace vcpu_affinity array with cpu_affinity_bitmap
Currently the vcpu_affinity[] array fixes the vCPU to pCPU mapping.
While the new cpu_affinity_bitmap doesn't explicitly sepcify this
mapping, instead, it implicitly assumes that vCPU0 maps to the pCPU
with lowest pCPU ID, vCPU1 maps to the second lowest pCPU ID, and
so on.

This makes it possible for post-launched VM to run vCPUs on a subset of
these pCPUs only, and not all of them.

acrn-dm may launch post-launched VMs with the current approach: indicate
VM UUID and hypervisor launches all VCPUs from the PCPUs that are masked
in cpu_affinity_bitmap.

Also acrn-dm can choose to launch the VM on a subset of PCPUs that is
defined in cpu_affinity_bitmap. In this way, acrn-dm must specify the
subset of PCPUs in the CREATE_VM hypercall.

Additionally, with this change, a guest's vcpu_num can be easily calculated
from cpu_affinity_bitmap, so don't assign vcpu_num in vm_configuration.c.

Tracked-On: #4616
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2020-04-23 09:38:54 +08:00
Victor Sun
9264f51456 HV: refine usage of idle=halt in sos cmdline
The parameter of "idle=halt" for SOS cmdline is only needed when cpu sharing
is enabled, otherwise it will impact SOS power.

Tracked-On: #4329

Signed-off-by: Victor Sun <victor.sun@intel.com>
2020-04-22 14:49:04 +08:00