Commit Graph

1054 Commits

Author SHA1 Message Date
Conghui Chen
e61412981d hv: support xsave in context switch
xsave area:
    legacy region: 512 bytes
    xsave header: 64 bytes
    extended region: < 3k bytes

So, pre-allocate 4k area for xsave. Use certain instruction to save or
restore the area according to hardware xsave feature set.

Tracked-On: #4166
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-12-02 09:31:12 +08:00
Li Fei1
6ee076f7df hv: assign: rename ptirq_msix_remap to ptirq_prepare_msix_remap
ptirq_msix_remap doesn't do the real remap, that's the vmsi_remap and vmsix_remap_entry
does. ptirq_msix_remap only did the preparation.

Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-11-29 08:53:07 +08:00
Binbin Wu
fa3888c12a hv: ept: disable execute right on large pages
Issue description:
-----------------
Machine Check Error on Page Size Change
Instruction fetch may cause machine check error if page size
and memory type was changed without invalidation on some
processors[1][2]. Malicious guest kernel could trigger this issue.

This issue applies to both primary page table and extended page
tables (EPT), however the primary page table is controlled by
hypervisor only. This patch mitigates the situation in EPT.

Mitigation details:
------------------
Implement non-execute huge pages in EPT.
This patch series clears the execute permission (bit 2) in the
EPT entries for large pages. When EPT violation is triggered by
guest instruction fetch, hypervisor converts the large page to
smaller 4 KB pages and restore the execute permission, and then
re-execute the guest instruction.

The current patch turns on the mitigation by default.
The follow-up patches will conditionally turn on/off the feature
per processor model.

[1] Refer to erratum KBL002 in "7th Generation Intel Processor
Family and 8th Generation Intel Processor Family for U Quad Core
Platforms Specification Update"
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf
[2] Refer to erratum SKL002 in "6th Generation Intel Processor
Family Specification Update"
https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html

Tracked-On: #4101
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-11-13 08:00:36 +08:00
Peter Fang
b7329f10a5 hv: instr_emul: use cs segment when fetching instructions
In non-64-bit mode, CS segment base address should be considered when
determining the linear address of the vcpu's instruction pointer. Use
vie_calculate_gla() for instruction address translation which also takes
care of 64-bit mode.

Tracked-On: #4064
Signed-off-by: Peter Fang <peter.fang@intel.com>
2019-11-11 13:55:24 +08:00
Mingqiang Chi
8666ba6c01 hv:remove unnecessary wrapper for emulate_instruction
remove unnecessary wrapper for this api(emulate_instruction)

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-11-09 11:43:37 +08:00
Yonghua Huang
e51386fe04 hv: refine 'uint64_t' string print format in x86 moudle
Use "0x%lx" string to format 'uint64_t' type value,
 instead of "0x%llx".

Tracked-On: #4020
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2019-11-09 11:42:38 +08:00
Victor Sun
3411f00b5b HV: fix misra violation on platform clos array
MISRA C requires specified bounds for arrays declaration, previous declaration
of platform_clos_array in board.h does not meet the requirement.

Tracked-On: #3987

Signed-off-by: Victor Sun <victor.sun@intel.com>
2019-11-08 16:40:14 +08:00
Conghui Chen
75f512ce8c hv: rename vuart operations
fifo_reset -> reset_fifo
vuart_fifo_init -> init_fifo
vuart_setup - > setup_vuart
vuart_init -> init_vuart
vuart_deinit -> deinit_vuart
vuart_lock_init -> init_vuart_lock
vuart_lock -> obtain_vuart_lock
vuart_unlock -> release_vuart_lock
vuart_deinit_connect -> vuart_deinit_connection

Tracked-On: #4017
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-11-08 09:01:01 +08:00
Kaige Fu
20c1ad1b3a HV: correct the formatting flag of hypcall_id
hypcall_id has a type of uint64_t and should use 'llx' as
formatting flag instead of '%d'. Otherwise, we will get a
confusing error log when not-allowed hypercall occurs.

Without this patch:
[96707209us][cpu=1][sev=3][seq=2386]:hypercall -2147483548 is only allowed from SOS_VM!

With this patch:
[84613395us][cpu=1][sev=3][seq=2136]:hypercall 0x80000064 is only allowed from SOS_VM!

So, we can figure out which not-allowed hypercall has been triggered more conveniently.

BTW, this patch adds hypcall_id which triggered from non-ring0 into error log.

Tracked-On: #4012
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-11-07 15:01:21 +08:00
Li Fei1
620a1c5215 hv: mmu: rename e820 to hv_e820
Now the e820 structure store ACRN HV memory layout, not the physical memory layout.
Rename e820 to hv_hv_e820 to show this explicitly.

Tracked-On: #4007
Signed-off-by: Li Fei1 <fei1.li@intel.com>
2019-11-07 08:47:02 +08:00
Yonghua Huang
8227804b09 hv:Unmap AP trampoline region from service VM's EPT
AP trampoline code should be accessible
 to hypervisor only, this patch is to unmap
 this region from service VM's EPT for security
 reason.

Tracked-On: #3992
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-11-05 15:14:13 +08:00
Kaige Fu
c22f899a5e HV: Fix poweroff issue of hard RTVM
We should use INIT signal to notify the vcpu threads when
powering off the hard RTVM. To achive this, we should set
the vcpu->thread_obj.notify_mode as SCHED_NOTIFY_INIT.

Patch (27163df9 hv: sched: add sleep/wake for thread object)
tries to set the notify_mode according `is_lapic_pt_enabled(vcpu)`
in function prepare_vcpu. But at this point, the is_lapic_pt_enabled(vcpu)
will always return false. Consequently, it will set notify_mode
as SCHED_NOTIFY_IPI. Then leads to the failure of powering off
hard RTVM.

This patch fixes it by:
  - Initialize the notify_mode as SCHED_NOTIFY_IPI in prepare_vcpu.
  - Set notify_mode as SCHED_NOTIFY_INIT after guest is trying to
    enable x2apic mode of passthru lapic.

Tracked-On: #3975
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-11-04 10:28:16 +08:00
Li, Fei1
9d26dab6d6 hv: mmio: add a lock to protect mmio_node access
After adding PCI BAR remap support, mmio_node may unregister when there's others
access it. This patch add a lock to protect mmio_node access.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-11-01 14:44:11 +08:00
Huihuang Shi
5d662ea11f hv: fixed by replace ull to ul.
ul is used as immediate integer suffix with type uint64_t.

Tracked-On: #3214
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
2019-10-31 09:02:59 +08:00
Li, Fei1
2c158d5ad4 hv: io: add unregister_mmio_emulation_handler API
Since guest could re-program PCI device MSI-X table BAR, we should add mmio
emulation handler unregister.
However, after add unregister_mmio_emulation_handler API, emul_mmio_regions
is no longer accurate. Just replace it with max_emul_mmio_regions which records
the max index of the emul_mmio_node.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-10-29 14:49:55 +08:00
Li, Fei1
dc1e2adaec hv: vpci: add PCI BAR re-program address check
In theory, guest could re-program PCI BAR address to any address. However, ACRN
hypervisor only support [0, top_address_space) EPT memory mapping. So we need to
check whether the PCI BAR re-program address is within this scope.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-29 14:49:55 +08:00
Shuo A Liu
5f8e7a6cb7 hv: sched: add kick_thread to support notification
kick means to notify one thread_object. If the target thread object is
running, send a IPI to notify it; if the target thread object is
runnable, make reschedule on it.

Also add kick_vcpu API in vcpu layer to notify vcpu.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-25 13:00:21 +08:00
Shuo A Liu
f04c491259 hv: sched: decouple scheduler from schedule framework
This patch decouple some scheduling logic and abstract into a scheduler.
Then we have scheduler, schedule framework. From modulization
perspective, schedule framework provides some APIs for other layers to
use, also interact with scheduler through scheduler interaces.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-25 13:00:21 +08:00
Yonghua Huang
2e62ad9574 hv[v2]: remove registration of default port IO and MMIO handlers
- The default behaviors of PIO & MMIO handlers are same
   for all VMs, no need to expose dedicated APIs to register
   default hanlders for SOS and prelaunched VM.

Tracked-On: #3904
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2019-10-24 13:21:19 +08:00
Mingqiang Chi
d81872ba18 hv:Change the function parameter for init_ept_mem_ops
Currently the parameter of init_ept_mem_ops is
'struct acrn_vm *vm' for this api,change it to
'struct memory_ops *mem_ops' and 'vm_id' to avoid
the reversed dependency, page.c is hardware layer and vm structure
is its upper-layer stuff.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:48:30 +08:00
Shuo A Liu
0f70a5ca3a hv: sched: decouple idle stuff from schedule module
Let init thread end with run_idle_thread(), then idle thread take over and
start to do scheduling.
Change enter_guest_mode() to init_guest_mode() as run_idle_thread() is removed
out of it. Also add run_thread() in schedule module to run
thread_object's thread loop directly.

rename: switch_to_idle -> run_idle_thread

Tracked-On: #3813
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:47:08 +08:00
Shuo A Liu
27163df9b1 hv: sched: add sleep/wake for thread object
sleep one thread_object means to prevent it from being scheduled.
wake one thread_object is an opposite operation of sleep.
This patch also add notify_mode in thread_object to indicate how to
deliver the request.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:47:08 +08:00
Shuo A Liu
dadcdcefa0 hv: sched: support vcpu context switch on one pcpu
To support cpu sharing, multiple vcpu can run on same pcpu. We need do
necessary vcpu context switch. This patch add below actions in context
switch.
  1) fxsave/fxrstor;
  2) save/restore MSRs: MSR_IA32_STAR, MSR_IA32_LSTAR,
	MSR_IA32_FMASK, MSR_IA32_KERNEL_GS_BASE;
  3) switch vmcs.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:47:08 +08:00
Shuo A Liu
7e66c0d4fa hv: sched: use get_running_vcpu to replace per_cpu vcpu with cpu sharing
With cpu sharing enabled, per_cpu vcpu cannot work properly as we might
has multiple vcpus running on one pcpu.
Add a schedule API sched_get_current to get current thread_object on
specific pcpu, also add a vcpu API get_running_vcpu to get corresponding
vcpu of the thread_object.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:47:08 +08:00
Shuo A Liu
891e46453d hv: sched: move pcpu_id from acrn_vcpu to thread_object
With cpu sharing enabled, we will map acrn_vcpu to thread_object
in scheduling. From modulization perspective, we'd better hide the
pcpu_id in acrn_vcpu and move it to thread_object.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-23 12:47:08 +08:00
Shuo A Liu
f85106d1ed hv: Do not reset vcpu thread's stack when reset_vcpu
vcpu thread's stack shouldn't follow reset_vcpu to reset.
There is also a bug here:
while vcpu B thread set vcpu->running to false, other vcpu A thread
will treat the vcpu B is paused while it has not been switch out
completely, then reset_vcpu will reset the vcpu B thread's stack and
corrupt its running context.

This patch will remove the vcpu thread's stack reset from reset_vcpu.
With the change, we need do init_vmcs between vcpu startup address be
settled and scheduled in. And switch_to_idle() is not needed anymore
as S3 thread's stack will not be reset.

Tracked-On: #3813
Signed-off-by: Fengwei Yin <fengwei.yin@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-10-23 12:47:08 +08:00
Jian Jun Chen
1d194ede61 hv: support reference time enlightenment
Two time related synthetic MSRs are implemented in this patch. Both of
them are partition wide MSR.
- HV_X64_MSR_TIME_REF_COUNT is read only and it is used to return the
  partition's reference counter value in 100ns units.
- HV_X64_MSR_REFERENCE_TSC is used to set/get the reference TSC page,
  a sequence number, an offset and a multiplier are defined in this
  page by hypervisor and guest OS can use them to calculate the
  normalized reference time since partition creation, in 100ns units.

Tracked-On: #3831
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-10-22 10:09:16 +08:00
wenwumax
048155d3d6 hv: support minimum set of TLFS
This patch implements the minimum set of TLFS functionality. It
includes 6 vCPUID leaves and 3 vMSRs.

- 0x40000001 Hypervisor Vendor-Neutral Interface Identification
- 0x40000002 Hypervisor System Identity
- 0x40000003 Hypervisor Feature Identification
- 0x40000004 Implementation Recommendations
- 0x40000005 Hypervisor Implementation Limits
- 0x40000006 Implementation Hardware Features

- HV_X64_MSR_GUEST_OS_ID Reporting the guest OS identity
- HV_X64_MSR_HYPERCALL Establishing the hypercall interface
- HV_X64_MSR_VP_INDEX Retrieve the vCPU ID from hypervisor

Tracked-On: #3832
Signed-off-by: wenwumax <wenwux.ma@intel.com>
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-10-22 10:09:16 +08:00
Mingqiang Chi
292d1a15f9 hv:Wrap some APIs related with guest pm
-- change some APIs to static
-- combine two APIs to init_guest_pm

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-10-21 10:13:02 +08:00
Jian Jun Chen
e1a2ed1727 hv: fix a bug that tpr threshold is not updated
Consider the following case when TPR shadow is used with vlapic
basic mode:
1) 2 interrupts are pending in vlapic. INTa's priority > TPR and
   INTb's priority <= TPR.
2) TPR threshold is set to zero and INTa is injected to guest.
3) Guest set TPR to the priority of INTa.
4) EOI of INTa. PPR is updated to TPR which equals INTa's priority.
   INTb cannot be injected because its priority <= PPR.
5) Guest set TPR to zero. Because TPR threshold is still zero, there is
   no TPR threshold vmexit. But since both TPR and ISRV are zero at
   this time, the PPR is zero as well. INTb still cannot be injected.
   This is a bug.

By adding vcpu_make_request(vlapic->vcpu, ACRN_REQUEST_EVENT) in EOI,
TPR threshold will be updated before vm_resume.

Tracked-On: #3795
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-16 16:40:29 +08:00
Shuo A Liu
de157ab96c hv: sched: remove runqueue from current schedule logic
Currently we are using a 1:1 mapping logic for pcpu:vcpu. So don't need
a runqueue for it. Removing it as preparation work to abstract scheduler
framework.

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-16 10:25:53 +08:00
Shuo A Liu
837e4d8788 hv: sched: rename schedule related structs and vars
prepare_switch_out -> switch_out
prepare_switch_in -> switch_in
prepare_switch -> do_switch
run_thread_t -> thread_entry_t
sched_object -> thread_object
sched_object.thread -> thread_object.thread_entry
sched_obj -> thread_obj
sched_context -> sched_control
sched_ctx -> sched_ctl

Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-10-16 10:25:53 +08:00
Binbin Wu
d19592a33e hv: vmsr: disable prmrr related msrs in vm
PRMRR related MSRs need to be configured by platform BIOS / bootloader.
These settings are not allowed to be changed by guest.
VMs currently have no requirement to access these MSRs even when vSGX is enabled.
So, this patch disables PRMRR related MSRs in VM.

Tracked-On: #3739
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-10-15 15:13:11 +08:00
Mingqiang Chi
de0a5a48d6 hv:remove some unnecessary includes
--remove unnecessary includes
--remove unnecssary forward-declaration for 'struct vhm_request'

Tracked-On: #861
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-10-15 14:40:39 +08:00
Peter Fang
28b50463c9 hv: vm: properly reset pCPUs with LAPIC PT enabled during VM shutdown/reset
When a VM is configured with LAPIC PT mode and its vCPU is in x2APIC
mode, the corresponding pCPU needs to be reset during VM shutdown/reset
as its physical LAPIC was used by its guest.

This commit fixes an issue where this reset never happens.
is_lapic_pt_enabled() needs to be called before reset_vcpu() to be able
to correctly reflect a vCPU's APIC mode.

A vCPU with LAPIC PT mode but in xAPIC mode does not require such reset,
since its physical LAPIC was not touched by its guest directly.

v2 -> v3:
- refine edge case detection logic

v1 -> v2:
- use a separate function to return the bitmap of LAPIC PT enabled pCPUs

Tracked-On: #3708
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Jack Ren <jack.ren@intel.com>
2019-09-29 15:12:25 +08:00
Shiqing Gao
658fff27b4 hv: pci: update "union pci_bdf"
- add one more filed in "union pci_bdf"
- remove following interfaces:
  * pci_bus
  * pci_slot
  * pci_func
  * pci_devfn

Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
2019-09-25 13:45:39 +08:00
Shuo A Liu
2096c43e5c hv: create all VCPUs for guest when create VM
To enable static configuration of different scenarios, we configure VMs
in HV code and prepare all nesserary resources for this VM in create VM
hypercall. It means when we create one VM through hypercall, HV will
read all its configuration and run it automatically.

Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-24 11:58:45 +08:00
Shuo A Liu
9a23ec6b5a hv: remove unused pcpu assignment functions
As we introduced vcpu_affinity[] to assign vcpus to different pcpus, the
old policy and functions are not needed. Remove them.

Tracked-On: #3663
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-24 11:58:45 +08:00
Shuo A Liu
1c526e6d16 hv: use vcpu_affinity[] in vm_config to support vcpu assignment
Add this vcpu_affinity[] for each VM to indicate the assignment policy.
With it, pcpu_bitmap is not needed, so remove it from vm_config.
Instead, vcpu_affinity is a must for each VM.

This patch also add some sanitize check of vcpu_affinity[]. Here are
some rules:
  1) only one bit can be set for each vcpu_affinity of vcpu.
  2) two vcpus in same VM cannot be set with same vcpu_affinity.
  3) vcpu_affinity cannot be set to the pcpu which used by pre-launched VM.

v4: config SDC with CONFIG_MAX_KATA_VM_NUM
v5: config SDC with CONFIG_MAX_PCPU_NUM

Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-09-24 11:58:45 +08:00
Shuo A Liu
f4ce9cc4a2 hv: make hypercall HC_CREATE_VCPU empty
Now, we create vcpus while VM being created in hypervisor. The
create vcpu hypercall will not be used any more. For compatbility,
keep the hypercall HC_CREATE_VCPU do nothing.

v4: Don't remove HC_CREATE_VCPU hypercall, let it do nothing.

Tracked-On: #3663
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-09-24 11:58:45 +08:00
Mingqiang Chi
60adef33d3 hv:move down structures run_context and ext_context
Now the structures(run_context & ext_context) are defined
in vcpu.h,and they are used in the lower-layer modules(wakeup.S),
this patch move down the structures from vcpu.h to cpu.h
to avoid reversed dependency.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-16 14:51:36 +08:00
Mingqiang Chi
4f98cb03a7 hv:move down the structure intr_source
Now the structures(union source & struct intr_source) are defined
in ptdev.h,they are used in vtd.c and assign.c,
vtd is the hardware layer and ptdev is the upper-layer module
from the modularization perspective, this patch move down
these structures to avoid reversed dependency.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-16 14:51:36 +08:00
Shuo A Liu
4742d1c747 hv: ptdev: move softirq_dev_entry_list from vm structure to per_cpu region
Using per_cpu list to record ptdev interrupts is more reasonable than
recording them per-vm. It makes dispatching such interrupts more easier
as we now do it in softirq which happens following interrupt context of
each pcpu.

Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-16 09:36:52 +08:00
Shuo A Liu
2cc45534d6 hv: move pcpu offline request and vm shutdown request from schedule
From modulization perspective, it's not suitable to put pcpu and vm
related request operations in schedule. So move them to pcpu and vm
module respectively. Also change need_offline return value to bool.

Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-09-16 09:36:52 +08:00
Yin Fengwei
6b6aa80600 hv: pm: fix coding style issue
This patch fix the coding style issue introduced by previous two
patches.

Tracked-On: #3564
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
2019-09-11 17:30:24 +08:00
Yin Fengwei
f039d75998 hv: pm: enhencement platform S5 entering operation
Now, we have assumption that SOS control whether the platform
should enter S5 or not. So when SOS tries enter S5, we just
forward the S5 request to native port which make sure platform
S5 is totally aligned with SOS S5.

With higher serverity guest introduced,this assumption is not
true any more. We need to extend the platform S5 process to
handle higher severity guest:
  - For DM launched RTVM, we need to make sure these guests
    is off before put the whole platfrom to S5.

  - For pre-launched VM, there are two cases:
    * if os running in it support S5, we wait for guests off.
    * if os running in it doesn't support S5, we expect it
      will invoke one hypercall to notify HV to shutdown it.
      NOTE: this case is not supported yet. Will add it in the
      future.

Tracked-On: #3564
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
2019-09-11 17:30:24 +08:00
Mingqiang Chi
cd40980d5f hv:change function parameter for invept
change the input parameter from vcpu to eptp in order to let this api
more generic, no need to care normal world or secure world.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-09-05 16:32:30 +08:00
Binbin Wu
cd1ae7a89e hv: cat: isolate hypervisor from rtvm
Currently, the clos id of the cpu cores in vmx root mode is the same as non-root mode.
For RTVM, if hypervisor share the same clos id with non-root mode, the cacheline may
be polluted due to the hypervisor code execution when vmexit.

The patch adds hv_clos in vm_configurations.c
Hypervisor initializes clos setting according to hv_clos during physical cpu cores initialization.
For RTVM,  MSR auto load/store areas are used to switch different settings for VMX root/non-root
mode for RTVM.

Tracked-On: #2462
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-09-05 09:59:13 +08:00
Mingqiang Chi
38ca8db19f hv:tiny cleanup
-- remove some unnecessary includes
-- fix a typo
-- remove unnecessary void before launch_vms

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-09-05 09:58:47 +08:00
Yan, Like
f15a3600ec hv: fix tsc_deadline correctness issue
Fix tsc_deadline issue by trapping TSC_DEADLINE msr write if VMX_TSC_OFFSET is not 0.
Because there is an assupmtion in the ACRN vART design that pTSC_Adjust and vTSC_Adjust are
both 0.
We can leave the TSC_DEADLINE write pass-through without correctness issue becuase there is
no offset between the pTSC and vTSC, and there is no write to vTSC or vTSC_Adjust write observed
in the RTOS so far.
This commit fix the potential correctness issue, but the RT performance will be badly affected
if vTSC or vTSC_Adjust was not zero, which we will address if such case happened.

Tracked-On: #3636
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-09-05 09:58:16 +08:00
dongshen
295701cc55 hv: remove mptable code for pre-launched VMs
Now that ACPI is enabled for pre-launched VMs, we can remove all mptable code.

Tracked-On: #3601
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-08-29 10:12:25 +08:00
dongshen
b447ce3d86 hv: add ACPI support for pre-launched VMs
Statically define the per vm RSDP/XSDT/MADT ACPI template tables in vacpi.c,
RSDP/XSDT tables are copied to guest physical memory after checksum is
calculated. For MADT table, first fix up process id/lapic id in its lapic
subtable, then the MADT table's checksum is calculated before it is copies to
guest physical memory.

Add 8-bit checksum function in util.h

Tracked-On: #3601
Signed-off-by: dongshen <dongsheng.x.zhang@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-08-29 10:12:25 +08:00
Binbin Wu
5c81659713 hv: ept: flush cache for modified ept entries
EPT tables are shared by MMU and IOMMU.
Some IOMMUs don't support page-walk coherency, the cpu cache of EPT entires
should be flushed to memory after modifications, so that the modifications
are visible to the IOMMUs.

This patch adds a new interface to flush the cache of modified EPT entires.
There are different implementations for EPT/PPT entries:
- For PPT, there is no need to flush the cpu cache after update.
- For EPT, need to call iommu_flush_cache to make the modifications visible
to IOMMUs.

Tracked-On: #3607
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
2019-08-26 10:47:17 +08:00
Mingqiang Chi
2310d99ebf hv: cleanup vmcs.h
-- move 'RFLAGS_AC' to cpu.h
-- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h
   and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST'
-- move 'get_vcpu_mode' to vcpu.h
-- remove deadcode 'vmx_eoi_exit()'

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-08-22 14:13:15 +08:00
Mingqiang Chi
bd09f471a6 hv:move some APIs related host reset to pm.c
move some data structures and APIs related host reset
from vm_reset.c to pm.c, these are not related with guest.

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-08-22 14:09:18 +08:00
Yin Fengwei
6beb34c3cb vm_load: update init gdt preparation
Now, we use native gdt saved in boot context for guest and assume
it could be put to same address of guest. But it may not be true
after the pre-launched VM is introduced. The gdt for guest could
be overwritten by guest images.

This patch make 32bit protect mode boot not use saved boot context.
Insteadly, we use predefined vcpu_regs value for protect guest to
initialize the guest bsp registers and copy pre-defined gdt table
to a safe place of guest memory to avoid gdt table overwritten by
guest images.

Tracked-On: #3532

Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-08-20 09:22:20 +08:00
Yonghua Huang
700a37856f hv: remove 'flags' field in struct vm_io_range
Currently, 'flags' is defined and set but never be used
  in the flow of handling i/o request after then.

Tracked-On: #861
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-08-19 10:19:54 +08:00
Yonghua Huang
f791574f0e hv: refine the function pointer type of port I/O request handlers
In the definition of port i/o handler, struct acrn_vm * pointer
 is redundant as input, as context of acrn_vm is aleady linked
 in struct acrn_vcpu * by vcpu->vm, 'vm' is not required as input.

 this patch removes argument '*vm' from 'io_read_fn_t' &
 'io_write_fn_t', use '*vcpu' for them instead.

Tracked-On: #861
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-08-16 11:44:27 +08:00
Jie Deng
866935a53f hv: vcr: check guest cr3 before loading pdptrs
Check whether the address area pointed by the guest
cr3 is valid or not before loading pdptrs. Inject #GP(0)
to guest if there are any invalid cases.

Tracked-On: #3572
Signed-off-by: Jie Deng <jie.deng@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-08-16 11:43:17 +08:00
huihuang.shi
f147c388a5 hv: fix Violations touched ACRN Coding Guidelines
fix violations touched below:
1.Cast operation on a constant value
2.signed/unsigned implicity conversion
3.return value unused.

V1->V2:
1.bitmap api will return boolean type, not need to check "!= 0", deleted.
2.The behaves ~(uint32_t)X and (uint32_t)~X are not defined in ACRN hypervisor Coding Guidelines,
removed the change of it.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
2019-08-15 09:47:11 +08:00
Shiqing Gao
062fe19800 hv: move vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c
This patch moves vmx_rdmsr_pat/vmx_wrmsr_pat from vmcs.c to vmsr.c,
so that these two functions would become internal functions inside
vmsr.c.
This approach improves the modularity.

v1 -> v2:
 * remove 'vmx_rdmsr_pat'
 * rename 'vmx_wrmsr_pat' with 'write_pat_msr'

Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-08-14 10:51:35 +08:00
Li, Fei1
ff54fa2325 hv: vpci: add emulated PCI device configure for SOS
Add emulated PCI device configure for SOS to prepare for add support for customizing
special pci operations for each emulated PCI device.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-08-09 14:19:49 +08:00
Li, Fei1
5471473f60 hv: vpci: create iommu domain in vpci_init for all guests
Create an iommu domain for all guest in vpci_init no matter if there's a PTDev
in it.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
2019-08-06 11:51:02 +08:00
Li, Fei1
eb21f205e4 hv: vm_config: build pci device configure for SOS
Align SOS pci device configure with pre-launched VM and filter pre-launched VM's
PCI PT device from SOS pci device configure.

Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-08-06 11:51:02 +08:00
Victor Sun
901a65cb53 HV: inject exception for invalid vmcall
For non-trusty hypercalls, HV should inject #GP(0) to vCPU if they are
from non-ring0 or inject #UD if they are from ring0 of non-SOS. Also
we should not modify RAX of vCPU for these invalid vmcalls.

Tracked-On: #3497

Signed-off-by: Victor Sun <victor.sun@intel.com>
2019-08-01 16:07:57 +08:00
Victor Sun
363daf6aa2 HV: return extended info in vCPUID leaf 0x40000001
In some case, guest need to get more information under virtual environment,
like guest capabilities. Basically this could be done by hypercalls, but
hypercalls are designed for trusted VM/SOS VM, We need a machenism to report
these information for normal VMs. In this patch, vCPUID leaf 0x40000001 will
be used to satisfy this needs that report some extended information for guest
by CPUID.

Tracked-On: #3498

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-31 14:13:39 +08:00
Kaige Fu
accdadce98 HV: Enable vART support by intercepting TSC_ADJUST MSR
The policy of vART is that software in native can run in
VM too. And in native side, the relationship between the
ART hardware and TSC is:

  pTSC = (pART * M) / N + pAdjust

The vART solution is:
  - Present the ART capability to guest through CPUID leaf
    15H for M/N which identical to the physical values.
  - PT devices see the pART (vART = pART).
  - Guest expect: vTSC = vART * M / N + vAdjust.
  - VMCS.OFFSET = vTSC - pTSC = vAdjust - pAdjust.

So to support vART, we should do the following:
  1. if vAdjust and vTSC are changed by guest, we should change
     VMCS.OFFSET accordingly.
  2. Make the assumption that the pAjust is never touched by ACRN.

For #1, commit "a958fea hv: emulate IA32_TSC_ADJUST MSR" has implementation
it. And for #2, acrn never touch pAdjust.

--
 v2 -> v3:
   - Add comment when handle guest TSC_ADJUST and TSC accessing.
   - Initialize the VMCS.OFFSET = vAdjust - pAdjust.

 v1 -> v2
   Refine commit message to describe the whole vART solution.

Tracked-On: #3501
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-31 13:29:51 +08:00
Li, Fei1
4a27d08360 hv: schedule: schedule to idel after SOS resume form S3
After "commit f0e1c5e init vcpu host stack when reset vcpu", SOS resume form S3
wants to schedule to vcpu_thread not the point where SOS enter S3. So we should
schedule to idel first then reschedule to execute vcpu_thread.

Tracked-On: #3387
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-07-29 09:53:18 +08:00
Yonghua Huang
49e60ae151 hv: refine handler to 'rdpmc' vmexit
PMC is hidden from guest and hypervisor should
 inject UD to guest when 'rdpmc' vmexit.

Tracked-On: #3453
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-07-24 15:05:46 +08:00
Victor Sun
a7b6fc74e5 HV: allow write 0 to MSR_IA32_MCG_STATUS
Per SDM, writing 0 to MSR_IA32_MCG_STATUS is allowed, HV should not
return -EACCES on this case;

Tracked-On: #3454

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-23 15:24:50 +08:00
Yonghua Huang
dde20bdb03 HV:refine the handler for 'invept' vmexit
'invept' is not expected in guest and hypervisor should
inject UD when 'invept' VM exit happens.

Tracked-On: #3444
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
2019-07-22 13:23:47 +08:00
Yin Fengwei
f0e1c5e55f vcpu: init vcpu host stack when reset vcpu
Otherwise, the previous local variables in host stack is not reset.

Tracked-On: #3387
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-07-22 09:55:06 +08:00
Yin, Fengwei
11e67f1c4a softirq: move softirq from hv_main to interrupt context
softirq shouldn't be bounded to vcpu thread. One issue for this
is shell (based on timer) can't work if we don't start any guest.

This change also is trying best to make softirq handler running
with irq enabled.

Also update the irq disable/enabel in vmexit handler to align
with the usage in vcpu_thread.

Tracked-On: #3387
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-07-22 09:55:06 +08:00
Yan, Like
a4abeaf980 hv: enforce no interrupt to RT VM via vlapic once lapic pt
Because we depend on guest OS to switch x2apic mode to enable lapic pass-thru, vlapic is working at the early stage of booting, eg: in virtual boot loader.
After lapic pass-thru enabled, no interrupt should be injected via vlapic any more.

This commit resets the vlapic to clear the pending status and adds ptapic_ops to enforce that no more interrupt accepted/injected via vlapic.

Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
2019-07-19 16:47:06 +08:00
Yan, Like
97f6097f04 hv: add ops to vlapic structure
This commit adds ops to vlapic structure, and add an *ops parameter to vlapic_reset().
At vlapic reset, the ops is set to the global apicv_ops, and may be assigned
to other ops later.

Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-19 16:47:06 +08:00
Li, Fei1
b39526f759 hv: schedule: vCPU schedule state setting don't need to be atomic
vCPU schedule state change is under schedule lock protection. So there's no need
to be atomic.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-17 09:20:54 +08:00
Li, Fei1
8af334cbb2 hv: vcpu: operation in vcpu_create don't need to be atomic
For pre-launched VMs and SOS, vCPUs are created on BSP one by one; For post-launched VMs,
vCPUs are created under vmm_hypercall_lock protection. So vcpu_create is called sequentially.
Operation in vcpu_create don't need to be atomic.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-07-17 09:20:54 +08:00
Li, Fei1
540841ac5d hv: vlapic: EOI exit bitmap should set or clear atomically
For per-vCPU, EOI exit bitmap is a global parameter which should set or clear
atomically since there's no lock to protect this critical variable.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-07-17 09:20:54 +08:00
Li, Fei1
e69b3dcf67 hv: schedule: remove runqueue_lock in sched_context
Now sched_object and sched_context are protected by scheduler_lock. There's no
chance to use runqueue_lock to protect schedule runqueue if we have no plan to
support schedule migration.

Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-07-17 09:20:54 +08:00
Tianhua Sun
2d4809e3b1 hv: fix some potential array overflow risk
'pcpu_id' should be less than CONFIG_MAX_PCPU_NUM,
else 'per_cpu_data' will overflow. This commit fixes
this potential overflow issue.

Tracked-On: #3397
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
2019-07-12 09:41:15 +08:00
Huihuang Shi
304ae38161 HV: fix "use -- or ++ operations"
ACRN coding guidelines banned -- or ++ operations.
V1->V2:
  add comments to struct stack_frame

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
2019-07-12 09:26:15 +08:00
Huihuang Shi
9063504bc8 HV: ve820 fix "Casting operation to a pointer"
ACRN Coding guidelines requires two different types pointer can't
convert to each other, except void *.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
2019-07-11 13:57:21 +08:00
Huihuang Shi
714162fb8b HV: fix violations touched type conversion
ACRN Coding guidelines requires type conversion shall be explicity.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-07-11 09:16:09 +08:00
Li, Fei1
5d6c9c33ca hv: vlapic: clear up where needs atomic operation in vLAPIC
In almost case, vLAPIC will only be accessed by the related vCPU. There's no
synchronization issue in this case. However, other vCPUs could deliver interrupts
to the current vCPU, in this case, the IRR (for APICv base situation) or PIR
(for APICv advanced situation) and TMR for both cases could be accessed by more
than one vCPUS simultaneously. So operations on IRR or PIR should be atomical
and visible to other vCPUs immediately. In another case, vLAPIC could be accessed
by another vCPU when create vCPU or reset vCPU which could be supposed to be
consequently.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-07-11 09:15:47 +08:00
Mingqiang Chi
e4d1c321ad hv:fix "no prototype for non-static function"
change some APIs to static or include header file

Tracked-On: #861
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-07-09 10:36:03 +08:00
Shuo A Liu
4129b72b2e hv: remove unnecessary cancel_event_injection related stuff
cancel_event_injection is not need any more if we do 'scheudle' prior to
acrn_handle_pending_request. Commit "921288a6672: hv: fix interrupt
lost when do acrn_handle_pending_request twice" bring 'schedule'
forward, so remove cancel_event_injection related stuff.

Tracked-On: #3374
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-07-09 09:23:12 +08:00
Huihuang Shi
9a7043e83f HV: remove instr_emul.c dead code
ACRN Coding guidelines requires no dead code.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
2019-07-09 09:22:53 +08:00
Yonghua Huang
3164f3976a hv: Mitigation for CPU MDS vulnerabilities.
Microarchitectural Data Sampling (MDS) is a hardware vulnerability
 which allows unprivileged speculative access to data which is available
 in various CPU internal buffers.

 1. Mitigation on ACRN:
    1) Microcode update is required.
    2) Clear CPU internal buffers (store buffer, load buffer and
       load port) if current CPU is affected by MDS, when VM entry
       to avoid any information leakage to guest thru above buffers.
    3) Mitigation is not needed if ARCH_CAP_MDS_NO bit (bit5)
       is set in IA32_ARCH_CAPABILITIES MSR (10AH), in this case,
       current processor is no affected by MDS vulnerability, in other
       cases mitigation for MDS is required.

 2. Methods to clear CPU buffers (microcode update is required):
    1) L1D cache flush
    2) VERW instruction
    Either of above operations will trigger clearing all
    CPU internal buffers if this CPU is affected by MDS.
    Above mechnism is enumerated by:
    CPUID.(EAX=7H, ECX=0):EDX[MD_CLEAR=10].

 3. Mitigation details on ACRN:
    if (processor is affected by MDS)
	    if (processor is not affected by L1TF OR
		  L1D flush is not launched on VM Entry)
		    execute VERW instruction when VM entry.
	    endif
    endif

 4. Referrence:
    Deep Dive: Intel Analysis of Microarchitectural Data Sampling
    https://software.intel.com/security-software-guidance/insights/
    deep-dive-intel-analysis-microarchitectural-data-sampling

    Deep Dive: CPUID Enumeration and Architectural MSRs
    https://software.intel.com/security-software-guidance/insights/
    deep-dive-cpuid-enumeration-and-architectural-msrs

Tracked-On: #3317
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Jason CJ Chen <jason.cj.chen@intel.com>
2019-07-05 15:17:27 +08:00
Cai Yulong
127c98f5db hv: vioapic: fix interrupt lost and redundant interrupt
1. reset polarity of ptirq_remapping_info to zero.
   this help to set correct initial pin state, and fix the interrupt lost issue
   when assign a ptirq to uos.

2. since vioapic_generate_intr relys on rte, we should build rte before
   generating an interrput, this fix the redundant interrupt.

Tracked-On: #3362
Signed-off-by: Cai Yulong <yulongc@hwtc.com.cn>
2019-07-05 10:22:56 +08:00
Binbin Wu
f3ffce4be1 hv: vmexit: ecx should be checked instead of rcx when xsetbv
According to SDM, xsetbv writes the contents of registers EDX:EAX into the 64-bit
extended control register (XCR) specified in the ECX register. (On processors
that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.)

In current code, RCX is checked, should ingore the high-order 32bits.

Tracked-On: #3360
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
2019-07-05 09:48:53 +08:00
Li, Fei1
09a63560f4 hv: vm_manage: minor fix about triple_fault_shutdown_vm
The current implement will trigger shutdown vm request on the BSP VCPU on the VM,
not the VCPU will trap out because triple fault. However, if the BSP VCPU on the VM
is handling another IO emulation, it may overwrite the triple fault IO request on
the vhm_request_buffer in function acrn_insert_request. The atomic operation of
get_vhm_req_state can't guarantee the vhm_request_buffer will not access by another
IO request if it is not running on the corresponding VCPU. So it should trigger
triple fault shutdown VM IO request on the VCPU which trap out because of triple
fault exception.
Besides, rt_vm_pm1a_io_write will do the right thing which we shouldn't do it in
triple_fault_shutdown_vm.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-07-03 17:44:45 +08:00
Binbin Wu
4a22801dd1 hv: ept: mask EPT leaf entry bit 52 to bit 63 in gpa2hpa
According to SDM, bit N (physical address width) to bit 63 should be masked when calculate
host page frame number.
Currently, hypervisor doesn't set any of these bits, so gpa2hpa can work as expectd.
However, any of these bit set, gpa2hpa return wrong value.

Hypervisor never sets bit N to bit 51 (reserved bits), for simplicity, just mask bit 52 to bit 63.

Tracked-On: #3352
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-07-03 09:39:41 +08:00
Huihuang Shi
74b788988d HV:fix vcpu more than one return entry
ACRN coding guideline requires function shall have only one return entry.
Fix it.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-28 13:28:26 +08:00
Huihuang Shi
198e01716a HV:fix vcpu violations
vcpu is never scan because of scan tool will be crashed!
After modulization, the vcpu can be scaned by the scan tool.
Clean up the violations in vcpu.c.
Fix the violations:
    1.No brackets to then/else.
    2.Function return value not checked.
    3.Signed/unsigned coversion without cast.

V1->V2:
    change the type of "vcpu->arch.irq_window_enabled" to bool.
V2->V3:
    add "void *" prefix on the 1st parameter of memset.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-28 13:28:26 +08:00
Conghui Chen
4d88b2bb65 hv: bugfix for sbuf reset
The sbuf is allocated for each pcpu by hypercall from SOS. Before launch
Guest OS, the script will offline cpus, which will trigger vcpu reset and
then reset sbuf pointer. But sbuf only initiate once by SOS, so these
cpus for Guest OS has no sbuf to use. Thus, when run 'acrntrace' on SOS,
there is no trace data for Guest OS.
To fix the issue, only reset the sbuf for SOS.

Tracked-On: #3335
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
2019-06-27 15:40:19 +08:00
Li, Fei1
e793b5d091 hv: vlapic: remove ISR vector stack
The current implement will cache each ISR vector in ISR vector stack and do
ISR vector stack check when updating PPR. However, there is no need to do this
because:
1) We will not touch vlapic->isrvec_stk[0] except doing vlapic_reset:
  So we don't need to do vlapic->isrvec_stk[0] check.
2) We only deliver higher priority interrupt from IRR to ISR:
  So we don't need to check whether vlapic->isrvec_stk interrupts is always increasing.
3) There're only 15 different priority interrupt, It will not happened that more that
15 interrupts could been delivered to ISR:
  So we don't need to check whether vlapic->isrvec_stk_top will larger than
  ISRVEC_STK_SIZE which is 16.
This patch try to remove ISR vector stack and use isrv to cache the vector number for
the highest priority bit that is set in the ISR.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-06-27 15:27:37 +08:00
Huihuang Shi
3a61530d4e HV:fix simple violations
Fix the violations not touched the logical.
1.Function return value not checked.
2.Logical conjuctions need brackets.
3.No brackets to then/else.
4.Type conversion without cast.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
2019-06-25 20:09:21 +08:00
Mingqiang Chi
c1e23f1a4a hv:Fix MISRA-C violations for static inline
MISRA-C requires inline functions should be declared static,
these APIs are external interfaces,remove inline

Tracked-On: #861
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>

	modified:   arch/x86/guest/vcpu.c
2019-06-24 08:31:32 +08:00
Huihuang Shi
e3ee9cf20e HV: fix expression is not boolean
MISRA-C standard requires the type of result of expression in if/while pattern shall be boolean.

Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
2019-06-21 09:04:44 +08:00
Kaige Fu
8740232ad6 HV: Allow pause RTVM when its state is VM_CREATED
There are a lot of works to do between create_vm (HV will mark vm's state
as VM_CREATED at this stage) and vm_run (HV will mark vm's state as VM_STARTED),
like building mptable/acpi table, initializing mevent and vdevs. If there is
something goes wrong between create_vm and vm_run, the devicemodel will jumps
to the deinit process and will try to destroy the vm. For example, if the
vm_init_vdevs failed, the devicemodel will jumps to dev_fail and then destroy
the vm.

For normal vm in above situation, it is fine to destroy vm. And we can create and
start it next time. But for RTVM, we can't destroy the vm as the vm's state is
VM_CREATED. And we can only destroy vm when its state is VM_POWERING_OFF. So, the
vm will stay at VM_CREATED state and we will never have chance to destroy it.
Consequently, we can't create and start the vm next time.

This patch fixes it by allowing to pause and then destroy RTVM when its state is VM_CREATED.

Tracked-On: #3069
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
2019-06-20 22:24:38 +08:00
Yuan Liu
f8934df355 HV: implement wbinvd instruction emulation
wbinvd is used to write back all modified cache lines in the processor's
internal cache to main memory and invalidates(flushes) the internal caches.

Using clflushopt instructions to emulate wbinvd to flush each
guest vm memory, if CLFLUSHOPT is not supported, boot will fail.

Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-20 09:32:55 +08:00
Yuan Liu
ea699af861 HV: Add has_rt_vm API
The has_rt_vm walk through all VMs to check RT VM flag and if
there is no any RT VM, then return false otherwise return true.

Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
2019-06-20 09:32:55 +08:00
Yuan Liu
7018a13cb6 HV: Add ept_flush_leaf_page API
The ept_flush_leaf_page API is used to flush address space
from a ept page entry, user can use it to match walk_ept_mr to
flush VM address space.

Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-20 09:32:55 +08:00
Yuan Liu
f320130d58 HV: Add walk_ept_table and get_ept_entry APIs
The walk_ept_table API is used to walk through EPT table for getting
all of present pages, user can get each page entry and its size
from the walk_ept_table callback.

The get_ept_entry is used to getting EPT pointer of the vm, if current
context of mv is secure world, return secure world EPT pointer, otherwise
return normal world EPT pointer.

Signed-off-by: Jack Ren <jack.ren@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-20 09:32:55 +08:00
Li, Fei1
0e046c7a0a hv: vlapic: clear which access type we support for APIC-Access VM Exit
The current implement doesn't clear which access type we support for
APIC-Access VM Exit:
1) linear access for an instruction fetch
-- APIC-access page is mapped as UC which doesn't support fetch
2) linear access (read or write) during event delivery
-- Which is not happened in normal case except the guest went wrong, such as,
set the IDT table in APIC-access page. In this case, we don't need to support.
3) guest-physical access during event delivery;
   guest-physical access for an instruction fetch or during instruction execution
-- Do we plan to support enable APIC in real mode ? I don't think so.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-06-20 08:53:25 +08:00
Li, Fei1
9960ff98c5 hv: ept: unify EPT API name to verb-object style
Rename ept_mr_add to ept_add_mr
Rename ept_mr_modify to ept_modify_mr
Rename ept_mr_del to ept_del_mr

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-06-14 14:40:25 +08:00
Sainath Grandhi
7d44cd5c28 hv: Introduce check_vm_vlapic_state API
This patch introduces check_vm_vlapic_state API instead of is_lapic_pt_enabled
to check if all the vCPUs of a VM are using x2APIC mode and LAPIC
pass-through is enabled on all of them.

When the VM is in VM_VLAPIC_TRANSITION or VM_VLAPIC_DISABLED state,
following conditions apply.
1) For pass-thru MSI interrupts, interrupt source is not programmed.
2) For DM emulated device MSI interrupts, interrupt is not delivered.
3) For IPIs, it will work only if the sender and destination are both in x2APIC mode.

Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-14 13:55:26 +08:00
Sainath Grandhi
f3627d4839 hv: Add update_vm_vlapic_state API to sync the VM vLAPIC state
This patch introduces vLAPIC state for a VM. The VM vLAPIC state can
be one of the following
 * VM_VLAPIC_X2APIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this
	VM use x2APIC mode
 * VM_VLAPIC_XAPIC - All the vCPUs/vLAPICs (Except for those in Disabled mode) of this
	VM use xAPIC mode
 * VM_VLAPIC_DISABLED - All the vCPUs/vLAPICs of this VM are in Disabled mode
 * VM_VLAPIC_TRANSITION - Some of the vCPUs/vLAPICs of this VM (Except for those in Disabled mode)
	are in xAPIC and the others in x2APIC

Upon a vCPU updating the IA32_APIC_BASE MSR to switch LAPIC mode, this
API is called to sync the vLAPIC state of the VM. Upon VM creation and reset,
vLAPIC state is set to VM_VLAPIC_XAPIC, as ACRN starts the vCPUs vLAPIC in
XAPIC mode.

Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-14 13:55:26 +08:00
Sainath Grandhi
a3fdc7a496 hv: Add is_xapic_enabled API to check vLAPIC moe
is_xapic_enabled API returns true if vLAPIC is in xAPIC mode. In
all other cases, it returns false.

Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-14 13:55:26 +08:00
Sainath Grandhi
7cb71a317e hv: Make is_x2apic_enabled API visible across source code
Remove static and inline attributes to the API is_x2apic_enabled
and declare a prototype in vlapic.h. Also fix the check performed on guest
APICBASE_MSR value to query vLAPIC mode.

Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-14 13:55:26 +08:00
Sainath Grandhi
1026f1754c hv: Shuffle logic in vlapic_set_apicbase API implementation
This patch changes the code in vlapic_set_apicbase for the following reasons
1) Better readability as it first checks if the new value programmed into
MSR is any different from the existing value cached in guest structures
2) Check if both bits 11:10 are set before enabling x2APIC mode for guest.
Current code does not check if Bit 11 is set.
3) Add TODO in the comments, to detail about the current gaps in
IA32_APIC_BASE MSR emulation.

Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-14 13:55:26 +08:00
Arindam Roy
2321fcdf78 HV:Modularize vpic code to remove usage of acrn_vm
V1:Initial Patch
Modularize vpic. The current patch reduces the usage
of acrn_vm inside the vpic.c file.
Due to the global natire of register_pio_handler, where
acrn_vm is being passed, some usage remains.
These needs to be a separate "interface" file.
That will come in smaller newer patch provided
this patch is accepted.

V2:
Incorporated comments from Jason.

V3:
Fixed some MISRA-C Violations.

Tracked-On: #1842
Signed-off-by: Arindam Roy <arindam.roy@intel.com>
Reviewed-by: Xu, Anthony <anthony.xu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-06-13 09:54:52 +08:00
Yan, Like
32239cf55f hv: reduce cyclomatic complexity of create_vm()
This commit extract pm io handler registration code to register_pm_io_handler()
to reduce the cyclomatic complexity of create_vm() in order to be complied with
MISRA-C rules.

Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
2019-06-12 14:29:50 +08:00
Conghui Chen
ac6c5dce81 HV: Clean vpic and vioapic logic when lapic is pt
When the lapic is passthru, vpic and vioapic cannot be used anymore. In
current code, user can still inject vpic interrupt to Guest OS, this is
not allowed.
This patch remove the vpic and vioapic initiate functions during
creating VM with lapic passthru. But the APIs in vpic and vioapic are
called in many places, for these APIs, follow the below principles:

1. For the APIs which will access uninitiated variables, and may case
hypervisor hang, add @pre to make sure user should call them after vpic or
vioapic is initiated.

2. For the APIs which only return some static value, do noting with them.

3. For the APIs which user will called to inject interrupt, such as
vioapic_set_irqline_lock or vpic_set_irqline, add condition in these
APIs to make sure it only inject interrupt when vpic or vioapic is
initiated. This change is to make sure the vuart or hypercall need not
to care whether lapic is passthru or the vpic and vioapic is initiated
or not.

Tracked-On: #3227
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
2019-06-12 14:29:50 +08:00
Victor Sun
f83ddd393f HV: introduce relative vm id for hcall api
On SDC scenario, SOS VM id is fixed to 0 so some hypercalls from guest
are using hardcoded "0" to represent SOS VM, this would bring issues
for HYBRID scenario which SOS VM id is non-zero.

Now introducing a new VM id concept for DM/VHM hypercall APIs, that
return a relative VM id which is from SOS view when create VM for post-
launched VMs. DM/VHM could always treat their own vm id is "0". When they
make hypercalls, hypervisor will convert the VM id to the absolute id
when dispatch the hypercalls.

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2019-06-12 11:00:40 +08:00
Victor Sun
3d3de6bd38 HV: specify dispatch hypercall for sos or trusty
Changes:

- In current design, the hypercall is only allowed calling from SOS or
trusty VM, so separate the trusty hypercalls from dispatch_hypercall().
The vm parameter which referenced by hcall_xxx() should be SOS VM;

- do not inject #UD for hypercalls from non-SOS, just return -ENODEV;

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
2019-06-12 11:00:40 +08:00
Yin Fengwei
6b7233446f xsave: inject GP when guest tries to write 1 to XCR0 reserved bit
According to SDM vol1 13.3:
Write 1 to reserved bit of XCR0 will trigger GP.

This patch make ACRN behavior align with SDM definition.

Tracked-On: #3239
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-12 08:28:53 +08:00
Tianhua Sun
8dd471b37d hv: fix possible null pointer dereference
This patch fix potential null pointer dereference

1, will access null pointer if 'context' is null.
2, if entry already been added to the VM when add
   intx entry for this vm, but parameter virt_pin
   is not equal to entry->virt_sid.intx_id.pin. So
   will saves this entry address to
   vpin_to_pt_entry[entry->virt_sid.intx_id.pin] and
   vpin_to_pt_entry[virt_pin]. In this case, this entry
   will be freed twice.

Tracked-On: #3217
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-11 16:03:04 +08:00
Victor Sun
04d82e5c0f HV: return virtual lapic id in vcpuid 0b leaf
Currently vlapic id of SOS VM is virtualized, it is indexed by vcpuid in
physical APIC id sequence, but CPUID 0BH leaf still report physical
APIC ID. In SDC/INDUSTRY scenario they are identical mapping so no issue
occured. In hybrid mode this would be a problem because vAPIC ID might
be different with pAPIC ID. We need to make the APIC ID which returned from
CPUID consistent with the one returned from LAPIC register.

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-06 15:22:10 +08:00
Victor Sun
50e09c41b4 HV: remove cpu_num from vm configurations
The vcpu num could be calculated based on pcpu_bitmap when prepare_vcpu()
is done, so remove this redundant configuration item;

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-06-06 15:22:10 +08:00
Victor Sun
f4e976ab38 HV: return -1 with invalid vcpuid in pt icr access
vm_apicid2vcpu_id() might return invalid vcpu id, when this happens
we should return -1 in vlapic_x2apic_pt_icr_access();

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-06-06 15:22:10 +08:00
Victor Sun
ae7dcf443d HV: fix wrong log when vlapic process init sipi
The print message of source and target vcpu id is incorrect, fix it.

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-06-06 15:22:10 +08:00
Victor Sun
ea7ca8595c HV: use tag to specify multiboot module
Previously multiboot mods[0] is designed for kernel module for all
pre-launched VMs including SOS VM, and mods[0].mm_string is used
to store kernel cmdline. This design could not satisfy the requirement
of hybrid mode scenarios that each VM might use their own kernel image
also ramdisk image. To resolve this problem, we will use a tag in
mods mm_string field to specify the module type. If the tag could
be matched with os_config of VM configurations, the corresponding
module would be loaded;

Tracked-On: #3214

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-06-06 09:40:52 +08:00
Conghui Chen
376fcddff8 HV: vuart: add vuart_deinit during vm shutdown
Add vuart_deinit to vm shutdown so that the vuart resource can be
reset, and when the Guest VM restart, it could have right state.

Tracked-On: #2987
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-06-03 09:15:16 +08:00
Li, Fei1
c5d4365770 hv: vmcs: don't trap when setting reserved bit in cr0/cr4
According to Chap 23.8 RESTRICTIONS ON VMX OPERATION, Vol 3, SDM:
"Any attempt to set one of these bits to an unsupported value while in VMX
operation (including VMX root operation) using any of the CLTS, LMSW, or
MOV CR instructions causes a general-protection exception."
So we don't need to trap them out then inject the GP in hypervisor.

Tracked-On: #2561
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-30 11:33:01 +08:00
Li, Fei1
f2c53a9891 hv: vmcs: trap CR4.SMAP/SMEP/PKE setting
FuSa requires setting CR4.SMAP/SMEP/PKE will invalidate the TLB. However,
setting CR4.SMAP will invalidate the TLB on native while not in non-root mode.
To make sure this, we will trap CR4.SMAP/SMEP/PKE setting to invalidate the TLB
in root mode.

Tracked-On: #2561
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-30 11:33:01 +08:00
Sainath Grandhi
a7389686a7 hv: Precondition checks for vcpu_from_vid for lapic passthrough ICR access
Since the vapic_id is from VM, need to check for pre-condition before passing
vcpu_id to vcpu_from_vid. This is in the path of LAPIC passthrough ICR
access

Tracked-On: #3170
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-30 10:10:21 +08:00
Binbin Wu
7a915dc397 hv: vmsr: present sgx related msr to guest
Present SGX related MSRs to guest if SGX is supported.

- MSR_IA32_SGXLEPUBKEYHASH0 ~ MSR_IA32_SGXLEPUBKEYHASH3:
  SGX Launch Control is not supported, so these MSRs are read only.
- MSR_IA32_SGX_SVN_STATUS:
  read only
- MSR_IA32_FEATURE_CONTROL:
  If SGX is support in VM, opt-in SGX in this MSR.
- MSR_SGXOWNEREPOCH0 ~ MSR_SGXOWNEREPOCH1:
  The two MSRs' scope is package level, not allow guest to change them.
  Still leave them in unsupported_msrs array.

Tracked-On: #3179
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-29 11:24:13 +08:00
Binbin Wu
1724996bc5 hv: vcpuid: present sgx capabilities to guest
If sgx is supported in guest, present SGX capabilities to guest.
There will be only one EPC section presented to guest, even if EPC
memory for a guest is from muiltiple physcial EPC sections.

Tracked-On: #3179
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-29 11:24:13 +08:00
Binbin Wu
65d437283e hv: vm: build ept for sgx epc reource
Build EPT entries for SGX EPC resource for VMs.
- SOS: EPC resrouce will be removed from EPT of SOS, don't support SGX virtualization for SOS.
- Non-SOS: build ept mapping for EPC resource for guest.
  Guest base address and size is specified in vm configuration.

Tracked-On: #3179
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-29 11:24:13 +08:00
Yin Fengwei
c5cfd7c200 vm state: reset vm state to VM_CREATED when reset_vm is called
When we call reset_vm() function to reset vm, the vm state
should be reset to VM_CREATED as well.

Tracked-On: #3182
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-29 10:43:03 +08:00
Binbin Wu
7e80a6eee3 hv: vm: enable iommu after vpci init
In current code, vpci do the pci enumartion and add pci devices to the
context table of iommu.
Need to enable iommu DMA address translation later than vpci init.
Otherwise, in UEFI platform, there will be a shot time that address translation
is enabled, but the context table is not setup.
For the devices active in UEFI environment will have problem on address translation.

Tracked-On: #3160
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-24 11:46:44 +08:00
Zide Chen
bfc08c2812 hv: move msr_bitmap from acrn_vm to acrn_vcpu_arch
At the time the guest is switching to X2APIC mode, different VCPUs in the
same VM could expect the setting of the per VM msr_bitmap differently,
which could cause problems.

Considering different approaches to address this issue:
1) only guest BSP can update msr_bitmap when switching to X2APIC.
2) carefully re-write the update_msr_bitmap_x2apic_xxx() functions to
   make sure any bit in the bitmap won't be toggled by the VCPUs.
3) make msr_bitmap as per VCPU.

We chose option 3) because it's simple and clean, though it takes more
memory than other options.

BTW, need to remove const modifier from update_msr_bitmap_x2apic_xxx()
functions to get it compiled.

Tracked-On: #3166
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-24 11:37:13 +08:00
Victor Sun
1fe5711170 HV: validate pstate by checking px ctl range
The previous function just check the pstate target value in PERF_CTL msr
by indexing px data control value which comes from ACPI table, this would
bring a bug in the case that guest is running intel_pstate_driver:
the turbo pstate target value from intel_pstate driver is in a range
instead of fixed value in ACPI _PSS table, thus the turbo px request would
be rejected. This patch fixed this issue.

Tracked-On: #3158

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-05-24 11:36:54 +08:00
Mingqiang Chi
04ccaacb9c hv:not allow SOS to access prelaunched vm memory
filter out prelaunched vm memory from e820 table
and unmap prelaunched vm memory from ept table
before boot service OS

Tracked-On: #3148
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-05-23 19:01:48 +08:00
Zide Chen
5a23f7b664 hv: initial host reset implementation
- add the GUEST_FLAG_HIGHEST_SEVERITY flag to indicate that the guest has
  privilege to reboot the host system.

- this flag is statically assigned to guest(s) in vm_configurations.c in
  different scenarios.

- implement reset_host() function to reset the host. First try the ACPI
  reset register if available, then try the 0xcf9 PIO.

Tracked-On: #3145
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-23 18:24:17 +08:00
Sainath Grandhi
536c69b9ff hv: distinguish between LAPIC_PASSTHROUGH configured vs enabled
ACRN supports LAPIC emulation for guests using x86 APICv. When guest OS/BIOS
switches from xAPIC to x2APIC mode of operation, ACRN also supports switching
froom LAPIC emulation to LAPIC passthrough to guest. User/developer needs to
configure GUEST_FLAG_LAPIC_PASSTHROUGH for guest_flags in the corresponding
VM's config for ACRN to enable LAPIC passthrough.

This patch does the following

1)Fixes a bug in the abovementioned feature. For a guest that is
configured with GUEST_FLAG_LAPIC_PASSTHROUGH, during the time period guest is
using xAPIC mode of LAPIC,  virtual interrupts are not delivered. This can be
manifested as guest hang when it does not receive virtual timer interrupts.

2)ACRN exposes physical topology via CPUID leaf 0xb to LAPIC PT VMs. This patch
removes that condition and exposes virtual topology via CPUID leaf 0xb.

Tracked-On: #3136
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-05-23 11:15:31 +08:00
yliu79
c5391d2592 HV: remove unused function vcpu_inject_ac
Change-Id: I4b139e78c372d0941923fc8260db7a3578a894f2
Tracked-On: #3123
Signed-off-by: yliu79 <ying2.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-22 16:36:03 +08:00
yliu79
26de86d761 HV: remove unused function copy_to_gva
Change-Id: I18a6c860ba4bcec4e5915fa6a2a18ed1ecb20fff
Tracked-On: #3123
Signed-off-by: yliu79 <ying2.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-22 16:36:03 +08:00
yliu79
163c63d21f HV: remove unused function resume_vm
Change-Id: Ia6b6617e55044b5555a2d80c26a4ef7d7e56b7fa
Tracked-On: #3123
Signed-off-by: yliu79 <ying2.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-22 16:36:03 +08:00
yliu79
c68c6e4af2 HV: remove unused function shutdown_vcpu
Change-Id: Ia6f9aa4d2d603d23bc0cb9c3b12032d1a96504db
Tracked-On: #3123
Signed-off-by: yliu79 <ying2.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-22 16:36:03 +08:00
Minggui Cao
fc1cbebe31 HV: remove vcpu arch lock, not needed.
the pcpu just write its own vmcs, not need spinlock.
and the arch.lock not used other places, remove it too.

Tracked-On: #3130

Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-22 16:35:26 +08:00
Jason Chen CJ
238d8bbaa2 reshuffle init_vm_boot_info
now only SOS need decide boot with de-privilege or direct boot mode, while
for other pre-launched VMs, they should use direct boot mode.

this patch merge boot/guest/direct_boot_info.c &
boot/guest/deprivilege_boot_info.c into boot/guest/vboot_info.c,
and change init_direct_vboot_info() function name to init_general_vm_boot_info().

in init_vm_boot_info(), depend on get_sos_boot_mode(), SOS may choose to init
vm boot info by setting the vm_sw_loader to deprivilege specific one; for SOS
using DIRECT_BOOT_MODE and all other VMS, they will use general_sw_loader as
vm_sw_loader and go through init_general_vm_boot_info() for virtual boot vm
info filling.

this patch also move spurious handler initilization for de-privilege mode from
boot/guest/deprivilege_boot.c to boot/guest/vboot_info.c, and just set it in
deprivilege sw_loader before irq enabling.

Changes to be committed:
	modified:   Makefile
	modified:   arch/x86/guest/vm.c
	modified:   boot/guest/deprivilege_boot.c
	deleted:    boot/guest/deprivilege_boot_info.c
	modified:   boot/guest/direct_boot.c
	renamed:    boot/guest/direct_boot_info.c -> boot/guest/vboot_info.c
	modified:   boot/guest/vboot_wrapper.c
	modified:   boot/include/guest/deprivilege_boot.h
	modified:   boot/include/guest/direct_boot.h
	modified:   boot/include/guest/vboot.h
	new file:   boot/include/guest/vboot_info.h
	modified:   common/vm_load.c
	modified:   include/arch/x86/guest/vm.h

Tracked-On: #1842
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
2019-05-20 18:49:59 +08:00
Zide Chen
865ee2956e hv: emulate ACPI reset register for Service OS guest
Handle the PIO reset register that is defined in host ACPI:

Parse host FADT table to get the host reset register info, and emulate
it for Service OS:

- return all '1' for guest reads because the read behavior is not defined
  in ACPI.
- ignore guest writes with the reset value to stop it from resetting host;
  if guest writes other values, passthru it to hardware in case the reset
  register supports other functionalities.

Tracked-On: #2700
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-15 11:20:12 +08:00
Zide Chen
26f08680eb hv: shutdown guest VM upon triple fault exceptions
This patch implements triple fault vmexit handler and base on VM types:

- post-launched VMs: shutdown_target_vm() injects S5 PIO write to request
  DM to shut down the target VM.
- pre-launched VMs: shut down the guest.
- SOS: similarly, but shut down all the non real-time post-launched VMs that
  depend to SOS before shutting down SOS.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-15 11:20:12 +08:00
Zide Chen
9aa3fe646b hv: emulate reset register 0xcf9 and 0x64
- post-launched RTVM: intercept both PIO ports so that hypervisor has a
  chance to set VM_POWERING_OFF flag.
- all other type of VMs: deny these 2 ports from guest access so that
  guests are not able to reset host.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-15 11:20:12 +08:00
Zide Chen
8ad0fd98a3 hv: implement NEED_SHUTDOWN_VM request to idle thread
For pre-launched VMs and SOS, VM shutdown should not be executed in the
current VM context.

- implement NEED_SHUTDOWN_VM request so that the BSP of the target VM can shut
  down the guest in idle thread.
- implement shutdown_vm_from_idle() to shut down target VM.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-15 11:20:12 +08:00
Li, Fei1
86f5993bc9 hv: vlapic: fix tpr virtualization when VID is not enabled.
1) According SDM Vol 3, Chap 29.1.2, Any VM exit caused by TPR virtualization
is trap-like: the instruction causing TPR virtualization completes before the VM
exit occurs (for example, the value of CS:RIP saved in the guest-state area of
the VMCS references the next instruction). So we need to retain the RIP.
2) The previous implement only consides the situation the guest will reduce the
TPR. However, the guest will increase it. So we need to update the PPR before we
find a deliverable interrupt. For examples: a) if the guest increase the TPR before
a irq windows vmexit, we need to update the PPR when check whether there has a pending
delivery interrupt or not; b) if the guest increase the TPR, then an external irq raised,
we need to update the PPR when check whether there has a deliverable interrupt to inject
to guest.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-05-15 08:59:39 +08:00
Li, Fei1
a68dadb74a hv: vm: minor fix about vRTC
For prelaunched VM, Service OS and postlaunched RT VM, we only need the vRTC
provides backed-up date, so we could use the simple vRTC which implemented
in hypervisor. For postlaunched VM (which is not a RT VM), we needs the device
module to emulate the vRTC for it.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
2019-05-15 08:59:39 +08:00
Li, Fei1
bdae8efb7f hv: instr_emul: fix movzx return memory opsize wrong
There're some instructions which not support bit 0(w bit) flag but which
memory opcode size is fixed and the memory opcode size is not equal to the
register opcode size. In our code, there is movzx (which opcode is 0F B7)
which memory opcode size is fixed to 16 bits. So add a flag VIE_OP_F_WORD_OP
to indicate a instruction which memory opcode size is fixed to 16 bits.

Tracked-On: #1337
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-05-13 11:53:51 +08:00
Binbin Wu
a581f50600 hv: vmsr: enable msr ia32_misc_enable emulation
Add MSR_IA32_MISC_ENABLE to emulated_guest_msrs to enable the emulation.
Init MSR_IA32_MISC_ENABLE for guest.

Tracked-On: #2834
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-09 16:35:15 +08:00
Binbin Wu
8e310e6ea1 hv: vcpuid: modify vcpuid according to msr ia32_misc_enable
According to SDM Vol4 2.1, modify vcpuid according to msr ia32_misc_enable:
- Clear CPUID.01H: ECX[3] if guest disabled monitor/mwait.
- Clear CPUID.80000001H: EDX[20] if guest set XD Bit Disable.
- Limit the CPUID leave maximum value to 2 if guest set Limit CPUID MAXVal.

Tracked-On: #2834
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-09 16:35:15 +08:00
Binbin Wu
ef19ed8961 hv: vcpuid: reduce the cyclomatic complexity of function guest_cpuid
This patch reduces the cyclomatic complexity of the function guest_cpuid.

Tracked-On: #2834
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
2019-05-09 16:35:15 +08:00
Binbin Wu
f0d06165d3 hv: vmsr: handle guest msr ia32_misc_enable read/write
Guest MSR_IA32_MISC_ENABLE read simply returns the value set by guest.
Guest MSR_IA32_MISC_ENABLE write:
- Clear EFER.NXE if MSR_IA32_MISC_ENABLE_XD_DISABLE set.
- MSR_IA32_MISC_ENABLE_MONITOR_ENA:
  Allow guest to control this feature when HV doesn't use this feature and hw has no bug.

vcpuid update according to the change of the msr will be covered in following patch.

Tracked-On: #2834
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-09 16:35:15 +08:00
Jason Chen CJ
41ac9e5d10 rename function & definition from firmware to guest boot
The interface struct & API changes like below:
  struct uefi_context->struct depri_boot_context
  init_firmware_operations()->init_vboot_operations()
  init_firmware()->init_vboot()
  firmware_init_irq()->init_vboot_irq()
  firmware_get_rsdp()->get_rsdp_ptr()
  firmware_get_ap_trampoline()->get_ap_trampoline_buf()
  firmware_init_vm_boot_info()->init_vm_boot_info()

Tracked-On: #1842
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-09 16:33:44 +08:00
Jason Chen CJ
20f97f7559 restruct boot and bsp dir for firmware stuff
currently, ACRN hypervisor can either boot from sbl/abl or uefi, that's
why we have different firmware method under bsp & boot dirs.
but the fact is that we actually have two different operations based on
different guest boot mode:
1. de-privilege-boot: ACRN hypervisor will boot VM0 in the same context as
native(before entering hypervisor) - it means hypervisor will co-work with
ACRN UEFI bootloader, restore the context env and de-privilege this env
to VM0 guest.
2. direct-boot: ACRN hypervisor will directly boot different pre-launched
VM(including SOS), it will setup guest env by pre-defined configuration,
and prepare guest kernel image, ramdisk which fetch from multiboot modules.

this patch is trying to:
- rename files related with firmware, change them to guest vboot related
- restruct all guest boot stuff in boot & bsp dirs into a new boot/guest dir
- use de-privilege & direct boot to distinguish two different boot operations

this patch is pure file movement, the rename of functions based on old assumption will
be in the following patch.

Changes to be committed:
	modified:   ../efi-stub/Makefile
	modified:   ../efi-stub/boot.c
	modified:   Makefile
	modified:   arch/x86/cpu.c
	modified:   arch/x86/guest/vm.c
	modified:   arch/x86/init.c
	modified:   arch/x86/irq.c
	modified:   arch/x86/trampoline.c
	modified:   boot/acpi.c
	renamed:    bsp/cmdline.c -> boot/cmdline.c
	renamed:    bsp/firmware_uefi.c -> boot/guest/deprivilege_boot.c
	renamed:    boot/uefi/uefi_boot.c -> boot/guest/deprivilege_boot_info.c
	renamed:    bsp/firmware_sbl.c -> boot/guest/direct_boot.c
	renamed:    boot/sbl/multiboot.c -> boot/guest/direct_boot_info.c
	renamed:    bsp/firmware_wrapper.c -> boot/guest/vboot_wrapper.c
	modified:   boot/include/acpi.h
	renamed:    bsp/include/firmware_uefi.h -> boot/include/guest/deprivilege_boot.h
	renamed:    bsp/include/firmware_sbl.h -> boot/include/guest/direct_boot.h
	renamed:    bsp/include/firmware.h -> boot/include/guest/vboot.h
	modified:   include/arch/x86/multiboot.h

Tracked-On: #1842
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-09 16:33:44 +08:00
Yin Fengwei
8626e5aa3a vm_state: Update vm state VM_STATE_INVALID to VM_POWERED_OFF
Replace the vm state VM_STATE_INVALID to VM_POWERED_OFF.
Also replace is_valid_vm() with is_poweroff_vm().
Add API is_created_vm() to identify VM created state.

Tracked-On: #3082
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
2019-05-08 16:58:41 +08:00
Victor Sun
e2d723d4fa HV: enable acpi pm1a register info fixup
Previously ACPI PM1A register info was hardcoded to 0 in HV for generic boards,
but SOS still can know the real PM1A info so the system would hang if user
trigger S3 in SOS. Enabling PM1A register info fixup will let HV be able to
intercept the operation on PM1A and then make basic function of S3 work for
all boards;

Tracked-On: #2291

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-07 11:39:51 +08:00
Mingqiang Chi
da9ed0eda9 hv:remove some unnecessary includes
remove some unnecessary includes

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2019-05-07 09:10:13 +08:00
Zide Chen
5f87e716b8 hv: release IOMMU irte when releasing ptirq remapping entries
IRTE is freed if ptirq entry is released from remove_msix_remapping() or
remove_intx_remapping(). But if it's called from ptdev_release_all_entries(),
e.g. SOS shutdown/reboot, IRTE is not freed.

This patch adds a release_cb() callback function to do any architectural
specific cleanup. In x86, it's used to release IRTE.

On VM shutdown, vpci_cleanup() needs to remove MSI/MSI-X remapping on
ptirq entries, thus it should be called before ptdev_release_all_entries().

Tracked-On: #2700
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-06 18:25:37 +08:00
Conghui Chen
dd86a78e75 HV: rename 'type' in struct io_request
Rename 'type' in struct io_request to 'io_type' to be more readable.

Tracked-On: #3061
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-06 18:25:20 +08:00
Conghui Chen
e6670b32f4 HV: rename structure acrn_vm_type
Rename structure acrn_vm_type to acrn_vm_load_order as it is used to
indicate the load order instead of the VM type.

Tracked-On: #2291
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-05 11:50:36 +08:00
Yonghua Huang
be2fe2a44d hv:remove accessing shared log buffer cases between stac/clac
Shared buffer is allocated by VM and is protected by SMAP.
Accessing to shared buffer between stac/clac pair will invalidate
SMAP protection.This patch is to remove these cases.
Fix minor stac/clac mis-usage,and add comments as stac/clac usage BKM

Tracked-On: #2526
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-05-05 10:00:32 +08:00
Sainath Grandhi
9214c84600 hv: Rename NORMAL_VM to POST_LAUNCHED_VM
The name NORMAL_VM does not clearly reflect the attribute that these VMs
are launched "later". POST_LAUNCHED_VM is closer to the fact
that these VMs are launched "later" by one of the VMs launched by ACRN.

Tracked-On: #3034
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-29 09:19:51 +08:00
Li, Fei1
536bc5bd12 hv: instr_emul: fix operand size decode
1) For some instructions (like movsx, movzx which we support), there're two operands
and the source operand size is not equal to the dest operand size. In this case,
if we update the memory operand size according to the bit 0(w bit) of opcode,
we will lost the register operand size. This patch tries to fix this by calculating
memory operand size when we want to use it.
2) Calculate memory operand size form operand size and the bit 0(w bit) of opcode
when we want to operate on memory operand.

Tracked-On: #1337
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-28 11:53:24 +08:00
Victor Sun
3bb4308361 HV: check vm id param when dispatching hypercall
If the vmcall param passed from guest is representing a vmid, we should
make sure it is a valid one because it is a pre-condition of following
get_vm_from_vmid(). And then we don't need to do NULL VM pointer check
in is_valid_vm() because get_vm_from_vmid() would never return NULL.

Tracked-On: #2978

Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-26 16:21:14 +08:00
Zide Chen
1b7d33a426 hv: separate host e820 and SOS ve820
There are 2 reasons why SOS ve820 has to be separated from host e820:
- in hybrid mode, SOS may not own all the host memory.
- when SOS is being re-launched, it needs an untainted version of host
  e820 to create SOS ve820.

This patch creates sos_e820 table for SOS and keeps host e820 intact
during SOS creation.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-26 16:20:50 +08:00
Conghui Chen
235d886103 HV: vuart: enable vuart console for VM
In previous code, only for pre-launched VM, hypervisor would create
vuart console for each VM. But for post-launched VM, no vuart is
created.
In this patch, create vuart according to configuration in structure
acrn_vm_config. As the new configuration is set for pre-launched VM and
post-launched VM, and the vuart initialize process is common for each
VM, so, remove CONFIG_PARTITION_MODE from vuart related code.

Tracked-On: #2987
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-25 11:21:54 +08:00
Victor Sun
2362e58509 HV: correct usage of GUEST_FLAG_IO_COMPLETION_POLLING
The guest flags of GUEST_FLAG_IO_COMPLETION_POLLING work for NORMAL_VM only;

Tracked-On: #2291

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-25 11:07:21 +08:00
Yin Fengwei
578592b566 vlapic: refine IPI broadcast to support x2APIC mode
According to SDM 10.6.1, if dest fields is 0xffU for register
ICR operation, that means the IPI is for broadcast.

According to SDM 10.12.9, 0xffffffffU of dest fields for x2APIC
means IPI is for broadcast.

We add new parameter to vlapic_calc_dest() to show whether the
dest is for broadcast. For IPI, we will set it according to
dest fields. For ioapic and MSI, we hardcode it to false because
no broadcast for ioapic and MSI.

Tracked-On: #3003
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-25 09:12:16 +08:00
Kaige Fu
a85d11ca7a HV: Add prefix 'p' before 'cpu' to physical cpu related functions
This patch adds prefix 'p' before 'cpu' to physical cpu related functions.
And there is no code logic change.

Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-24 10:50:28 +08:00
li bing
25741b62db HV: fix the issue of ACRN_REQUEST_EXCP flag is not cleared.
the problem is : System will crash when run crashme.
The root cause of this problem is that when the ACRN_REQUEST_EXCP flag is set by calling
the vcpu_make_request function, the flag is not cleared.
Add the following statement to the vcpu_inject_exception function to fix the problem:
bitmap_test_and_clear_lock(ACRN_REQUEST_EXCP, &vcpu->arch.pending_req);
Tested that one night, there was no crash.

Tracked-On: #2527
Signed-off-by: bing.li<bingx.li@intel.com>
Acked-by:      Eddie Dong<eddie.dong@intel.com>
2019-04-23 15:17:13 +08:00
Li, Fei1
28d50f1b96 hv: vlapic: add apic register offset check API
Add apic rgister offset check before do vlapic read/write.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
70dd254456 hv: vmsr: refine x2apic MSR bitmap setting
In theory, we should trap out all the x2apic MSR access if APICv is not enabled.
When "Use TPR shadow" and "Virtualize x2APIC mode" are enabled, we could disable
TPR interception; when APICv is fully enabled, besides TPR, we could disable all
MSR read, EOI and self-IPI interception; when we pass through lapic to guest, we
could disable all the MSR access interception except XAPICID/LDR read and ICR write.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
0c347e607a hv: vlapic: wrap APICv check pending delivery interrupt
When in fully APICv mode, we enable VID. All pending delivery interrupts
will inject to VM before VM entry. So there is no pending delivery interrupt.
However, if VID is not enabled, we can only inject pending delivery interrupt
one by one. So we always need to do this check.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
037fffc203 hv: vlapic: wrap APICv inject interrupt API
apicv_advanced_inject_intr is used if APICv fully features are supported,
it uses PIR to inject interrupt. otherwise, apicv_basic_inject_intr is used.
it will use VMCS INTR INFO field to inject irq.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
1db8123c2d hv: virq: refine pending event inject coding style
In order to combine vlapic_apicv_inject_pir and vcpu_inject_vlapic_int.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
fde2c07c0a hv: vlapic: minor fix about APICv inject interrupt
When VID is enabeld, we should always inejct the pending interrupts when vm enter.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Li, Fei1
846b5cf6b7 hv: vlapic: wrap APICv accept interrupt API
The APICv ops is decided once the APICv feature on the physical platform is detected.
We will use apicv_advanced_ops if the physical platform support fully APICv feature;
otherwise, we will use apicv_basic_ops.
This patch only wrap the accept interrupt API for them.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 15:16:56 +08:00
Yonghua Huang
f991d179b0 hv: fix possible buffer overflow in vlapic.c
Possible buffer overflow will happen in vlapic_set_tmr()
  and vlapic_update_ppr(),this path is to fix them.

Tracked-On: #1252
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-23 13:43:25 +08:00
Li, Fei1
2c13ac7400 hv: vmcs: minor fix about APICv feature setting
1) Shouldn't try to set APIC-register virtualization if the physical doesn't
support APICV advanced mode.
2) Remove all APICv features VMCS setting when LAPIC is passed through to guest.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-23 09:47:36 +08:00
Li, Fei1
4fc2009770 hv: instr_emul: check the bit 0(w bit) of opcode when necessary
Not every instruction supports the operand-size bit (w). This patch try to correct
what done in commit-id 9df8790 by setting a flag VIE_OP_F_BYTE_OP to indicate which
instruction supports the operand-size bit (w).

This bug is found by removing VMX_PROCBASED_CTLS2_VAPIC_REGS VMCS setting when the
physical doesn't support this APICv feature. However, if emulated this in MRB board,
the android can't boot because when switch to trusty world, it will check
"Delivery Status" in ICR first. It turns out that this Bit Test instruction is not
emulated correctly.

Tracked-On: #1337
Signed-off-by: Qi Yadong <yadong.qi@intel.com>
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-04-23 09:47:36 +08:00
Kaige Fu
91c1408197 HV: Reset physical core of lapic_pt vm when shutdown
The physical core of lapic_pt vm should be reset for security and
correctness when shutdown the vm.

Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-22 19:58:20 +08:00
Kaige Fu
565f3c723a HV: Clear DM set guest_flags when shutdown vm
Currently, the previous configurations about guest_flags set by DM will
not be cleared when shutdown the vm. Then it might bring issue for the
next dm-launched vm.

For example, if we create one vm with LAPIC_PASSTHROUGH flag and shutdown it.
Then the next dm-launched vm will has the LAPIC_PASSTHROUGH flag set no matter
whether we set it in DM.

This patch clears all the DM set flags when shtudown vm.

Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-22 19:58:20 +08:00
Zide Chen
a3207b2bc2 hv: allocate vpid based on vm_id and vcpu_id mapping
Currently vpid is not released in reset_vcpu() hence the vpid resource
could be exhausted easily if guests are re-launched.

This patch assigns vpid according to the fixed mapping of runtime vm_id
and vcpu_id to guarantee the uniqueness of vpid.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-04-22 19:57:28 +08:00
Victor Sun
9673f3dad4 HV: validate target vm in hypercall
- The target vm in most of hypercalls should be a NORMAL_VM, in some
exceptions it might be a SOS_VM, we should validate them.

- Please be aware that some hypercall might have limitation on specific
target vm like RT or SAFETY VMs, this leaves "TODO" in future;

- Unify the coding style:

	int32_t hcall_foo(vm, target_vm_id, ...)
	{
		int32_t ret = -1;
		...

		if ((is_valid_vm(target_vm) && is_normal_vm(target_vm)) {
			ret = ....
		}

		return ret;
	}

Tracked-On: #2978

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-22 19:57:03 +08:00
Cai Yulong
887d41683b hv: check vm state before creating a VM
If launch two UOS with same UUID by acrn-dm, current code path will
return same VM instance to the acrn-dm, this will crash the two UOS.

Check VM state and make sure it's in VM_STATE_INVALID state before
creating a VM.

Tracked-On: #2984
Signed-off-by: Cai Yulong <yulongc@hwtc.com.cn>
2019-04-22 15:18:03 +08:00
Zide Chen
aee9f3c666 hv: reset per cpu sbuf pointers during vcpu reset
When shutting down SOS VM, the shared sbuf is released from guest OS, but
the per cpu sbuf pointers in hypervisor keep inact. This creates a problem
that after SOS is re-launched, hypervisor could write to the shared
buffer that no longer exists.

This patch implements sbuf_reset() and call it from reset_vcpu() to
reset sbuf pointers.

Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-04-19 16:20:34 +08:00
Li, Fei1
56acaacc29 hv: vlapic: add TPR below threshold implement
Add TPR below threshold implement for "Virtual-interrupt delivery" not support.
Windows will use it to delay interrupt handle.

Complete all the interrupts in IRR as long as they are higher priority than
current TPR. Once current IRR priority is less than current TPR enable TPR
threshold to IRR, so that if guest reduces the TPR threshold, it would be good
to take below TPR threshold exit and let interrupts to go thru.

Tracked-On: #1842
Signed-off-by: Zheng, Gen <gen.zheng@intel.com>
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
2019-04-19 11:11:54 +08:00
Yuan Liu
b1e68453bd hv: enable vMCE from guest CPUID
Enable vMCE feature to boot windows guest.

vMCE is set in EDX from Microsoft TLFS spec, to support windows guest
vMCA and vMCE should be supported by guest CPUID.

Support MSR_IA32_MCG_CAP and MSR_IA32_MCG_STATUS reading when vMCE is enabled,
but they are not emulated yet, so return 0 directly.

Tracked-On: #1867
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-18 09:00:42 +08:00
Sainath Grandhi
824caf8ce0 hv: Remove need for init_fallback_iommu_domain and fallback_iommu_domain
In the presence of SOS, ACRN uses fallback_iommu_domain which is the same
used by SOS, to assign domain to devices during ACRN init. Also it uses
fallback_iommu_domain when DM requests ACRN to remove device from UOS domain.
This patch changes the design of assign/remove_iommu_device to avoid the
concept of fallback_iommu_domain and its setup. This way ACRN can commonly
treat pre-launched VMs bringup w.r.t. IOMMU domain creation.

Tracked-On: #2965
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
2019-04-17 11:42:36 +08:00
Yonghua Huang
46480f6e23 hv: add new hypercall to fetch platform configurations
add new hypercall get platform information,
 such as physical CPU number.

Tracked-On: #2538
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-15 22:14:13 +08:00
Sainath Grandhi
16a2af5715 hv: Build mptable for guest if VM type is Pre-Launched
ACRN builds mptable for pre-launched VMs. It uses CONFIG_PARTITION_MODE
to compile mptable source code and related support. This patch removes
the macro and checks if the type of VM is pre-launched to build mptable.

Tracked-On: #2941
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-15 15:51:02 +08:00
Victor Sun
445999af5d HV: make vm id statically by uuid
Currently VM id of NORMAL_VM is allocated dymatically, we need to make
VM id statically for FuSa compliance.

This patch will pre-configure UUID for all VMs, then NORMAL_VM could
get its VM id/configuration from vm_configs array by indexing the UUID.

If UUID collisions is found in vm configs array, HV will refuse to
load the VM;

Tracked-On: #2291

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-12 13:45:32 +08:00
Victor Sun
6071234337 HV: use term of UUID
The code mixed the usage on term of UUID and GUID, now use UUID to make
code more consistent, also will use lowercase (i.e. uuid) in variable name
definition.

Tracked-On: #2291

Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-12 13:45:32 +08:00
Li, Fei1
4557033a3a hv: vlapic: minor fix about vlapic write
1) In x2apic mode, when read ICR, we want to read a 64-bits value.
2) In x2apic mode, write self-IPI will trap out through MSR write when VID isn't enabled.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-04-12 10:11:10 +08:00
Li, Fei1
fa8fa37cdf hv: vlapic: remove vlapic_rdmsr/wrmsr
We could call vlapic API directly, remove vlapic_rdmsr/wrmsr to make things easier.

Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
2019-04-12 10:11:10 +08:00
Mingqiang Chi
69627ad7b6 hv: rename io_emul.c to vmx_io.c
renamed:  arch/x86/guest/io_emul.c -> arch/x86/guest/vmx_io.c
renamed:  include/arch/x86/guest/io_emul.h
	   -> include/arch/x86/guest/vmx_io.h

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-12 10:09:26 +08:00
Mingqiang Chi
2b79c6df6a hv:move some common APIs to io_req.c
Now the io_emul.c is relates with arch,io_req.c is common,
move some APIs from io_emul.c to io_req.c as common like these APIs:
register_pio/mmio_emulation_handler
dm_emulate_pio/mmio_complete
pio_default_read/write
mmio_default_access_handler
hv_emulate_pio/mmio etc

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-12 10:09:26 +08:00
Mingqiang Chi
0a1c016dbb hv: move 'emul_pio[]' from strcut vm_arch to acrn_vm
Move ‘emul_pio[]/default_io_read/default_io_write’
from struct vm_arch to struct acrn_vm

Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
2019-04-12 10:09:26 +08:00