HLT emulation is important for maximizing CPU resource utilization. A vCPU
executing HLT is idle and can give up the pCPU proactively. Thus, we
pause the vCPU thread during HLT emulation and resume it when an event arrives.
When a vCPU enters HLT, its vCPU thread sleeps, but the vCPU state
remains 'Running':
VM ID   PCPU ID   VCPU ID   VCPU ROLE   VCPU STATE
=====   =======   =======   =========   ==========
    0         0         0   PRIMARY     Running
    0         1         1   SECONDARY   Running
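Below is a minimal, illustrative sketch of this pause/resume pattern using
POSIX primitives; ACRN has its own per-vCPU event mechanism, and the names
vcpu_halt_wait/vcpu_halt_wake are assumptions, not ACRN APIs.

  #include <pthread.h>
  #include <stdbool.h>

  struct vcpu_event {
      pthread_mutex_t lock;
      pthread_cond_t  cond;
      bool            set;
  };

  /* HLT emulation path: put the vCPU thread to sleep until an event arrives. */
  static void vcpu_halt_wait(struct vcpu_event *ev)
  {
      pthread_mutex_lock(&ev->lock);
      while (!ev->set) {
          /* the thread sleeps here; the vCPU state stays 'Running' */
          pthread_cond_wait(&ev->cond, &ev->lock);
      }
      ev->set = false;
      pthread_mutex_unlock(&ev->lock);
  }

  /* Wake path: an interrupt (or other event) targets the halted vCPU. */
  static void vcpu_halt_wake(struct vcpu_event *ev)
  {
      pthread_mutex_lock(&ev->lock);
      ev->set = true;
      pthread_cond_signal(&ev->cond);   /* resume the paused vCPU thread */
      pthread_mutex_unlock(&ev->lock);
  }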
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Sometimes the HV wants to know whether a vCPU has pending interrupts.
Add a .has_pending_intr interface to acrn_apicv_ops and return the
pending-interrupt status by checking the IRRs of the virtual APIC.
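A minimal sketch of what such an IRR-scanning callback could look like; the
layout (eight 32-bit IRR chunks) follows the LAPIC register model, but the
type and function names are assumptions, not ACRN's exact definitions.

  #include <stdbool.h>
  #include <stdint.h>

  struct lapic_regs_sketch {
      uint32_t irr[8];    /* 256 interrupt-request bits, 32 per chunk */
  };

  struct acrn_apicv_ops_sketch {
      bool (*has_pending_intr)(const struct lapic_regs_sketch *regs);
  };

  static bool apicv_basic_has_pending_intr(const struct lapic_regs_sketch *regs)
  {
      for (int i = 0; i < 8; i++) {
          if (regs->irr[i] != 0U) {
              return true;    /* at least one vector is pending in the IRR */
          }
      }
      return false;
  }

  static const struct acrn_apicv_ops_sketch apicv_basic_ops_sketch = {
      .has_pending_intr = apicv_basic_has_pending_intr,
  };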
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In the current architecture, the maximum number of vCPUs per VM cannot
exceed the number of pCPUs. Given that the MAX_PCPU_NUM macro is already
provided by the board configurations, remove MAX_VCPUS_PER_VM from Kconfig
and add a MAX_VCPUS_PER_VM macro that references MAX_PCPU_NUM directly.
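Conceptually the change is the one-line alias below; the exact header it
lands in is not shown here.

  /* Sketch: the per-VM vCPU limit now follows the board's pCPU count. */
  #define MAX_VCPUS_PER_VM    MAX_PCPU_NUM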
Tracked-On: #4230
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In APICv advanced mode, a targeted vCPU running in non-root mode may see an outdated
TMR and EOI-exit bitmap when another vCPU sends it an interrupt whose trigger mode
has changed.
This patch kicks the vCPU out of non-root mode so that it picks up the latest TMR and
EOI-exit bitmap when it enters non-root mode again, and then fills the interrupt into
the PIR.
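A rough sketch of that ordering; the TMR/PIR layout follows the LAPIC and
posted-interrupt register model, but the function and field names are
illustrative, not the literal ACRN code.

  #include <stdbool.h>
  #include <stdint.h>

  struct vlapic_sketch {
      uint32_t tmr[8];   /* trigger-mode register shadow, 256 bits */
      uint64_t pir[4];   /* posted-interrupt requests, 256 bits */
  };

  static void accept_intr_sketch(struct vlapic_sketch *vlapic, uint32_t vector, bool level)
  {
      uint32_t idx = vector >> 5U;
      uint32_t bit = 1U << (vector & 0x1fU);
      bool cached_level = (vlapic->tmr[idx] & bit) != 0U;

      if (cached_level != level) {
          /* Trigger mode changed: update the TMR shadow and kick the vCPU so
           * it reloads the TMR / EOI-exit bitmap on its next VM entry. */
          if (level) {
              vlapic->tmr[idx] |= bit;
          } else {
              vlapic->tmr[idx] &= ~bit;
          }
          /* request an ACRN_REQUEST_EVENT-style kick of the target vCPU here
           * (assumed primitive, omitted in this sketch) */
      }

      /* Finally post the vector into the PIR. */
      vlapic->pir[vector >> 6U] |= (1UL << (vector & 0x3fU));
  }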
Tracked-On: #4200
Signed-off-by: Li Fei1 <fei1.li@intel.com>
After changing init_vmcs to the SMP-call approach and doing it before
launch_vcpu, it works with the noop scheduler. On a real sharing
scheduler, it has a problem:
pcpu0                   pcpu1                   pcpu1
vmBvcpu0                vmAvcpu1                vmBvcpu1
                        vmentry
init_vmcs(vmBvcpu1)     vmexit->do_init_vmcs
                        corrupt current vmcs
                        vmentry fail
launch_vcpu(vmBvcpu1)
This patch marks an event flag when a VMCS init is requested for a specific
vCPU. When that vCPU is running and checks its pending events, it performs
init_vmcs first.
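A sketch of the request-flag pattern described above; the bit and field
names are assumptions for illustration, and GCC __atomic builtins stand in
for the hypervisor's own atomic helpers.

  #include <stdint.h>

  #define REQUEST_INIT_VMCS_BIT   0U

  struct vcpu_sketch {
      uint64_t pending_req;   /* bitmap of pending per-vCPU requests */
  };

  /* Requester side: mark the event instead of touching the VMCS remotely. */
  static void request_init_vmcs(struct vcpu_sketch *vcpu)
  {
      __atomic_fetch_or(&vcpu->pending_req,
                        1UL << REQUEST_INIT_VMCS_BIT, __ATOMIC_SEQ_CST);
  }

  /* Target vCPU side: handle the request on its own pCPU before other events. */
  static void handle_pending_requests(struct vcpu_sketch *vcpu)
  {
      uint64_t mask = 1UL << REQUEST_INIT_VMCS_BIT;

      if ((__atomic_fetch_and(&vcpu->pending_req, ~mask, __ATOMIC_SEQ_CST) & mask) != 0UL) {
          /* init_vmcs(vcpu); -- now runs on the owning pCPU, so another
           * vCPU's current VMCS can no longer be corrupted. */
      }
  }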
Tracked-On: #4178
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
When writing to the virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero:
- when the guest intends to disarm the tsc_deadline timer, do not arm the timer falsely;
- when the guest intends to arm the tsc_deadline timer, do not disarm the timer falsely.
When reading from the virtual TSC_DEADLINE, if virtual TSC_ADJUST is not zero
(see the sketch after this list):
- if the physical TSC_DEADLINE is not zero, return the virtual TSC_DEADLINE value;
- if the physical TSC_DEADLINE is zero, which means the timer is not armed (it is automatically
  disarmed after the timer fires), return 0 and reset the virtual TSC_DEADLINE.
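A self-contained sketch of the read-side rule, modeling the two MSRs as plain
fields; the names are illustrative, not ACRN's.

  #include <stdint.h>

  struct tsc_deadline_state {
      uint64_t phys_tsc_deadline;  /* 0 means the physical timer is disarmed */
      uint64_t virt_tsc_deadline;  /* value the guest last programmed */
  };

  /* Guest read of the virtual TSC_DEADLINE when virtual TSC_ADJUST != 0. */
  static uint64_t vtsc_deadline_read(struct tsc_deadline_state *s)
  {
      if (s->phys_tsc_deadline != 0UL) {
          return s->virt_tsc_deadline;   /* timer is still armed */
      }
      /* Timer fired and auto-disarmed: report 0 and drop the stale value. */
      s->virt_tsc_deadline = 0UL;
      return 0UL;
  }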
Tracked-On: #4162
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch decouples some scheduling logic and abstracts it into a scheduler.
We then have a scheduler and a schedule framework. From a modularization
perspective, the schedule framework provides APIs for other layers to
use and interacts with the scheduler through the scheduler interfaces.
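A sketch of the kind of hook table such a split implies; the exact hook set
and names in ACRN may differ.

  #include <stdint.h>

  struct thread_object_sketch;   /* schedulable entity, owned by the framework */

  /* The framework calls into whichever scheduler is registered via these hooks. */
  struct scheduler_ops_sketch {
      void (*init)(uint16_t pcpu_id);
      void (*insert)(struct thread_object_sketch *obj);
      void (*remove)(struct thread_object_sketch *obj);
      struct thread_object_sketch *(*pick_next)(uint16_t pcpu_id);
      void (*sleep)(struct thread_object_sketch *obj);
      void (*wake)(struct thread_object_sketch *obj);
  };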
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
With CPU sharing enabled, we map acrn_vcpu to a thread_object for
scheduling. From a modularization perspective, it is better to hide
pcpu_id from acrn_vcpu and move it into thread_object.
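A sketch of the ownership change, with illustrative field and helper names:
the pCPU id lives in the schedulable thread object, and callers reach it
through the vCPU's embedded thread object.

  #include <stdint.h>

  struct thread_object_sketch {
      uint16_t pcpu_id;   /* which pCPU this thread runs on */
      /* scheduler bookkeeping (run-queue links, status, ...) */
  };

  struct acrn_vcpu_sketch {
      struct thread_object_sketch thread_obj;  /* scheduling entity of this vCPU */
      /* architectural vCPU state ... */
  };

  static inline uint16_t pcpuid_from_vcpu(const struct acrn_vcpu_sketch *vcpu)
  {
      /* callers no longer touch a vcpu->pcpu_id field directly */
      return vcpu->thread_obj.pcpu_id;
  }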
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The vCPU thread's stack should not be reset as part of reset_vcpu.
There is also a bug here:
once vCPU B's thread sets vcpu->running to false, vCPU A's thread
treats vCPU B as paused even though it has not been switched out
completely; reset_vcpu then resets vCPU B's thread stack and
corrupts its running context.
This patch removes the vCPU thread's stack reset from reset_vcpu.
With this change, init_vmcs must be done after the vCPU startup address is
settled and before the vCPU is scheduled in. switch_to_idle() is no longer
needed because the S3 thread's stack will not be reset.
Tracked-On: #3813
Signed-off-by: Fengwei Yin <fengwei.yin@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Consider the following case when TPR shadow is used with vlapic
basic mode:
1) 2 interrupts are pending in vlapic. INTa's priority > TPR and
INTb's priority <= TPR.
2) TPR threshold is set to zero and INTa is injected to guest.
3) Guest sets TPR to the priority of INTa.
4) EOI of INTa. PPR is updated to TPR, which equals INTa's priority.
   INTb cannot be injected because its priority <= PPR.
5) Guest sets TPR to zero. Because the TPR threshold is still zero, there is
   no TPR-threshold vmexit. But since both TPR and ISRV are zero at
   this time, the PPR is zero as well, so INTb has become deliverable; yet
   without a VM exit it still cannot be injected.
This is a bug.
By adding vcpu_make_request(vlapic->vcpu, ACRN_REQUEST_EVENT) in EOI,
TPR threshold will be updated before vm_resume.
Tracked-On: #3795
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Fix a TSC_DEADLINE issue by trapping TSC_DEADLINE MSR writes if VMX_TSC_OFFSET is not 0,
because the ACRN vART design assumes that pTSC_Adjust and vTSC_Adjust are
both 0.
We can leave the TSC_DEADLINE write pass-through without a correctness issue because there is
no offset between the pTSC and vTSC, and no write to vTSC or vTSC_Adjust has been observed
in the RTOS so far.
This commit fixes the potential correctness issue, but RT performance will be badly affected
if vTSC or vTSC_Adjust is not zero, which we will address if such a case arises.
Tracked-On: #3636
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Fix the violations listed below:
1. Cast operation on a constant value.
2. Signed/unsigned implicit conversion.
3. Return value unused.
V1->V2:
1. The bitmap API returns a boolean type, so the "!= 0" check is not needed; deleted it.
2. The behaviors of ~(uint32_t)X and (uint32_t)~X are not defined in the ACRN hypervisor
   Coding Guidelines, so that change was removed.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Reviewed-by: Junjie Mao <junjie.mao@intel.com>
After "commit f0e1c5e init vcpu host stack when reset vcpu", SOS resume form S3
wants to schedule to vcpu_thread not the point where SOS enter S3. So we should
schedule to idel first then reschedule to execute vcpu_thread.
Tracked-On: #3387
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Because we depend on the guest OS switching to x2APIC mode to enable LAPIC pass-through, the vLAPIC is active during the early stage of booting, e.g. in the virtual boot loader.
After LAPIC pass-through is enabled, no interrupt should be injected via the vLAPIC any more.
This commit resets the vLAPIC to clear any pending status and adds ptapic_ops to enforce that no more interrupts are accepted or injected via the vLAPIC.
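A sketch of the enforcement idea: after pass-through is enabled, the vLAPIC's
ops are swapped for a set whose accept/inject hooks refuse to queue or deliver
anything. All names here are assumptions.

  #include <stdbool.h>
  #include <stdint.h>

  struct vlapic_ops_sketch {
      void (*accept_intr)(void *vlapic, uint32_t vector, bool level);
      bool (*inject_intr)(void *vlapic);
  };

  static void ptapic_accept_intr(void *vlapic, uint32_t vector, bool level)
  {
      (void)vlapic; (void)vector; (void)level;
      /* Pass-through mode: the guest owns the physical LAPIC, so the vLAPIC
       * silently drops the request instead of queuing it. */
  }

  static bool ptapic_inject_intr(void *vlapic)
  {
      (void)vlapic;
      return false;   /* never inject through the vLAPIC after pass-through */
  }

  static const struct vlapic_ops_sketch ptapic_ops_sketch = {
      .accept_intr = ptapic_accept_intr,
      .inject_intr = ptapic_inject_intr,
  };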
Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
This commit adds an ops pointer to the vlapic structure and adds an *ops parameter to vlapic_reset().
At vLAPIC reset, ops is set to the global apicv_ops, and may be reassigned
to other ops later.
Tracked-On: #3227
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In most cases, a vLAPIC is only accessed by its own vCPU, so there is no
synchronization issue. However, other vCPUs can deliver interrupts
to the current vCPU; in that case the IRR (for the APICv basic situation) or PIR
(for the APICv advanced situation), and the TMR in both cases, can be accessed by more
than one vCPU simultaneously. So operations on the IRR or PIR must be atomic
and immediately visible to other vCPUs. In another case, the vLAPIC can be accessed
by another vCPU during vCPU creation or reset, which can be assumed to happen
sequentially.
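A small sketch of the atomicity requirement on IRR/PIR updates, using a GCC
atomic builtin in place of the hypervisor's own bitmap helpers; the names are
illustrative.

  #include <stdint.h>

  /* Atomically set one bit of a 64-bit chunk so concurrent senders on other
   * pCPUs cannot lose each other's updates. */
  static void bitmap_set_atomic(uint64_t *chunk, uint32_t bit)
  {
      __atomic_fetch_or(chunk, 1UL << bit, __ATOMIC_SEQ_CST);
  }

  /* Post a vector into a 256-bit PIR/IRR array split into 64-bit chunks. */
  static void post_vector(uint64_t pir[4], uint32_t vector)
  {
      bitmap_set_atomic(&pir[vector >> 6U], vector & 0x3fU);
  }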
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
The current implementation caches each ISR vector in an ISR vector stack and
checks that stack when updating the PPR. However, there is no need to do this
because:
1) We never touch vlapic->isrvec_stk[0] except in vlapic_reset:
   so we don't need the vlapic->isrvec_stk[0] check.
2) We only deliver a higher-priority interrupt from the IRR to the ISR:
   so we don't need to check that the vectors in vlapic->isrvec_stk are always increasing.
3) There are only 15 different interrupt priority classes, so it can never happen that more
   than 15 interrupts are delivered to the ISR:
   so we don't need to check whether vlapic->isrvec_stk_top exceeds
   ISRVEC_STK_SIZE, which is 16.
This patch removes the ISR vector stack and uses isrv to cache the vector number of
the highest-priority bit that is set in the ISR (see the sketch below).
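With isrv caching the highest in-service vector, the PPR update reduces to a
comparison of priority classes, roughly as sketched here per the SDM's PPR
definition; the helper name is illustrative.

  #include <stdint.h>

  #define PRIO(x)   ((x) & 0xf0U)   /* priority class = upper 4 bits of a vector */

  /* PPR = TPR if TPR's class >= the class of the highest in-service vector
   * (isrv); otherwise PPR = class of isrv with the low 4 bits cleared. */
  static uint32_t compute_ppr(uint32_t tpr, uint32_t isrv)
  {
      return (PRIO(tpr) >= PRIO(isrv)) ? tpr : PRIO(isrv);
  }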
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
The current implementation doesn't make clear which access types we support for
APIC-access VM exits:
1) linear access for an instruction fetch
-- the APIC-access page is mapped as UC, which doesn't support instruction fetch
2) linear access (read or write) during event delivery
-- this does not happen in the normal case unless the guest has gone wrong, such as
   placing the IDT in the APIC-access page; in that case, we don't need to support it
3) guest-physical access during event delivery;
   guest-physical access for an instruction fetch or during instruction execution
-- do we plan to support enabling the APIC in real mode? I don't think so.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
This patch introduces the check_vm_vlapic_state API instead of is_lapic_pt_enabled
to check whether all the vCPUs of a VM are using x2APIC mode and LAPIC
pass-through is enabled on all of them.
When the VM is in the VM_VLAPIC_TRANSITION or VM_VLAPIC_DISABLED state, the
following conditions apply:
1) For pass-through MSI interrupts, the interrupt source is not programmed.
2) For DM-emulated device MSI interrupts, the interrupt is not delivered.
3) IPIs will work only if the sender and destination are both in x2APIC mode.
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch introduces a vLAPIC state for a VM. The VM vLAPIC state can
be one of the following:
* VM_VLAPIC_X2APIC - All the vCPUs/vLAPICs (except for those in disabled mode) of this
  VM use x2APIC mode
* VM_VLAPIC_XAPIC - All the vCPUs/vLAPICs (except for those in disabled mode) of this
  VM use xAPIC mode
* VM_VLAPIC_DISABLED - All the vCPUs/vLAPICs of this VM are in disabled mode
* VM_VLAPIC_TRANSITION - Some of the vCPUs/vLAPICs of this VM (except for those in disabled mode)
  are in xAPIC mode and the others in x2APIC mode
Upon a vCPU updating the IA32_APIC_BASE MSR to switch its LAPIC mode, this
API is called to sync the vLAPIC state of the VM. Upon VM creation and reset,
the vLAPIC state is set to VM_VLAPIC_XAPIC, as ACRN starts the vCPUs' vLAPICs in
xAPIC mode.
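A minimal sketch of that state as an enumeration; the enumerator names come
from the description above, and the comments paraphrase it.

  enum vm_vlapic_state {
      VM_VLAPIC_X2APIC,      /* all enabled vLAPICs are in x2APIC mode */
      VM_VLAPIC_XAPIC,       /* all enabled vLAPICs are in xAPIC mode (default at creation/reset) */
      VM_VLAPIC_DISABLED,    /* all vLAPICs of the VM are disabled */
      VM_VLAPIC_TRANSITION,  /* mix of xAPIC and x2APIC among the enabled vLAPICs */
  };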
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
is_xapic_enabled API returns true if vLAPIC is in xAPIC mode. In
all other cases, it returns false.
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Remove the static and inline attributes from the is_x2apic_enabled API
and declare a prototype in vlapic.h. Also fix the check performed on the guest
APICBASE_MSR value to query the vLAPIC mode.
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch changes the code in vlapic_set_apicbase for the following reasons:
1) Better readability: it first checks whether the new value programmed into the
   MSR differs from the existing value cached in the guest structures.
2) Check that both bits 11:10 are set before enabling x2APIC mode for the guest;
   the current code does not check whether bit 11 is set (see the sketch after this list).
3) Add a TODO in the comments detailing the current gaps in
   IA32_APIC_BASE MSR emulation.
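A sketch of the bits 11:10 check from item 2; the bit positions (EN = bit 11,
EXTD = bit 10) are the architectural IA32_APIC_BASE bits, while the helper
itself is illustrative.

  #include <stdbool.h>
  #include <stdint.h>

  #define APICBASE_EN     (1UL << 11U)   /* xAPIC global enable */
  #define APICBASE_EXTD   (1UL << 10U)   /* x2APIC mode enable */

  static bool wants_x2apic_mode(uint64_t new_apicbase)
  {
      /* Both EN (bit 11) and EXTD (bit 10) must be set to enter x2APIC mode. */
      return (new_apicbase & (APICBASE_EN | APICBASE_EXTD)) ==
             (APICBASE_EN | APICBASE_EXTD);
  }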
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vm_apicid2vcpu_id() might return an invalid vCPU id; when this happens
we should return -1 in vlapic_x2apic_pt_icr_access().
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
The printed source and target vCPU ids are incorrect; fix them.
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Since the vapic_id comes from the VM, we need to check the pre-condition before
passing vcpu_id to vcpu_from_vid. This is in the path of LAPIC pass-through ICR
access.
Tracked-On: #3170
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
ACRN supports LAPIC emulation for guests using x86 APICv. When the guest OS/BIOS
switches from xAPIC to x2APIC mode of operation, ACRN also supports switching
from LAPIC emulation to LAPIC passthrough to the guest. The user/developer needs to
configure GUEST_FLAG_LAPIC_PASSTHROUGH in guest_flags in the corresponding
VM's config for ACRN to enable LAPIC passthrough.
This patch does the following:
1) Fixes a bug in the above-mentioned feature. For a guest that is
configured with GUEST_FLAG_LAPIC_PASSTHROUGH, during the time period the guest is
using xAPIC mode of the LAPIC, virtual interrupts are not delivered. This can
manifest as a guest hang when it does not receive virtual timer interrupts.
2) ACRN exposes the physical topology via CPUID leaf 0xb to LAPIC PT VMs. This patch
removes that condition and exposes the virtual topology via CPUID leaf 0xb.
Tracked-On: #3136
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
1) According to SDM Vol. 3, Chapter 29.1.2, any VM exit caused by TPR virtualization
is trap-like: the instruction causing TPR virtualization completes before the VM
exit occurs (for example, the value of CS:RIP saved in the guest-state area of
the VMCS references the next instruction). So we need to retain the RIP.
2) The previous implementation only considers the situation where the guest lowers the
TPR. However, the guest may also raise it. So we need to update the PPR before we
look for a deliverable interrupt. For example: a) if the guest raises the TPR before
an interrupt-window VM exit, we need to update the PPR when checking whether there is
a pending delivery interrupt; b) if the guest raises the TPR and then an external
interrupt is raised, we need to update the PPR when checking whether there is a
deliverable interrupt to inject into the guest.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yu Wang <yu1.wang@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
According to SDM 10.6.1, a destination field of 0xffU in an xAPIC ICR
operation means the IPI is a broadcast.
According to SDM 10.12.9, a destination field of 0xffffffffU in x2APIC mode
means the IPI is a broadcast.
We add a new parameter to vlapic_calc_dest() to indicate whether the
destination is a broadcast. For IPIs, we set it according to the
destination field. For IOAPIC and MSI, we hardcode it to false because
there is no broadcast for IOAPIC and MSI.
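A minimal sketch of the broadcast test the new parameter is derived from; the
helper name is an assumption.

  #include <stdbool.h>
  #include <stdint.h>

  static bool is_broadcast_dest(uint32_t dest, bool is_x2apic_mode)
  {
      /* xAPIC: 8-bit destination of 0xff; x2APIC: 32-bit destination of 0xffffffff. */
      return is_x2apic_mode ? (dest == 0xffffffffU) : (dest == 0xffU);
  }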
Tracked-On: #3003
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch adds the prefix 'p' before 'cpu' in physical-CPU-related function names.
There is no code logic change.
Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In theory, we should trap all x2APIC MSR accesses if APICv is not enabled.
When "Use TPR shadow" and "Virtualize x2APIC mode" are enabled, we can disable
TPR interception; when APICv is fully enabled, besides TPR, we can disable all
MSR-read, EOI and self-IPI interception; when we pass through the LAPIC to the guest, we
can disable all MSR access interception except XAPICID/LDR reads and ICR writes.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
When in full APICv mode, we enable VID, and all pending delivery interrupts
are injected into the VM before VM entry, so no pending delivery interrupt is left.
However, if VID is not enabled, we can only inject pending delivery interrupts
one by one, so we always need to do this check.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
apicv_advanced_inject_intr is used if the full APICv feature set is supported;
it uses the PIR to inject interrupts. Otherwise, apicv_basic_inject_intr is used;
it injects interrupts via the VMCS interruption-information field.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
The APICv ops are decided once the APICv features of the physical platform are detected.
We use apicv_advanced_ops if the physical platform supports the full APICv feature set;
otherwise, we use apicv_basic_ops.
This patch only wraps the accept-interrupt API for them.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
A possible buffer overflow can happen in vlapic_set_tmr()
and vlapic_update_ppr(); this patch fixes them.
Tracked-On: #1252
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add a TPR-below-threshold implementation for when "Virtual-interrupt delivery" is
not supported. Windows uses it to delay interrupt handling.
Deliver all the interrupts in the IRR as long as they are of higher priority than the
current TPR. Once the current IRR priority is no higher than the current TPR, program
the TPR threshold from the IRR, so that if the guest lowers the TPR we take a
TPR-below-threshold exit and let the interrupts go through.
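A rough sketch of the threshold choice described above; the helper and its
placement are assumptions, and only the policy (0 while something is
deliverable, otherwise the class of the highest pending IRR vector) comes from
the description.

  #include <stdint.h>

  #define PRIO(x)  ((x) & 0xf0U)   /* priority class = upper 4 bits of a vector */

  /* Decide the TPR-threshold class to program: keep it at 0 while the highest
   * pending IRR vector is still deliverable; once it is blocked by the TPR,
   * raise the threshold to its class so that a later TPR drop by the guest
   * causes a TPR-below-threshold VM exit. */
  static uint32_t tpr_threshold_for(uint32_t highest_irr_vector, uint32_t tpr)
  {
      if (PRIO(highest_irr_vector) > PRIO(tpr)) {
          return 0U;
      }
      return PRIO(highest_irr_vector) >> 4U;
  }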
Tracked-On: #1842
Signed-off-by: Zheng, Gen <gen.zheng@intel.com>
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
1) In x2APIC mode, when reading the ICR, we want to read a 64-bit value.
2) In x2APIC mode, a self-IPI write traps out through an MSR write when VID isn't enabled.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
We can call the vlapic APIs directly; remove vlapic_rdmsr/wrmsr to make things easier.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
For pre-launched VMs, ACRN uses the mptable to report APIC IDs to the guest OS.
In the current code, ACRN uses physical LAPIC IDs for vLAPIC IDs.
This patch makes ACRN use the vCPU id for vLAPIC IDs and report the same
when building the mptable. ACRN should still use physical LAPIC IDs for the SOS
because the host ACPI tables are passed through to the SOS.
Tracked-On: #2934
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
All 'if ... else if' constructs shall be
terminated with an else statement.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Since we always enable "Use TPR shadow", operations on the TPR will not
trigger a VM exit. So remove these APIs.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Add two functions to combine the constraints for APICv:
is_apicv_basic_feature_supported: checks whether the physical platform supports
"Use TPR shadow", "Virtualize APIC accesses" and "Virtualize x2APIC mode".
is_apicv_advanced_feature_supported: checks whether the physical platform supports
"APIC-register virtualization", "Virtual-interrupt delivery" and
"Process posted interrupts".
If the physical platform supports only the APICv basic features, enable "Use TPR shadow"
and "Virtualize APIC accesses" for xAPIC mode, and enable "Use TPR shadow" and
"Virtualize x2APIC mode" for x2APIC mode. Otherwise, if the physical platform supports
the APICv advanced features, enable the full APICv features for both xAPIC and x2APIC mode.
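A sketch of how the two checks combine capability bits; the two function names
are from this description, while the flag encoding and helper are illustrative
placeholders for the real VMX capability queries.

  #include <stdbool.h>
  #include <stdint.h>

  #define FEAT_TPR_SHADOW        (1U << 0U)
  #define FEAT_VAPIC_ACCESSES    (1U << 1U)
  #define FEAT_VX2APIC_MODE      (1U << 2U)
  #define FEAT_APIC_REG_VIRT     (1U << 3U)
  #define FEAT_VIRT_INTR_DELIV   (1U << 4U)
  #define FEAT_POSTED_INTR       (1U << 5U)

  static bool has_all(uint32_t caps, uint32_t required)
  {
      return (caps & required) == required;
  }

  static bool is_apicv_basic_feature_supported(uint32_t caps)
  {
      return has_all(caps, FEAT_TPR_SHADOW | FEAT_VAPIC_ACCESSES | FEAT_VX2APIC_MODE);
  }

  static bool is_apicv_advanced_feature_supported(uint32_t caps)
  {
      return has_all(caps, FEAT_APIC_REG_VIRT | FEAT_VIRT_INTR_DELIV | FEAT_POSTED_INTR);
  }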
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
This commit extracts the common logic of vlapic_calc_dest() and vlapic_calc_dest_lapic_pt()
to static inline functions, in order to make vlapic_calc_dest() clean and easy to read.
Tracked-On: #1842
Signed-off-by: Yan, Like <like.yan@intel.com>
We can simplify the vector check for LVT interrupts by moving this check into
vlapic_fire_lvt for the fixed-mode case.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Set when the local APIC detects an illegal vector (one in the range 0 to 15)
in the message that it is sending. This occurs as the result of a write to the
ICR (in both xAPIC and x2APIC modes) or to the SELF IPI register (x2APIC mode only)
with an illegal vector.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
LVT ERROR is an edge-triggered, fixed-mode interrupt. We can call vlapic_accept_intr
to fire it directly. Otherwise, if the LVT ERROR vector is invalid, an invalid
interrupt would be accepted into the IRR.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>