In theory, guest could re-program PCI BAR address to any address. However, ACRN
hypervisor only support [0, top_address_space) EPT memory mapping. So we need to
check whether the PCI BAR re-program address is within this scope.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
kick means to notify one thread_object. If the target thread object is
running, send a IPI to notify it; if the target thread object is
runnable, make reschedule on it.
Also add kick_vcpu API in vcpu layer to notify vcpu.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
After moving softirq to following interrupt path, softirq handler might
break in the schedule spinlock context and try to grab the lock again,
then deadlock.
Disable interrupt with schedule spinlock context.
For the IRQ disable/restore operations:
CPU_INT_ALL_DISABLE(&rflag)
CPU_INT_ALL_RESTORE(rflag)
each takes 50~60 cycles.
renaming: get_schedule_lock -> obtain_schedule_lock
Tracked-On: #3813
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Clean up do_swtich and do switch related things in schedule().
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch decouple some scheduling logic and abstract into a scheduler.
Then we have scheduler, schedule framework. From modulization
perspective, schedule framework provides some APIs for other layers to
use, also interact with scheduler through scheduler interaces.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
To get pcpu_id from sched_control quickly and easier.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Let init thread end with run_idle_thread(), then idle thread take over and
start to do scheduling.
Change enter_guest_mode() to init_guest_mode() as run_idle_thread() is removed
out of it. Also add run_thread() in schedule module to run
thread_object's thread loop directly.
rename: switch_to_idle -> run_idle_thread
Tracked-On: #3813
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
sleep one thread_object means to prevent it from being scheduled.
wake one thread_object is an opposite operation of sleep.
This patch also add notify_mode in thread_object to indicate how to
deliver the request.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now, we have three valid status for thread_object:
THREAD_STS_RUNNING,
THREAD_STS_RUNNABLE,
THREAD_STS_BLOCKED.
This patch also provide several helpers to check the thread's status and
a status set wrapper function.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
schedule infrastructure is per pcpu, so move its initialization to each
pcpu's initialization.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
With cpu sharing enabled, per_cpu vcpu cannot work properly as we might
has multiple vcpus running on one pcpu.
Add a schedule API sched_get_current to get current thread_object on
specific pcpu, also add a vcpu API get_running_vcpu to get corresponding
vcpu of the thread_object.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
With cpu sharing enabled, we will map acrn_vcpu to thread_object
in scheduling. From modulization perspective, we'd better hide the
pcpu_id in acrn_vcpu and move it to thread_object.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vcpu thread's stack shouldn't follow reset_vcpu to reset.
There is also a bug here:
while vcpu B thread set vcpu->running to false, other vcpu A thread
will treat the vcpu B is paused while it has not been switch out
completely, then reset_vcpu will reset the vcpu B thread's stack and
corrupt its running context.
This patch will remove the vcpu thread's stack reset from reset_vcpu.
With the change, we need do init_vmcs between vcpu startup address be
settled and scheduled in. And switch_to_idle() is not needed anymore
as S3 thread's stack will not be reset.
Tracked-On: #3813
Signed-off-by: Fengwei Yin <fengwei.yin@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Currently we are using a 1:1 mapping logic for pcpu:vcpu. So don't need
a runqueue for it. Removing it as preparation work to abstract scheduler
framework.
Tracked-On: #3813
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
--remove unnecessary includes
--remove unnecssary forward-declaration for 'struct vhm_request'
Tracked-On: #861
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
this patch is to fix error debug message
for invalid 'param' case, there is no string
variable for '%s' output, which will potenially
trigger hypervisor crash as it may access random
memroy address and trigger SMAP violation.
Tracked-On: #3801
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
As we introduced vcpu_affinity[] to assign vcpus to different pcpus, the
old policy and functions are not needed. Remove them.
Tracked-On: #3663
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
There is plan that define each VM configuration statically in HV and let
DM just do VM creating and destroying. So DM need get vcpu_num
information when VM creating.
This patch return the vcpu_num via the API param. And also initial the
VMs' cpu_num for existing scenarios.
Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now, we create vcpus while VM being created in hypervisor. The
create vcpu hypercall will not be used any more. For compatbility,
keep the hypercall HC_CREATE_VCPU do nothing.
v4: Don't remove HC_CREATE_VCPU hypercall, let it do nothing.
Tracked-On: #3663
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
No need to memset since it will overwrite the memory
by copy_from_gpa in some cases.
Tracked-On: #861
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Using per_cpu list to record ptdev interrupts is more reasonable than
recording them per-vm. It makes dispatching such interrupts more easier
as we now do it in softirq which happens following interrupt context of
each pcpu.
Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
From modulization perspective, it's not suitable to put pcpu and vm
related request operations in schedule. So move them to pcpu and vm
module respectively. Also change need_offline return value to bool.
Tracked-On: #3663
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
Signed-off-by: Yu Wang <yu1.wang@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
-- move 'RFLAGS_AC' to cpu.h
-- move 'VMX_SUPPORT_UNRESTRICTED_GUEST' to msr.h
and rename it to 'MSR_IA32_MISC_UNRESTRICTED_GUEST'
-- move 'get_vcpu_mode' to vcpu.h
-- remove deadcode 'vmx_eoi_exit()'
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now, we use native gdt saved in boot context for guest and assume
it could be put to same address of guest. But it may not be true
after the pre-launched VM is introduced. The gdt for guest could
be overwritten by guest images.
This patch make 32bit protect mode boot not use saved boot context.
Insteadly, we use predefined vcpu_regs value for protect guest to
initialize the guest bsp registers and copy pre-defined gdt table
to a safe place of guest memory to avoid gdt table overwritten by
guest images.
Tracked-On: #3532
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Create an iommu domain for all guest in vpci_init no matter if there's a PTDev
in it.
Tracked-On: #3475
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Dongsheng Zhang <dongsheng.x.zhang@intel.com>
To avoid acrn_handle_pending_request called twice within one vmexit,
we remove the error-prone "continue" in vcpu_thread.
And make vcpu shecheduled out if fatal error happens with vcpu.
Tracked-On: #3387
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
softirq shouldn't be bounded to vcpu thread. One issue for this
is shell (based on timer) can't work if we don't start any guest.
This change also is trying best to make softirq handler running
with irq enabled.
Also update the irq disable/enabel in vmexit handler to align
with the usage in vcpu_thread.
Tracked-On: #3387
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
ACRN Coding guidelines requires type conversion shall be explicity. However,
there's no need for this case since we could return bool directly.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now sched_object and sched_context are protected by scheduler_lock. There's no
chance to use runqueue_lock to protect schedule runqueue if we have no plan to
support schedule migration.
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Schedule context flag may set out of schedule lock protection, like NEED_OFFLINE
or NEED_SHUTDOWN_VM. So NEED_RESCHEDULE flag should be set atomically too.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
In current design, devicemodel passes VM UUID to create VMs and hypervisor
would check the UUID whether it is matched with the one in VM configurations.
Kata container would maintain few UUIDs to let ACRN launch the VM, so
hypervisor need to add these UUIDs in VM configurations for Kata running.
In the hypercall of hcall_get_platform_info(), hypervisor will report the
maximum Kata container number it will support. The patch will add a Kconfig
to indicate the maximum Kata container number that SOS could support.
In current stage, only one Kata container is supported by SOS on SDC scenario
so add one UUID for Kata container in SDC VM configuration. If we want to
support Kata on other scenarios in the future, we could follow the example
of this patch;
Tracked-On: #3402
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ACRN Coding guidelines requires two different types pointer can't
convert to each other, except void *.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
One cycle of vmexit/vmentry might lost interrupts.
This is the scenario,
1) vmexit, vmexit_handlers
2) softirq & disable interrupt
3) acrn_handle_pending_request
4) schedule if needed, then back to 1) and loop again.
5) vmentry
The step 3) might be executed twice. The problem is at the second
execution of acrn_handle_pending_request, we might overwrite
VMX_ENTRY_INT_INFO_FIELD of current vmcs, which cause guest lost
interrupts.
The fix is moving 4) prior to 3), then we will handle the pending
requests and vmentry directly.
Tracked-On: #3374
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Since spinlock ptdev_lock is used to protect ptdev entry, there's no need to
use atomic operation to protect ptdev active flag in split of it's wrong used
to protect ptdev entry. And refine active flag data type to bool.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
In hcall_inject_msi, we check vlapic state of SOS by mistake.
If the SOS's vlapic state doesn't equal to target_vm's, the OVMF will
hang when boot up. Instead, we should check the target_vm's
vlapic state.
Tracked-On: #3069
Acked-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
This patch introduces check_vm_vlapic_state API instead of is_lapic_pt_enabled
to check if all the vCPUs of a VM are using x2APIC mode and LAPIC
pass-through is enabled on all of them.
When the VM is in VM_VLAPIC_TRANSITION or VM_VLAPIC_DISABLED state,
following conditions apply.
1) For pass-thru MSI interrupts, interrupt source is not programmed.
2) For DM emulated device MSI interrupts, interrupt is not delivered.
3) For IPIs, it will work only if the sender and destination are both in x2APIC mode.
Tracked-On: #3253
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
V1:Initial Patch
Modularize vpic. The current patch reduces the usage
of acrn_vm inside the vpic.c file.
Due to the global natire of register_pio_handler, where
acrn_vm is being passed, some usage remains.
These needs to be a separate "interface" file.
That will come in smaller newer patch provided
this patch is accepted.
V2:
Incorporated comments from Jason.
V3:
Fixed some MISRA-C Violations.
Tracked-On: #1842
Signed-off-by: Arindam Roy <arindam.roy@intel.com>
Reviewed-by: Xu, Anthony <anthony.xu@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
On SDC scenario, SOS VM id is fixed to 0 so some hypercalls from guest
are using hardcoded "0" to represent SOS VM, this would bring issues
for HYBRID scenario which SOS VM id is non-zero.
Now introducing a new VM id concept for DM/VHM hypercall APIs, that
return a relative VM id which is from SOS view when create VM for post-
launched VMs. DM/VHM could always treat their own vm id is "0". When they
make hypercalls, hypervisor will convert the VM id to the absolute id
when dispatch the hypercalls.
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Changes:
- In current design, the hypercall is only allowed calling from SOS or
trusty VM, so separate the trusty hypercalls from dispatch_hypercall().
The vm parameter which referenced by hcall_xxx() should be SOS VM;
- do not inject #UD for hypercalls from non-SOS, just return -ENODEV;
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Zephyr kernel is stripped ram image, its entry and load address
are explicitly defined in vm configurations, hypervisor will load
Zephyr directly based on these configurations.
Currently we only support boot Zephyr from protected mode.
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Previously multiboot mods[0] is designed for kernel module for all
pre-launched VMs including SOS VM, and mods[0].mm_string is used
to store kernel cmdline. This design could not satisfy the requirement
of hybrid mode scenarios that each VM might use their own kernel image
also ramdisk image. To resolve this problem, we will use a tag in
mods mm_string field to specify the module type. If the tag could
be matched with os_config of VM configurations, the corresponding
module would be loaded;
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Different kernel has different load method, it should be configurable
in vm configurations;
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Currently the algorithm of direct_boot_sw_loader() is Linux bzImage specific,
so separate the bzImage specific loader code from it to make the api more
generic for other OS;
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
The guest OS of ACRN will not be limited to Linux, so refine the struct
of sw_linux to more generic sw_module_info. Currently bootargs and ramdisk
are only supported modules but we can include more modules in future;
Tracked-On: #3214
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
ACRN supports LAPIC emulation for guests using x86 APICv. When guest OS/BIOS
switches from xAPIC to x2APIC mode of operation, ACRN also supports switching
froom LAPIC emulation to LAPIC passthrough to guest. User/developer needs to
configure GUEST_FLAG_LAPIC_PASSTHROUGH for guest_flags in the corresponding
VM's config for ACRN to enable LAPIC passthrough.
This patch does the following
1)Fixes a bug in the abovementioned feature. For a guest that is
configured with GUEST_FLAG_LAPIC_PASSTHROUGH, during the time period guest is
using xAPIC mode of LAPIC, virtual interrupts are not delivered. This can be
manifested as guest hang when it does not receive virtual timer interrupts.
2)ACRN exposes physical topology via CPUID leaf 0xb to LAPIC PT VMs. This patch
removes that condition and exposes virtual topology via CPUID leaf 0xb.
Tracked-On: #3136
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
for ctx->flags is protected by scheduler lock, so not need
to set lock again.
Tracked-On: #3130
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
now only SOS need decide boot with de-privilege or direct boot mode, while
for other pre-launched VMs, they should use direct boot mode.
this patch merge boot/guest/direct_boot_info.c &
boot/guest/deprivilege_boot_info.c into boot/guest/vboot_info.c,
and change init_direct_vboot_info() function name to init_general_vm_boot_info().
in init_vm_boot_info(), depend on get_sos_boot_mode(), SOS may choose to init
vm boot info by setting the vm_sw_loader to deprivilege specific one; for SOS
using DIRECT_BOOT_MODE and all other VMS, they will use general_sw_loader as
vm_sw_loader and go through init_general_vm_boot_info() for virtual boot vm
info filling.
this patch also move spurious handler initilization for de-privilege mode from
boot/guest/deprivilege_boot.c to boot/guest/vboot_info.c, and just set it in
deprivilege sw_loader before irq enabling.
Changes to be committed:
modified: Makefile
modified: arch/x86/guest/vm.c
modified: boot/guest/deprivilege_boot.c
deleted: boot/guest/deprivilege_boot_info.c
modified: boot/guest/direct_boot.c
renamed: boot/guest/direct_boot_info.c -> boot/guest/vboot_info.c
modified: boot/guest/vboot_wrapper.c
modified: boot/include/guest/deprivilege_boot.h
modified: boot/include/guest/direct_boot.h
modified: boot/include/guest/vboot.h
new file: boot/include/guest/vboot_info.h
modified: common/vm_load.c
modified: include/arch/x86/guest/vm.h
Tracked-On: #1842
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>
The definition of CONFIG_REMAIN_1G_PAGES has been removed by patch (4627cd4d HV: build: drop useless files).
So, remove the whole CONFIG_REMAIN_1G_PAGES in repo.
---
v1 -> v2:
Address Fei's comment about changing reserving_1g_pages from int32_t to int64_t
Address Jason's comment about adding MACRO NUM_REMAIN_1G_PAGES
Tracked-On: #3128
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Li, Fei1 <fei1.li@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
For pre-launched VMs and SOS, VM shutdown should not be executed in the
current VM context.
- implement NEED_SHUTDOWN_VM request so that the BSP of the target VM can shut
down the guest in idle thread.
- implement shutdown_vm_from_idle() to shut down target VM.
Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
now there are 2 static_checks.c,
./arch/x86/static_checks.c
./common/static_checks.c
they are used for static checks when build time,
this check should not belong to any HV layer from
modularization view, then add and move it to pre_build folder.
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
modified: Makefile
deleted: common/static_checks.c
renamed: arch/x86/static_checks.c -> pre_build/static_checks.c
Replace the vm state VM_STATE_INVALID to VM_POWERED_OFF.
Also replace is_valid_vm() with is_poweroff_vm().
Add API is_created_vm() to identify VM created state.
Tracked-On: #3082
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
IRTE is freed if ptirq entry is released from remove_msix_remapping() or
remove_intx_remapping(). But if it's called from ptdev_release_all_entries(),
e.g. SOS shutdown/reboot, IRTE is not freed.
This patch adds a release_cb() callback function to do any architectural
specific cleanup. In x86, it's used to release IRTE.
On VM shutdown, vpci_cleanup() needs to remove MSI/MSI-X remapping on
ptirq entries, thus it should be called before ptdev_release_all_entries().
Tracked-On: #2700
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Shared buffer is allocated by VM and is protected by SMAP.
Accessing to shared buffer between stac/clac pair will invalidate
SMAP protection.This patch is to remove these cases.
Fix minor stac/clac mis-usage,and add comments as stac/clac usage BKM
Tracked-On: #2526
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ACRN cleans up the IOMMU domain and other data structures that
represents the state of device assigment to POST_LAUNCHED_VM. This is
with the help of hypercalls from SOS DM. Under scenarios where DM
execution can get terminated abruptly or due to bugs in DM, hypercalls
responsible for cleaning up ACRN cannot happen. This leaves ACRN
device representation/resource assignment in an incorrect state.
This patch cleans up the IOMMU resource and other data structures
upon shutdown of POST_LAUNCHED_VM.
Tracked-On: #2700
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The name NORMAL_VM does not clearly reflect the attribute that these VMs
are launched "later". POST_LAUNCHED_VM is closer to the fact
that these VMs are launched "later" by one of the VMs launched by ACRN.
Tracked-On: #3034
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
There are 2 reasons why SOS ve820 has to be separated from host e820:
- in hybrid mode, SOS may not own all the host memory.
- when SOS is being re-launched, it needs an untainted version of host
e820 to create SOS ve820.
This patch creates sos_e820 table for SOS and keeps host e820 intact
during SOS creation.
Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
According to SDM 10.6.1, if dest fields is 0xffU for register
ICR operation, that means the IPI is for broadcast.
According to SDM 10.12.9, 0xffffffffU of dest fields for x2APIC
means IPI is for broadcast.
We add new parameter to vlapic_calc_dest() to show whether the
dest is for broadcast. For IPI, we will set it according to
dest fields. For ioapic and MSI, we hardcode it to false because
no broadcast for ioapic and MSI.
Tracked-On: #3003
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch adds prefix 'p' before 'cpu' to physical cpu related functions.
And there is no code logic change.
Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently, the previous configurations about guest_flags set by DM will
not be cleared when shutdown the vm. Then it might bring issue for the
next dm-launched vm.
For example, if we create one vm with LAPIC_PASSTHROUGH flag and shutdown it.
Then the next dm-launched vm will has the LAPIC_PASSTHROUGH flag set no matter
whether we set it in DM.
This patch clears all the DM set flags when shtudown vm.
Tracked-On: #2991
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
- The target vm in most of hypercalls should be a NORMAL_VM, in some
exceptions it might be a SOS_VM, we should validate them.
- Please be aware that some hypercall might have limitation on specific
target vm like RT or SAFETY VMs, this leaves "TODO" in future;
- Unify the coding style:
int32_t hcall_foo(vm, target_vm_id, ...)
{
int32_t ret = -1;
...
if ((is_valid_vm(target_vm) && is_normal_vm(target_vm)) {
ret = ....
}
return ret;
}
Tracked-On: #2978
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The function vpci_set_ptdev_intr_info() is not compiled for
CONFIG_SHARING_MODE only any more so the #ifndef here is useless.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently ptdev_release_all_entries() calls ptirq_release_entry() to
release ptdev IRQs, which is not sufficient because free_irq() is not
included in ptirq_release_entry().
Furthermore, function ptirq_deactivate_entry() and ptirq_release_entry()
have lots of overlaps which make the differences between them are not
clear.
This patch does:
- Adds ptirq_deactivate_entry() call in ptdev_release_all_entries() if the
irq has not already freed.
- Remove unnecessary code from ptirq_deactivate_entry() to make it do the
exact opposite to function ptirq_activate_entry(), except that
entry->allocated_pirq is not reset, which is not necessary.
- Added the missing del_timer() to ptirq_release_entry() to make it almost
opposite to ptirq_alloc_entry(); Added memset() to clear the whole entry,
which doesn't have impacts to the functionalities but keep the data structure
clean.
Tracked-On: #2700
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
If launch two UOS with same UUID by acrn-dm, current code path will
return same VM instance to the acrn-dm, this will crash the two UOS.
Check VM state and make sure it's in VM_STATE_INVALID state before
creating a VM.
Tracked-On: #2984
Signed-off-by: Cai Yulong <yulongc@hwtc.com.cn>
The previous will not check the EPT gpa correctly. This patch try to fix this.
Tracked-On: #2291
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
In the presence of SOS, ACRN uses fallback_iommu_domain which is the same
used by SOS, to assign domain to devices during ACRN init. Also it uses
fallback_iommu_domain when DM requests ACRN to remove device from UOS domain.
This patch changes the design of assign/remove_iommu_device to avoid the
concept of fallback_iommu_domain and its setup. This way ACRN can commonly
treat pre-launched VMs bringup w.r.t. IOMMU domain creation.
Tracked-On: #2965
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
add new hypercall get platform information,
such as physical CPU number.
Tracked-On: #2538
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently VM id of NORMAL_VM is allocated dymatically, we need to make
VM id statically for FuSa compliance.
This patch will pre-configure UUID for all VMs, then NORMAL_VM could
get its VM id/configuration from vm_configs array by indexing the UUID.
If UUID collisions is found in vm configs array, HV will refuse to
load the VM;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The code mixed the usage on term of UUID and GUID, now use UUID to make
code more consistent, also will use lowercase (i.e. uuid) in variable name
definition.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now the io_emul.c is relates with arch,io_req.c is common,
move some APIs from io_emul.c to io_req.c as common like these APIs:
register_pio/mmio_emulation_handler
dm_emulate_pio/mmio_complete
pio_default_read/write
mmio_default_access_handler
hv_emulate_pio/mmio etc
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
-- this api is related with arch_x86, then move to x86 folder
-- rename 'set_vhm_vector' to 'set_vhm_notification_vector'
-- rename 'acrn_vhm_vector' to 'acrn_vhm_notification_vector'
-- add an API 'get_vhm_notification_vector'
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
'vector_cnt' is vector count of MSI/MSIX, and it come from 'table_count'
in the the pci_msix struct. 'CONFIG_MAX_MSIX_TABLE_NUM' is the maximum
number of MSI-X tables per device, so 'vector_cnt' must be less then
or equal to 'CONFIG_MAX_MSIX_TABLE_NUM'.
Tracked-On: #2881
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
After using get_vm_from_vmid(), vm pointer is always not NULL. But there are still many NULL pointer checks.
This commit replaced the NULL vm pointer check with a validation check which checks the vm status.
In addition, NULL check for pointer returned by get_sos_vm() and get_vm_config() is removed.
Tracked-On: #2520
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch makes make_reschedule_request support for kicking
off vCPU using INIT.
Tracked-On: #2865
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch checks if the GUEST_FLAG_RT is set when GUEST_FLAG_LAPIC_PASSTHROUGH is set.
If GUEST_FLAG_RT is not set while GUEST_FLAG_LAPIC_PASSTHROUGH is set, we will refuse
to boot the VM.
Meanwhile, this patch introduces a new API is_rt_vm.
Tracked-On: #2865
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In hypervisor fuzzing test, hypervisor will hang
if issuing HV_VM_SET_MEMORY_REGIONS hypercall after
target VM is destroyed.
this patch is to fix above vulnerability.
Tracked-On: #2849
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
- for all cases of referring guest bootargs size, replace MEM_2K with
CONFIG_MAX_BOOTARGS_SIZE for better readability.
- remove duplicated MAX_BOOTARGS_SIZE definition from vm_config.h.
Also fix one minor issue in general_sw_loader() which uses copy_to_gpa()
to copy a string. Since copy_to_gpa() makes use of memncpy_s() to do the
job, the size parameter should include the string null ternimator.
Tracked-On: #2806
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The init page tables installed in either cpu_primary.S or trampoline.S
are 1:1 mapping and won't be changed in the future.
The 'actual' hypervisor page table installed in enable_paging() is 1:1
mapping currently but it could be changed in the future. Both hva2hpa() and
hpa2hva() are implemented based on these page tables and can't be used
when the init page tables take effect.
This patch does the following cleanup:
- remove all hva2hpa()/hpa2hva() before calling enable_paging()
- get_hv_image_base() returns HVA, not HPA. So add hva2hpa() for all cases
that are called afte enable_paging().
Tracked-On: #2700
Signed-off-by: Zide Chen <zide.chen@intel.com>
Acked-by: Eddie Dong <Eddie.dong@intel.com>
to validate the ID and state of vCPU in below functions:
- hcall_set_vcpu_regs()
- hcall_notify_ioreq_finish()
- shell_vcpu_dumpreq()
Tracked-On: #861
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch simplifies `get_primary_vcpu` and `vcpu_from_vid`.
The target_vcpu could be get from the index directly.
Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vcpu returned by get_primary_vcpu API is BSP vcpu of the VM. So
checking is_vcpu_bsp on vcpu is redundant.
Tracked-On: #2668
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
it removes hypervisor.h and just includes needed header files.
Tracked-On: #1842
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This issue is triggered when launch fuzzing test.
Fuzzing test thread will call destroy_vm(IC_DESTROY_VM)
to set the guest vCPU state to VCPU_ZOMBIE then VCPU_INIT
and then VCPU_OFFLINE, it will cause post-work can't resume
the guest vCPU and can't changes the state of the
corresponding I/O request slot to REQ_STATE_FREE.
so replace improper use of ASSERT with return error code.
Tracked-On: #2606
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch is a modified one. It removes the usage
of acrn_vm struct from inside vtd.c.
It also puts struct iommu_domain inside vtd.h,
from vtd.c.
It modifies the signature of init_iommu_domain
in order to remove dependency on acrn_vm from
inside vtd.c.
Incorporated comments from Jason and Eddie.
Changed the name of sos_vm_domain to
fallback_iommu_domain
Removed any reference of sos_vm from vtd.[c|h]
files, including comments.
Tracked-On: #2496
Signed-off-by: Arindam Roy <arindam.roy@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
- add e820 info in struct acrn_vm;
- rename rebuild_sos_vm_e820() to create_sos_vm_e820();
- add create_prelaunched_vm_e820() for partition mode;
- rename create_e820_table() to create_zeropage_e820() and merge for
both sharing mode and partition mode;
- move create_xxx_vm_e820() to vm.c;
- move create_zeropage_e820() to vm_load.c;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Rename vlapic_deliver_intr to vlapic_receive_intr: ioapic/msi device
deliver an interrupt to lapic.
Rename vlapic_pending_intr to vlapic_find_deliverable_intr: find a
deliverable interrupt which pending in irr and its priority large than ppr.
Rename vlapic_intr_accepted to vlapic_get_deliverable_intr: get the deliverable
interrupt from irr and set it in isr (which also raise ppr update)
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Panic should only be used when system booting. Once the system boot done,
it could never be used. While ASSERT could be used in some situations, such
as, there are some pre-assumption for some code, using ASSERT here for debug.
Tracked-On: #861
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@inte.com>
move 'struct e820_entry' 'E820_TYPE_XXX' from mmu.h
to e820.h
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For VM with local apic pt for realtime scenatios, we support virtio device with PMD backend.
But we still need to inject MSI to notify the front-end, to avoid changing the front-end drivers.
Since the lapic is passed through, irq injection to vlapic won't work.
This commit fix it by sending IPI with vector need to inject.
Tracked-On: #2351
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
This commit extend lapic pass-through for DM launched VM, generally for hard RT scenarios.
Similar to the partition mode, the vlapic is working under the xapic mode at first, only
when x2apic mode is enabled, lapic is passed through, because the physical LAPICs are
under x2apic mode.
Main changes includes:
- add is_lapic_pt() to check if a vm is created with lapic pt or not, to combine
codes of partition mode and DM launched vm with lapic passthrough, including:
- reuse the irq delievery function and rename it to dispatch_interrupt_lapic_pt();
- reuse switch_apicv_mode_x2apic();
- reuse ICR handling codes to avoid malicious IPI;
- intercept ICR/APICID/LDR msr access when lapic_pt;
- for vm with lapic passthrough, irq is always disabled under root mode.
Tracked-On: #2351
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
after compile, the compiled code could change rsp, so use pure asm code
to avoid such problem which will cause schedule switch failure.
Tracked-On: #2410
Signed-off-by: Jason Chen CJ <jason.cj.chen@intel.com>