The cupid() can be replaced with cupid_subleaf, which is more clear.
Having both APIs makes reading difficult.
Tracked-On: #4526
Signed-off-by: Li Fei1 <fei1.li@intel.com>
For SOS VM, when the target platform has multiple IO-APICs, there
should be equal number of virtual IO-APICs.
This patch adds support for emulating multiple vIOAPICs per VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
MADT is used to specify the GSI base for each IO-APIC and the number of
interrupt pins per IO-APIC is programmed into Max. Redir. Entry register of
that IO-APIC.
On platforms with multiple IO-APICs, there can be holes in the GSI space.
For example, on a platform with 2 IO-APICs, the following configuration has
a hole (from 24 to 31) in the GSI space.
IO-APIC 1: GSI base - 0, number of pins - 24
IO-APIC 2: GSI base - 32, number of pins - 8
This patch also adjusts the size for variables used to represent the total
number of IO-APICs on the system from uint16_t to uint8_t as the ACPI MADT
uses only 8-bits to indicate the unique IO-APIC IDs.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
As ACRN prepares to support platforms with multiple IO-APICs,
GSI is a better way to represent physical and virtual INTx interrupt
source.
1) This patch replaces usage of "pin" with "gsi" whereever applicable
across the modules.
2) PIC pin to gsi is trickier and needs to consider the usage of
"Interrupt Source Override" structure in ACPI for the corresponding VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Reverts 538ba08c: hv:Add vpin to ptdev entry mapping for vpic/vioapic
ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin
and an array of size per VM to store ptirq entries against the vPIC pin.
This is done to speed up "ptirq entry" lookup at runtime for Level triggered
interrupts in API ptirq_intx_ack used on EOI.
This patch switches the lookup API for INTx interrupts to the API,
ptirq_lookup_entry_by_sid
This could add delay to processing EOI for Level triggered interrupts.
Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes
per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for
SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around
10 such arrays per VM.
Removes the need of "pic_pin" except for the APIs facing the hypercalls
hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
There're some cases the SOS (higher severity guest) needs to access the
post-launched VM (lower severity guest) PCI CFG space:
1. The SR-IOV PF needs to reset the VF
2. Some pass through device still need DM to handle some quirk.
In the case a device is assigned to a UOS and is not in a zombie state, the SOS
is able to access, if and only if the SOS has higher severity than the UOS.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
For a pre-launched VM, a region from PTDEV_HI_MMIO_START is used to store
64bit vBARs of PT devices which address is high than 4G. The region should
be located after all user memory space and be coverd by guest EPT address.
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ve820.c is a common file in arch/x86/guest/ now, so move function of
create_sos_vm_e820() to this file to make code structure clear;
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
hypervisor/arch/x86/configs/$(BOARD)/ve820.c is used to store pre-launched
VM specific e820 entries according to memory configuration of customer.
It should be a scenario based configurations but we had to put it in per
board foler because of different board memory settings. This brings concerns
to customer on configuration orgnization.
Currently the file provides same e820 layout for all pre-launched VMs, but
they should have different e820 when their memory are configured differently.
Although we have acrn-config tool to generate ve802.c automatically, it
is not friendly to modify hardcoded ve820 layout manually, so the patch
changes the entries initialization method by calculating each entry item
in C code.
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently ept_pages_info[] is initialized with first element only that force
VM of id 0 using SOS EPT pages. This is incorrect for logical partition and
hybrid scenario. Considering SOS_RAM_SIZE and UOS_RAM_SIZE are configured
separately, we should use different ept pages accordingly.
So, the PRE_VM_NUM/SOS_VM_NUM and MAX_POST_VM_NUM macros are introduced to
resolve this issue. The macros would be generated by acrn-config tool when
user configure ACRN for their specific scenario.
One more thing, that when UOS_RAM_SIZE is less then 2GB, the EPT address
range should be (4G + PLATFORM_HI_MMIO_SIZE).
Tracked-On: #4458
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
VM needs to check if it owns this device before deiniting it.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can
be IOAPIC or PIC
2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt
controller
2. Changes the type of src member of source_id.intx_id from uint32_t to
enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC
Tracked-On: #4447
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
This is to enable relocation for code32.
- RIP relative addressing is available in x86-64 only so we manually add
relocation delta to the target symbols to fixup code32.
- both code32 and code64 need to load GDT hence both need to fixup GDT
pointer. This patch declares separate GDT pointer cpu_primary64_gdt_ptr
for code64 to avoid double fixup.
- manually fixup cpu_primary64_gdt_ptr in code64, but not rely on relocate()
to do that. Otherwise it's very confusing that symbols from same file could
be fixed up externally by relocate() or self-relocated.
- to make it clear, define a new symbol ld_entry_end representing the end of
the boot code that needs manually fixup, and use this symbol in relocate()
to filter out all symbols belong to the entry sections.
Tracked-On: #4441
Reviewed-by: Fengwei Yin <fengwei.yin@intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
This patch adds RDT MBA support to detect, configure and
and setup MBA throttle registers based on VM configuration.
Tracked-On: #3725
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The init_one_dev_config is used to initialize a acrn_vm_pci_dev_config
SRIOV needs a explicit acrn_vm_pci_dev_config to create a VF vdev,so
refine it to return acrn_vm_pci_dev_config.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Current code avoid the rule 88 S in MISRA-C, so move xsaves and xrstors
assembler to individual functions.
Tracked-On: #4436
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Make the SRIOV-Capable device invisible from SOS if there is
no room for its all virtual functions.
v2: fix a issue that if a PF has been dropped, the subsequent PF
will be dropped too even there is room for its VFs.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The init value for XCR0 and XSS should be the same with spec:
In SDM Vol1 13.3:
XCR0[0] is associated with x87 state (see Section 13.5.1). XCR0[0] is
always 1. The other bits in XCR0 are all 0 coming out of RESET.
The IA32_XSS MSR (with MSR index DA0H) is zero coming out of RESET.
The previous code try to fix the xsave area leak to other VMs during init
phase, but bring the error to linux. Besides, it cannot avoid the
possible leak in running phase. Need find a better solution.
Tracked-On: #4430
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
is not set
This patch does the following,
1. Removes RDT code if CONFIG_RDT_ENABLED flag is
not set.
2. Set the CONFIG_RDT_ENABLED flag only on platforms
that support RDT so that build scripts will automatically
reflect the config.
Tracked-On: #3715
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
cache configuration.
This patch creates a generic infrastructure for
RDT resources instead of just L2 or L3 cache. This
patch also fixes L3 CAT config overwrite by L2 in
cases where both L2 and L3 CAT are supported.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
There can be times when user unknowinlgy enables
CONFIG_CAT_ENBALED SW flag, but the hardware might
not support L3 or L2 CAT. In such case software can
end up writing to the CAT MSRs which can cause
undefined results. The patch fixes the issue by
enabling CAT only when both HW as well software
via the CONFIG_CAT_ENABLED supports CAT.
The patch also address typo with "clos2prq_msr"
function name. It should be "clos2pqr_msr" instead.
PQR stands for platform qos register.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Upcoming intel platforms can support both L2 and L3
but our current code only supports either L2 or L3 CAT.
So split the MSRs so that we can support allocation
for both L2 and L3.
This patch does the following,
1. splits programming of L2 and L3 cache resource
based on the resource ID.
2. Replace generic platform_clos_array struct with resource
specific struct in all the existing board.c files.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
As part of rdt cat refactoring, goal is to combine all rdt
specific features such as CAT under one module. So renaming
rdt resouce specific files such as cat.c/.h to generic rdt.c/.h
files.
Tracked-On: #3715
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Initialize efi info of acrn mbi when boot from multiboot2 protocol, with
this patch hypervisor could get host efi info and pass it to Linux zeropage,
then make guest Linux possible to boot with efi environment;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The patch re-arch boot component header files by:
- moving multiboot.h from include/arch/x86/ to boot/include/ and keep
this header for multiboot1 protocol data struct only;
- moving multiboot related MACROs in cpu_primary.S to multiboot.h;
- creating an independent boot.h to store acrn specific boot information
for other files' reference;
Tracked-On: #4419
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
BVT (Borrowed virtual time) scheduler is used to schedule vCPUs on pCPU.
It has the concept of virtual time, vCPU with earliset virtual time is
dispatched first.
Main concepts:
tick timer:
a period tick is used to measure the physcial time in units of MCU
(minimum charing unit).
runqueue:
thread in the runqueue is ordered by virtual time.
weight:
each thread receives a share of the pCPU in proportion to its
weight.
context switch allowance:
the physcial time by which the current thread is allowed to advance
beyond the next runnable thread.
warp:
a thread with warp enabled will have a change to minus a value (Wi)
from virtual time to achieve higher priority.
virtual time:
AVT: actual virtual time, advance in proportional to weight.
EVT: effective virtual time.
EVT <- AVT - ( warp ? Wi : 0 )
SVT: scheduler virtual time, the minimum AVT in the runqueue.
Tracked-On: #4410
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1. Rename BOOT_CPU_ID to BSP_CPU_ID
2. Repace hardcoded value with BSP_CPU_ID when
ID of BSP is referenced.
Tracked-On: #4420
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Now we split passthrough PCI device from DM to HV, we could remove all the passthrough
PCI device unused code.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
To enable gvt-d,need to allow the GPU IOMMU.
While gvt-d hasn't been enabled on APL yet,
so let APL disable GPU IOMMU.
v2 -> v3:
* let APL platforms disable GPU IOMMU.
Tracked-On: #4405
Signed-off-by: Junming Liu <junming.liu@intel.com>
Reviewed-by: Wu Binbin <binbin.wu@intel.com>
In platforms that support CAT, when it is enabled by ACRN, i.e.
IA32_resourceType_MASK_n registers are programmed with customized values,
it has impacts to the whole system.
The per guest flag GUEST_FLAG_CLOS_REQUIRED suggests that CAT may be
enabled in some guests, but not in others who don't have this flag,
which is conceptually incorrect.
This patch removes GUEST_FLAG_CLOS_REQUIRED, and adds a new Kconfig
entry CAT_ENABLED for CAT enabling. When it's enabled, platform_clos_array[]
defines a set of system-wide Class of Service (COS, or CLOS), and the
per guest vm_configs[].clos associates the guest with particular CLOS.
Tracked-On: #2462
Signed-off-by: Zide Chen <zide.chen@intel.com>
1. Align the coding style for these MACROs
2. Align the values of fixed VECTORs
Tracked-On: #4348
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
is_polling_ioreq is more straightforward. Rename it.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Sometimes HV wants to know if there are pending interrupts of one vcpu.
Add .has_pending_intr interface in acrn_apicv_ops and return the pending
interrupts status by check IRRs of apicv.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Introduce two kinds of events for each vcpu,
VCPU_EVENT_IOREQ: for vcpu waiting for IO request completion
VCPU_EVENT_VIRTUAL_INTERRUPT: for vcpu waiting for virtual interrupts events
vcpu can wait for such events, and resume to run when the
event get signalled.
This patch also change IO request waiting/notifying to this way.
Tracked-On: #4329
Signed-off-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Since we restore BAR values when writing Command Register if necessary. We don't
need to trap FLR and do the BAR restore then.
Tracked-On: #3475
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Per SDM 10.12.5.1 vol.3, local APIC should keep LAPIC state after receiving
INIT. The local APIC ID register should also be preserved.
Tracked-On: #4267
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The patch abstract a vcpu_reset_internal() api for internal usage, the
function would not touch any vcpu state transition and just do vcpu reset
processing. It will be called by create_vcpu() and reset_vcpu().
The reset_vcpu() will act as a public api and should be called
only when vcpu receive INIT or vm reset/resume from S3. It should not be
called when do shutdown_vm() or hcall_sos_offline_cpu(), so the patch remove
reset_vcpu() in shutdown_vm() and hcall_sos_offline_cpu().
The patch also introduced reset_mode enum so that vcpu and vlapic could do
different context operation according to different reset mode;
Tracked-On: #4267
Signed-off-by: Victor Sun <victor.sun@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Some MACROs in lapic.h are duplicated with apicreg.h, and some MACROs are
never referenced, remove them.
Tracked-On: #4268
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add severity definitions for different scenarios. The static
guest severity is defined according to guest configurations.
Also add sanity check to make sure the severity for all guests
are correct.
Tracked-On: #4270
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
For system S5, ACRN had assumption that SOS shutdown will trigger
system shutdown. So the system shutdown logical is:
1. Trap SOS shutdown
2. Wait for all other guest shutdown
3. Shutdown system
The new logical is refined as:
If all guest is shutdown, shutdown whole system
Tracked-On: #4270
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
ACRN hypervisor should trap guest doing PCIe FLR. Besides, it should save some status
before doing the FLR and restore them later, only BARs values for now.
This patch will trap guest Device Capabilities Register write operation if the device
supports PCI Express Capability and check whether it wants to do device FLR. If it does,
call pdev_do_flr to do the job.
Tracked-On: #3465
Signed-off-by: Li Fei1 <fei1.li@intel.com>
We don't use INIT signal notification method now. This patch
removes them.
Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
There is a window where we may miss the current request in the
notification period when the work flow is as the following:
CPUx + + CPUr
| |
| +--+
| | | Handle pending req
| <--+
+--+ |
| | Set req flag |
<--+ |
+------------------>---+
| Send NMI | | Handle NMI
| <--+
| |
| |
| +--> vCPU enter
| |
+ +
So, this patch enables the NMI-window exiting to trigger the next vmexit
once there is no "virtual-NMI blocking" after vCPU enter into VMX non-root
mode. Then we can process the pending request on time.
Tracked-On: #3886
Acked-by: Eddie Dong <eddie.dong@intel.com>
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
The NMI for notification should not be inject to guest. So,
this patch drops NMI injection request when we use NMI
to notify vCPUs. Meanwhile, ACRN doesn't support vNMI well
and there is no well-designed way to check if the NMI is
for notification or for guest now. So, we take all the NMIs as
notificaton NMI for hard rtvm temporarily. It means that the
hard rtvm will never receive NMI with this patch applied.
TODO: vNMI support is not ready yet. we will add it later.
Tracked-On: #3886
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Reserved bits in a 8-bit PAT field has been checked in pat_mem_type_invalid.
Remove this redundant check "(PAT_FIELD_RSV_BITS & field) != 0UL" in
write_pat_msr.
Tracked-On: #1842
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>