About the MSI/MSI-X Capability, there're some fields of it would never been changed
once they had been initialized. So it's no need to reset them once the vdev instance
is still used. What need to reset are the fields which would been changed by guest
at runtime.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
replace spinlock_obtain/spinlock_release with spinlock_irqsave_obtain
and spinlock_irqrestore_release to avoid dead lock for vpic module.
this vpic lock may be accessed in ISR context like this path:
dispatch_interrupt->do_softirq->softirq_handlers
->ptirq_softirq->ptirq_handle_intx->vpic_set_irqline
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
-- remove unnecessary lock in pci_mmcfg_read_cfg and
pci_mmcfg_write_cfg since the mmio operation is atomic
if the offest is aligned with 1/2/4 bytes.
-- move pci_is_valid_access to pci.h
Tracked-On: #4958
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
hv: pci: refine pci_find_vdev with hash
1. Refined pci_find_vdev with BDF-hashing for better performance
Tracked-On: #4857
Signed-off-by: Wang Qian <qian1.wang@intel.com>
Reviewed-by: Li Fei <Fei1.Li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Some passthrough devices require multiple MSI vectors, but don't
support MSI-X. In meanwhile, Linux kernel doesn't support continuous
vector allocation.
On native platform, this issue can be mitigated by IOMMU via interrupt
remapping. However, on ACRN, there is no vIOMMU.
vMSI-X on MSI emulation is one solution to mitigate this problem on ACRN.
This patch adds MSI-X emulation on MSI capability.
For the device needs to do MSI-X emulation, HV will hide MSI capability
and present MSI-X capability to guest.
The guest driver may need to modify to reqeust MSI-X vector.
For example:
ret = pci_alloc_irq_vectors(pdev, 1, STMMAC_MSI_VEC_MAX,
- PCI_IRQ_MSI);
+ PCI_IRQ_MSI | PCI_IRQ_MSIX);
To enable MSI-X emulation, the device should:
- 1. The device should be in vmsix_on_msi_devs array.
- 2. Support MSI, but don't support MSI-X.
- 3. MSI capability should support per-vector mask.
- 4. The device should have an unused BAR.
- 5. The device driver should not rely on PBA for functionality.
Tracked-On: #4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For a ptirq_remapping_info entry, when build IRTE:
- If the caller provides a valid IRTE, use the IRET
- If the caller doesn't provide a valid IRTE, allocate a IRET when the
entry doesn't have a valid IRTE, in this case, the IRET will be freed
when free the entry.
Tracked-On:#4831
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now Host Bridge and PCI Bridge could only be added to SOS's acrn_vm_pci_dev_config.
So For UOS, we always emualte Host Bridge and PCI Bridge for it and assign PCI device
to it; for SOS, if it's the highest severity VM, we will assign Host Bridge and PCI
Bridge to it directly, otherwise, we will emulate them same as UOS.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
According PCI Code and ID Assignment Specification Revision 1.11, a PCI device
whose Base Class is 06h and Sub-Class is 00h is a Host bridge.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Guest may write a MSI-X capability register with only RW bits setting on. This works
well on native since the hardware will make sure RO register bits could not over-write.
However, the software needs more efforts to achieve this. This patch does this by
defining a RW permission mapping base on bits. When a guest tries to write a MSI-X
Capability register, only modify the RW bits on vCFG space.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Guest may write a MSI capability register with only RW bits setting on. This works
well on native since the hardware will make sure RO register bits could not over-write.
However, the software needs more efforts to achieve this. This patch does this by
defining a RW permission mapping base on bits. When a guest tries to write a MSI
Capability register, only modify the RW bits on vCFG space.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
There's no need to look up MSI ptirq entry by virtual SID any more since the MSI
ptirq entry would be removed before the device is assigned to a VM.
Now the logic of MSI interrupt remap could simplify as:
1. Add the MSI interrupt remap first;
2. If step is already done, just do the remap part.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Reviewed-by: Grandhi, Sainath <sainath.grandhi@intel.com>
In commit 0a7770cb, we remove vm pointer in vpci structrue. So there's no need
for such pre-condition since vpci is embedded in vm structure. The vm can't be
NULL Once the vpci is not NULL.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
The existing code do separately for each VM when we deinit vpci of a VM. This is
not necessary. This patch use the common handling for all VMs: we first deassign
it from the (current) user, then give it back to its parent user.
When we deassign the vdev from the (current) user, we would de-initialize the
vMSI/VMSI-X remapping, so does the vMSI/vMSI-X data structure.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Now we could know a device status by 'user' filed, like
---------------------------------------------------------------------------
| NULL | == vdev | != NULL && != vdev
vdev->user | device is de-init | used by itself VM | assigned to another VM
---------------------------------------------------------------------------
So we don't need to modify 'vpci' field accordingly.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
Add a new field 'parent_user' to record the parent user of the vdev. And refine
'new_owner' to 'user' to record who is the current user of the vdev. Like
-----------------------------------------------------------------------------------------------
vdev in | HV | pre-VM | SOS | post-VM
| | |vdev used by SOS|vdev used by post-VM|
-----------------------------------------------------------------------------------------------
parent_user| NULL(HV) | NULL(HV) | NULL(HV) | NULL(HV) | vdev in SOS
-----------------------------------------------------------------------------------------------
user | vdev in HV | vdev in pre-VM | vdev in SOS | vdev in post-VM | vdev in post-VM
-----------------------------------------------------------------------------------------------
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Acked-by: Eddie Dong<eddie.dong@Intel.com>
The virtual MSI information could be included in ptirq_remapping_info structrue,
there's no need to pass another input paramater for this puepose. So we could
remove the ptirq_msi_info input.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Most code in the if ... else is duplicated. We could put it out of the
conditional statement.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Conceptually, the devices unregistration sequence of the shutdown process should be
opposite to create.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
We could use container_of to get vm structure pointer from vpic. So vm
structure pointer is no need in vpic structure.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
We could use container_of to get vm structure pointer from vpci. So vm
structure pointer is no need in vpci structure.
Tracked-On: #4550
Signed-off-by: Li Fei1 <fei1.li@intel.com>
For SOS VM, when the target platform has multiple IO-APICs, there
should be equal number of virtual IO-APICs.
This patch adds support for emulating multiple vIOAPICs per VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
As ACRN prepares to support platforms with multiple IO-APICs,
GSI is a better way to represent physical and virtual INTx interrupt
source.
1) This patch replaces usage of "pin" with "gsi" whereever applicable
across the modules.
2) PIC pin to gsi is trickier and needs to consider the usage of
"Interrupt Source Override" structure in ACPI for the corresponding VM.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Changes the mmio handler data from that of the acrn_vm struct to
the acrn_vioapic.
Add nr_pins and base_addr to the acrn_vioapic data structure.
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Reverts 538ba08c: hv:Add vpin to ptdev entry mapping for vpic/vioapic
ACRN uses an array of size per VM to store ptirq entries against the vIOAPIC pin
and an array of size per VM to store ptirq entries against the vPIC pin.
This is done to speed up "ptirq entry" lookup at runtime for Level triggered
interrupts in API ptirq_intx_ack used on EOI.
This patch switches the lookup API for INTx interrupts to the API,
ptirq_lookup_entry_by_sid
This could add delay to processing EOI for Level triggered interrupts.
Trade-off here is space saved for array/s of size CONFIG_MAX_IOAPIC_LINES with 8 bytes
per data. On a server platform, ACRN needs to emulate multiple vIOAPICs for
SOS VM, same as the number of physical IO-APICs. Thereby ACRN would need around
10 such arrays per VM.
Removes the need of "pic_pin" except for the APIs facing the hypercalls
hcall_set_ptdev_intr_info, hcall_reset_ptdev_intr_info
Tracked-On: #4151
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
There're some PCI devices need special handler for vendor-specical feature or
capability CFG access. The Intel GPU is one of them. In order to keep the ACRN-HV
clean, we want to throw the qurik part of PCI CFG asccess to DM to handle.
To achieve this, we implement per-device policy base on whether it needs quirk handler
for a VM: each device could configure as "quirk pass through device" or not. For a
"quirk pass through device", we will handle the general part in HV and the quirk part
in DM. For a non "quirk pass through device", we will handle all the part in HV.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
There're some cases the SOS (higher severity guest) needs to access the
post-launched VM (lower severity guest) PCI CFG space:
1. The SR-IOV PF needs to reset the VF
2. Some pass through device still need DM to handle some quirk.
In the case a device is assigned to a UOS and is not in a zombie state, the SOS
is able to access, if and only if the SOS has higher severity than the UOS.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
To avoid information leakage, we need to ensure that the device is
inaccessble when it does not exist.
For SR-IOV disabled VF device, we have the following operations.
1. The configuration space accessing will get 0xFFFFFFFF as a
return value after set the device state to zombie.
2. The BAR MMIO EPT mapping are removed, the accesssing causes
EPT violation.
3. The device will be detached from IOMMU.
4. The IRQ pin and vector are released.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
change vmsi_read_cfg to read_vmsi_cfg, same applies to writing
change vmsix_read_cfg to read_vmsix_cfg, same applies to writing
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In order to add GVT-D support, we need pass through stolen memory and opregion memroy
to the post-launched VM. To implement this, we first reserve the GPA for stolen memory
and opregion memory through post-launched VM e820 table. Then we would build EPT mapping
between the GPA and the stolen memory and opregion memory real HPA. The last, we need to
return the GPA to post-launched VM if it wants to read the stolen memory and opregion
memory address and prevent post-launched VM to write the stolen memory and opregion memory
address register for now.
We do the GPA reserve and GPA to HPA EPT mapping in ACRN-DM and the stolen memory and
opregion memory CFG space register access emulation in ACRN-HV.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
VM needs to check if it owns this device before deiniting it.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Change enable_vf/disable_vf to create_vfs/disable_vfs
Change base member of pci_vbar to base_gpa
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The vf_bdf is not initialized when invoking pci_pdev_read_cfg function.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
We didn't support SR-IOV capability of PF in UOS for now, we should
hide the SR-IOV capability if we pass through the PF to a UOS.
For now, we don't support assignment of PF to a UOS.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
Emulate Device ID, Vendor ID and MSE(Memory Space Enable) bit in
configuration space for an assigned VF, initialize assgined VF Bars.
The Device ID comes from PF's SRIOV capability
The Vendor ID comes from PF's Vendor ID
The PCI MSE bit always be set when VM reads from an assigned VF.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
If a VF instance is disabled, we didn’t remove the vdev instance,
only set the vdev as a zombie vdev instance, indicating that it
cannot be accessed anymore.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Change name find_vdev to find_available_vdev and add comments
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The VF BARs are initialized by its PF SRIOV capability
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Refine coding style to wrap msix map/unmap operations, clean up repeated
assignments for msix mmio_hpa and mmio_size.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add _v prefix for some function name to indicate this function wants to operate
on virtual CFG space or virtual BAR register.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
Removed the pci_vdev_write_cfg_u8/u16/u32 APIs and only used
pci_vdev_write_cfg as the API for writing vdev's cfgdata
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add cfg_header_read_cfg and cfg_header_write_cfg to handle the 1st 64B
CFG Space header PCI configuration space.
Only Command and Status Registers are pass through;
Only Command and Status Registers and Base Address Registers are writable.
In order to implement this, we add two type bit mask for per 4B register:
pass through mask and read-only mask. When pass through bit mask is set, this
means this bit of this 4B register is pass through, otherwise, it is virtualized;
When read-only mask is set, this means this bit of this 4B register is read-only,
otherwise, it's writable. We should write it to physical CFG space or virtual
CFG space base on whether the pass through bit mask is set or not.
Tracked-On: #4371
Signed-off-by: Li Fei1 <fei1.li@intel.com>
1. Renames DEFINE_IOAPIC_SID with DEFINE_INTX_SID as the virtual source can
be IOAPIC or PIC
2. Rename the src member of source_id.intx_id to ctlr to indicate interrupt
controller
2. Changes the type of src member of source_id.intx_id from uint32_t to
enum with INTX_CTLR_IOAPIC and INTX_CTLR_PIC
Tracked-On: #4447
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
for vpci_bridge it is better just write the virtual configure space,
so move out the PCI bridge phyiscal cfg write to pci.c
also add some rules in config pci bridge.
Tracked-On: #3381
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
reduce the use of similar APIs (particularly the name confusion) for
CFG space read/write.
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Make the name of the functions more accurate
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
create new pdev and vdev structures for a SRIOV VF device initialization
Tracked-On: #4433
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>