Issue description:
-----------------
Machine Check Error on Page Size Change
Instruction fetch may cause machine check error if page size
and memory type was changed without invalidation on some
processors[1][2]. Malicious guest kernel could trigger this issue.
This issue applies to both primary page table and extended page
tables (EPT), however the primary page table is controlled by
hypervisor only. This patch mitigates the situation in EPT.
Mitigation details:
------------------
Implement non-execute huge pages in EPT.
This patch series clears the execute permission (bit 2) in the
EPT entries for large pages. When EPT violation is triggered by
guest instruction fetch, hypervisor converts the large page to
smaller 4 KB pages and restore the execute permission, and then
re-execute the guest instruction.
The current patch turns on the mitigation by default.
The follow-up patches will conditionally turn on/off the feature
per processor model.
[1] Refer to erratum KBL002 in "7th Generation Intel Processor
Family and 8th Generation Intel Processor Family for U Quad Core
Platforms Specification Update"
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/7th-gen-core-family-spec-update.pdf
[2] Refer to erratum SKL002 in "6th Generation Intel Processor
Family Specification Update"
https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html
Tracked-On: #4120
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
Cacheline is flushed on EPT entry change, no need to invalidate cache globally
when VM created per VM.
Tracked-On: #4120
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
EPT tables are shared by MMU and IOMMU.
Some IOMMUs don't support page-walk coherency, the cpu cache of EPT entires
should be flushed to memory after modifications, so that the modifications
are visible to the IOMMUs.
This patch adds a new interface to flush the cache of modified EPT entires.
There are different implementations for EPT/PPT entries:
- For PPT, there is no need to flush the cpu cache after update.
- For EPT, need to call iommu_flush_cache to make the modifications visible
to IOMMUs.
Tracked-On: #4120
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
VT-d shares the EPT tables as the second level translation tables.
For the IOMMUs that don't support page-walk coherecy, cpu cache should
be flushed for the IOMMU EPT entries that are modified.
For the current implementation, EPT tables for translating from GPA to HPA
for EPT/IOMMU are not modified after VM is created, so cpu cache invlidation is
done once per VM before starting execution of VM.
However, this may be changed, runtime EPT modification is possible.
When cpu cache of EPT entries is invalidated when modification, there is no need
invalidate cpu cache globally per VM.
This patch exports iommu_flush_cache for EPT entry cache invlidation operations.
- IOMMUs share the same copy of EPT table, cpu cache should be flushed if any of
the IOMMU active doesn't support page-walk coherency.
- In the context of ACRN, GPA to HPA mapping relationship is not changed after
VM created, skip flushing iotlb to avoid potential performance penalty.
Tracked-On: #4120
Signed-off-by: Binbin Wu <binbin.wu@intel.com>
Reviewed-by: Anthony Xu <anthony.xu@intel.com>
AP trampoline code should be accessile to hypervisor only,
Unmap this memory region from service VM's EPT mapping
for security reason..
Tracked-On: #4091
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
1. Print warning message instead of panic when
the caller try to modify the attribute for
memory region or delete memory region that
are not present.
2. To avoid above warning message for memory region
below 1M,its attribute may be updated by Service
VM when updating MTTR setting.
Tracked-On: #4091
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Reviewed-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In lib/string.c, strncmp doesn't consider condition "n_arg=0",
just add a process to "n_arg=0".
Tracked-On: #4093
Tracked-On: projectacrn/acrn-hypervisor#3466
Signed-off-by: YanX Fu <yanx.fu@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
this patch is to fix error debug message
for invalid 'param' case, there is no string
variable for '%s' output, which will potenially
trigger hypervisor crash as it may access random
memroy address and trigger SMAP violation.
Tracked-On: #4092
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
In release environment, binary files must be stripped in
order to remove debugging code sections and symbol information
that aid attackers in the process of disassembly and reverse
engineering.
Use '-s' linking option to remove symbol table and relocation
information from release binaries.
Tracked-On: #3427
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
'pcpu_id' should be less than CONFIG_MAX_PCPU_NUM,
else 'per_cpu_data' will overflow. This commit fixes
this potential overflow issue.
Tracked-On: #3397
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Possible buffer overflow will happen in vlapic_set_tmr()
and vlapic_update_ppr(),this path is to fix them.
Tracked-On: #1252
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The 'boot_params' and 'entry' might be dereferenced after they were
positively checked for NULL. Refine checking logic to fix the issue.
Tracked-On: #2979
Signed-off-by: Qi Yadong <yadong.qi@intel.com>
Acked-by: Zhu Bing <bing.zhu@intel.com>
In the presence of SOS, ACRN uses fallback_iommu_domain which is the same
used by SOS, to assign domain to devices during ACRN init. Also it uses
fallback_iommu_domain when DM requests ACRN to remove device from UOS domain.
This patch changes the design of assign/remove_iommu_device to avoid the
concept of fallback_iommu_domain and its setup. This way ACRN can commonly
treat pre-launched VMs bringup w.r.t. IOMMU domain creation.
Tracked-On: #2965
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
In the current design, logic partition scenario is supported
on KBL NUC i7 since there is no related configuration and
no the cooresponding boot loader supporting.
The boot loader supporting is done in the previous patch.
Add some configurations such physical PCI devices information,
virtual e820 table etc for KBL NUC i7 to enable logical
partition scenario.
In the logical partition of KBL NUC i7, there are two
pre-launched VM, this pre-launched VM doesn't support
local APIC passthrough now. The hypervisor is booted through
GRUB.
TODO: In future, Local APIC passthrough and some real time
fetures are needed for the logic partition scenario of KBL
NUC i7.
V5-->V6:
Update "Tracked-On"
Tracked-On: #2944
Signed-off-by: Xiangyang Wu <xiangyang.wu@linux.intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
In the current design, hypervisor only detects two kinds of
multiboot compiliant firwares (UEFI loader and non-UEFI loader),
It can't detect other multiboot compliant firware (such GRUB
loader) and can't detect UEFI loader explicitly since loader
name is not supported by UEFI loader (efi stub). In the
logical partition scenario on KBL NUC i7, one multiboot
compliant firware is used to boot hypervisor and load guest
OS image, and firware runtime service shall be disable to
avoid interference. So GRUB can be selected as a candidate
to enable logical partition scenario on KBL NUC i7.
Update firware detection and operations selecting logic to detect
more multiboot compiliant firware (such as GRUB and UEFI loader)
explicitly, different operations is selected according to the
boot load name through a static mapping table between boot load name
and firmware operations. GRUB loader can use the SBL operations
to handle multiboot information parsing and vm booting since
these multiboot compiliant firmware (SBL/ABL/GRUB) which boots
hypervisor (binary format), provides multiboot information and
the start address of guest OS image(binary format) in the same way,
and only runs on boot time.
From MISRA C view, viarble array is not allowed, so define the
static array for above mapping table;
From security view, it is better use strncmp insteads of strcmp.
TODO: In future, need to redesign boot moudle to suport different
boot loader, different VM boot, and different guest OS.
V2-->V3:
Update firmware detection logic to handle GRUB loader
which is need to provided multiboot information to the
hypervisor (such as the address of guest OS image);
V3-->V4:
Update firmware detection and operations selecting logic
to enable GRUB loader support in hypervisor;
V4-->V5:
Separte UEFI loader name supporting in a separate patch,
and update commit comment to make patch clearer.
V5-->V6:
Update "Tracked-On"
Tracked-On: #2944
Signed-off-by: Xiangyang Wu <xiangyang.wu@linux.intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
add new hypercall get platform information,
such as physical CPU number.
Tracked-On: #2538
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
ACRN builds mptable for pre-launched VMs. It uses CONFIG_PARTITION_MODE
to compile mptable source code and related support. This patch removes
the macro and checks if the type of VM is pre-launched to build mptable.
Tracked-On: #2941
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Enhance "vm_list" command in shell to Show VM UUID;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Currently VM id of NORMAL_VM is allocated dymatically, we need to make
VM id statically for FuSa compliance.
This patch will pre-configure UUID for all VMs, then NORMAL_VM could
get its VM id/configuration from vm_configs array by indexing the UUID.
If UUID collisions is found in vm configs array, HV will refuse to
load the VM;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Return true if vm configs is sanitized successfully, otherwise return false;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The code mixed the usage on term of UUID and GUID, now use UUID to make
code more consistent, also will use lowercase (i.e. uuid) in variable name
definition.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
1) In x2apic mode, when read ICR, we want to read a 64-bits value.
2) In x2apic mode, write self-IPI will trap out through MSR write when VID isn't enabled.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
We could call vlapic API directly, remove vlapic_rdmsr/wrmsr to make things easier.
Tracked-On: #1842
Signed-off-by: Li, Fei1 <fei1.li@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
renamed: include/dm/pci.h -> include/hw/pci.h
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Now the io_emul.c is relates with arch,io_req.c is common,
move some APIs from io_emul.c to io_req.c as common like these APIs:
register_pio/mmio_emulation_handler
dm_emulate_pio/mmio_complete
pio_default_read/write
mmio_default_access_handler
hv_emulate_pio/mmio etc
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Move ‘emul_pio[]/default_io_read/default_io_write’
from struct vm_arch to struct acrn_vm
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
-- this api is related with arch_x86, then move to x86 folder
-- rename 'set_vhm_vector' to 'set_vhm_notification_vector'
-- rename 'acrn_vhm_vector' to 'acrn_vhm_notification_vector'
-- add an API 'get_vhm_notification_vector'
Tracked-On: #1842
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Jason Chen CJ <jason.cj.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For Pre-launched VMs, ACRN uses mptable for reporting APIC IDs to guest OS.
In current code, ACRN uses physical LAPIC IDs for vLAPIC IDs.
This patch is to let ACRN use vCPU id for vLAPIC IDs and also report the same
when building mptable. ACRN should still use physical LAPIC IDs for SOS
because host ACPI tables are passthru to SOS.
Tracked-On: #2934
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@Intel.com>
ACRN uses CONFIG_PARTITION_MODE macro to compile out CPU_IRQ_ENABLE/DISABLE
APIs. With vector remapping enabled for pre-launched VMs, this is of no use.
And for VMs with LAPIC pass-thru, interrupts stay disabled in vmexit loop
with the help of is_lapic_pt() API.
Tracked-On: #2903
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Now the MAX supported VM number is defined explicitly for each scenario,
so move this config from Kconfig to VM configuration.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Previously we use unified vm_config.c for all scenarios and use MACROs
for each configuration items, then the initialization of vm_configs[]
becomes more complicated when definition of MACROs increase, so change
the coding style that all configurable items could be explicitly shown in
vm_configuration.c to make code more readable.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Add, update '@return' and '@retval' statements to the API descriptions related
to vCPU operations.
Tracked-On: #1844
Signed-off-by: Geoffroy Van Cutsem <geoffroy.vancutsem@intel.com>
As vector re-mapping is enabled for pre-launched/partition mode VMs,
there is no more need for separate interrupt routine i.e.
partition_mode_dispatch_interrupt.
Tracked-On: #2879
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
For pre-launched VMs MSI/MSI-x configuration writes are not intercepted by ACRN.
It is pass-thru and interrupts land in ACRN and the guest vector is injected into
the VM's vLAPIC. With this patch, ACRN intercepts MSI/MSI-x config writes and take
the code path to remap interrupt vector/APIC ID as it does for SOS/UOS.
Tracked-On: #2879
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Reviewed-by: Eddie Dong <eddie.dong@intel.com>
To keep consistency between HV and DM about PM1A_CNT_ADDR,
it is better to replace the PM1A_CNT related MACROs used in DM
with VIRTUAL_PM1A_CNT related MACROs in acrn_common.h.
Tracked-On: #2865
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <Eddie.dong@intel.com>
This patch mainly does the following:
- Replace prefix RT_VM_ with VIRTUAL_.
- Remove the check of "addr != RT_VM_PM1A_CNT_ADDR" as the handler is specific for this addr.
- Add comments about the meaning of return value.
Tracked-On: #2865
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Intel SDM Vol3 23.8 says:
The INIT signal is blocked whenever a logical processor is in VMX root operation.
It is not blocked in VMX nonroot operation. Instead, INITs cause VM exits
So, there is no side-effect to send INIT signal regardless of pcpu active status.
Tracked-On: #2865
Signed-off-by: Kaige Fu <kaige.fu@intel.com>
Acked-by: Anthony Xu <anthony.xu@intel.com>
'vector_cnt' is vector count of MSI/MSIX, and it come from 'table_count'
in the the pci_msix struct. 'CONFIG_MAX_MSIX_TABLE_NUM' is the maximum
number of MSI-X tables per device, so 'vector_cnt' must be less then
or equal to 'CONFIG_MAX_MSIX_TABLE_NUM'.
Tracked-On: #2881
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
All if . . else if constructs shall be
terminated with an else statement.
Tracked-On: #861
Signed-off-by: Huihuang Shi <huihuang.shi@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com
The pt_dev.c in board folder is replaced by the one in scenarios folder,
so remove them.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
After using get_vm_from_vmid(), vm pointer is always not NULL. But there are still many NULL pointer checks.
This commit replaced the NULL vm pointer check with a validation check which checks the vm status.
In addition, NULL check for pointer returned by get_sos_vm() and get_vm_config() is removed.
Tracked-On: #2520
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This commit renmaes guest_no in shell_to_sos_console() to vm_id, which is commonly used in ACRN.
Tracked-On: #2520
Signed-off-by: Yan, Like <like.yan@intel.com>
For hv shell commands with vm_id option, need to check it is in the valid range [0, CONFIG_MAX_VM].
This commit checks the input vm_id, if it's not in the valid range, an error message will be given,
and vm_id will be assgined to default 0U.
Tracked-On: #2520
Signed-off-by: Yan, Like <like.yan@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The CLOS is initialized to 0 for each scenarios. User could modify this
configuration in its vm_configurations.h;
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In this scenario, hypervisor will run two logical partition VMs.
Please note that the Kconfig of Hypervisor mode will be removed
gradually. In current Kconfig setting, the CONFIG_PARTITION_MODE
is still kept for now for back-compatibility.
Tracked-On: #2291
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>