Currently the sched event handling may encounter data race problem, and
as a result some vcpus might be stalled forever.
One example can be wbinvd handling where more than 1 vcpus are doing
wbinvd concurrently. The following is a possible execution of 3 vcpus:
-------
0 1 2
req [Note: 0]
req bit0 set [Note: 1]
IPI -> 0
req bit2 set
IPI -> 2
VMExit
req bit2 cleared
wait
vcpu2 descheduled
VMExit
req bit0 cleared
wait
vcpu0 descheduled
signal 0
event0->set=true
wake 0
signal 2
event2->set=true [Note: 3]
wake 2
vcpu2 scheduled
event2->set=false
resume
req
req bit0 set
IPI -> 0
req bit1 set
IPI -> 1
(doesn't matter)
vcpu0 scheduled [Note: 4]
signal 0
event0->set=true
(no wake) [Note: 2]
event0->set=false (the rest doesn't matter)
resume
Any VMExit
req bit0 cleared
wait
idle running
(blocked forever)
Notes:
0: req: vcpu_make_request(vcpu, ACRN_REQUEST_WAIT_WBINVD).
1: req bit: Bit in pending_req_bits. Bit0 stands for bit for vcpu0.
2: In function signal_event, At this time the event->waiting_thread
is not NULL, so wake_thread will not execute
3: eventX: struct sched_event of vcpuX.
4: In function wait_event, the lock does not strictly cover the execution between
schedule() and event->set=false, so other threads may kick in.
-----
As shown in above example, before the last random VMExit, vcpu0 ended up
with request bit set but event->set==false, so blocked forever.
This patch proposes to change event->set from a boolean variable to an
integer. The semantic is very similar to a semaphore. The wait_event
will add 1 to this value, and block when this value is > 0, whereas signal_event
will decrease this value by 1.
It may happen that this value was decreased to a negative number but that
is OK. As long as the wait_event and signal_event are paired and
program order is observed (that is, wait_event always happens-before signal_event
on a single vcpu), this value will eventually be 0.
Tracked-On: #6405
Signed-off-by: Yifan Liu <yifan1.liu@intel.com>
In the default config file for hybrid scenario, zephyr image was
configured as KERNEL_RAWIMAGE. Now we change them to KERNEL_ELF
for all the platforms. And also kernel mods are changed from
Zephyr_RawImage to Zephyr_ElfImage
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
This is a simply implement for the 32bit and 64bit elf loader.
The loading function first reads the image header, and finds the program
entries that are marked as PT_LOAD, then loads segments from elf file to
guest ram. After that, it finds the bss section in the elf section entries, and
clear the ram area it points to.
Limitations:
1. The e_type of the elf image must be ET_EXEC(executable). Relocatable or
dynamic code is not supported.
2. The loader only copies program segments that has a p_type of
PT_LOAD(loadable segment). Other segments are ignored.
3. The loader doesn’t support Sections that are relocatable
(sh_type is SHT_REL or SHT_RELA)
4. The 64bit elf’s entry address must below 4G.
5. The elf is assumed to be able to put segments to valid guest memory.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
This patch adds a function elf_loader() to load elf image.
It checks the elf header, get its 32/64 bit type, then calls
the corresponding loading routines, which are empty, and
will be realized later.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Source: https://github.com/freebsd/freebsd-src/blob/main/sys/sys/elf_common.h
Trimed to meet the minimal requirements for the Zephyr elf file to be loaded
Also added elf file header data struct and program/section entry data structs.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
In order to make better sense, vm_elf_loader, vm_bzimage_loader and
vm_rawimage_loader are changed to elf_loaer, bzimage_loaer and
rawimage_loader.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Remove the acpi loading function from elf_loader, rawimage_loaer and
bzimage_loader, and call it together in vm_sw_loader.
Now the vm_sw_loader's job is not just loading sw, so we rename it to
prepare_os_image.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
For the guest OS loaders, prapare_loading_xxx are not accurate for
what those functions actually do. Now they are changed to load_xxx:
load_rawimage, load_bzimage.
And the 'bsp' expression is confusing in the comments for
init_vcpu_protect_mode_regs, changed to a better way.
Tracked-On: #6323
Signed-off-by: Zhou, Wu <wu.zhou@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
vboot_info.h declares vm loader function also, so rename the file name to
vboot.h;
Tracked-On: #6323
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
The patch splits the vm_load.c to three parts, the loader function of bzImage
kernel is moved to bzimage_loader.c, the loader function of raw image kernel
is moved to rawimage_loader.c, the stub is still stayed in vm_load.c to load
the corresponding kernel loader function. Each loader function could be
isolated by CONFIG_GUEST_KERNEL_XXX macro which generated by config tool.
Tracked-On: #6323
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Change if condition to switch in vm_sw_loader() so that the sw loader
could be compiled conditionally.
Tracked-On: #6323
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
Rename KERNEL_ZEPHYR to KERNEL_RAWIMAGE. Added new type "KERNEL_ELF".
Add CONFIG_GUEST_KERNEL_RAWIMAGE, CONFIG_GUEST_KERNEL_ELF and/or
CONFIG_GUEST_KERNEL_BZIMAGE to config.h if it's configured.
Tracked-On: #6323
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Reviewed-by: Victor Sun <victor.sun@intel.com>
Previously we only support loading raw format of zephyr image as prelaunched
Zephyr VM, this would cause guest F segment overridden issue because the zephyr
raw image covers memory space from 0x1000 to 0x100000 upper. To fix this issue,
we should support ELF format image loading so that parse and load the multiple
segments from ELF image directly.
Tracked-On: #6323
Signed-off-by: Victor Sun <victor.sun@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
On some boards it is seen that the log area of the physical TPM2 is
programmed to be 0. If TPM2 is passed through to a pre-launched VM in such
cases, a piece of memory starting from GPA 0 will be unmapped from the
Service VM, leading to Service VM crash due to early BIOS corruption
checks.
This patch temporarily disables TPM2 passthrough on such platforms. A
thorough fix should be proposed later to gracefully handle such cases.
Tracked-On: #6288
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Using xml.etree.ElementTree to parse the untrusted data is known to
raise security issue. Replaced it using defusedxml.
Tracked-On: #6342
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
The subprocess module is needed for calling package from python script.
Add #nosec for subprocess module importing.
Tracked-On: #6342
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
ACRN does not support the variable range vMTRR. The default
memory type of vMTRR is UC. With this vMTRR emulation guest VM
such as Linux refuses to map the MMIO address space as WB. In
order to get better performance SHM BAR of ivshmem is mapped
with PAT ignored and memory type of SHM BAR is fixed to WB.
Tracked-On: #6389
Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
update the board xml and related scenario xml files because the
PR 6352 update the board inspector code to get more board info.
Tracked-On: #6298
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
Extract the log area address from TPM2 acpi table and add node
log_area_start_address to board.xml. This emelment is used by host_pa of
mmiodevs.
Tracked-On: #6320
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
"port->cb" in 'virtio_console_notify_tx()'
function maybe NULL when malicious inputs
are injected from virtio frondend in guest.
Tracked-On: #6388
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
acrn-dm --windows parameter is used for specify
the Windows ORACLE virtio device driver,which is
used when enable secure boot in Windows, otherwise
the default REDHAT virtio device driver will be
used which is not certified by Microsoft
Signed-off-by: Liu, Hang1 <hang1.liu@intel.com>
Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
There could be multiple ways in the scenario configuration to specify that
no INTx allocation is explicitly allocation to a prelaunched VM:
* Do not have a `pt_intx` node at all.
* Have a `pt_intx` node with no text.
* Have a `pt_intx` node with a text that has nothing but whitespaces,
tabs and newlines.
The current implementation only supports the first way, and will cause
build-time failures when a scenario configuration uses the latter two. The
following changes are introduced by this patch to fix such errors.
* The INTx static allocator queries the text() of `pt_intx` nodes
directly to gracefully handle `pt_intx` nodes with no text.
* The XSLT of pt_intx.c clears all kinds of white spaces from the text of
`pt_intx` nodes before calculating the set of allocated INTx mappings.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
This patch refines the configuration data description so that it is
consistent with the current implementation of the configuration tools.
Tracked-On: #6377
Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Add SECURITY_VM_FIXUP config for Security VM whether it needs to do fixup
for TPM2 and SMBIOS
Tracked-On: #6320
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Create virtual acpi table of tpm2 based on the raw data if the TPM2
device is presented and the passthrough tpm2 is enabled.
Refine the arguments of bin_gen.py. The --board and --scenario take the
path to the XMLs as the argument. The allocation.xml is needed for
bin_gen.py to generate tpm2 acpi table.
Refine the condition of tpm2_acpi_gen. The tpm2 device "MSFT0101" can be
present in device id or compatible_id(CID). Check both attributes and
child node of tpm2 device.
Tracked-On: #6320
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
If passthrough TPM2 is enabled and the log area is present, allocates
the log_area_start_address with the size log_area_minimum_length(256K).
Tracked-On: #6320
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Create python script tpm2 which parse the tpm2 acpi table datas. Add
this parsed data to the <device id="MSFT0101" description="TPM 2.0 Device"> of board.xml.
Tracked-On: #6320
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Relocate ACPI address to 0x7fe00000 and ACPI NVS to 0x7ff00000 correspondingly.
In this case, we could include TPM event log region [0x7ffb0000, 0x80000000)
into ACPI NVS.
Tracked-On: #6320
Signed-off-by: Fei Li <fei1.li@intel.com>
ACRN used to prepare the vTPM2 ACPI Table for pre-launched VM at the build stage
using config tools. This is OK if the TPM2 ACPI Table never changes. However,
TPM2 ACPI Table may be changed in some conditions: change BIOS configuration or
update BIOS.
This patch do TPM2 fixup to update the vTPM2 ACPI Table and TPM2 MMIO resource
configuration according to the physical TPM2 ACPI Table.
Tracked-On: #6366
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
1. add a name field to indicate what the MMIO Device is.
2. add two more MMIO resource to the acrn_mmiodev data structure.
Tracked-On: #6366
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Update launch xml to use image to launch hard rt vm since
we changed the platforms from two disk to only one NVME disk.
Tracked-On: #6315
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
Update lpc slot to origin value 1 from 31 because GOP driver has assumption
to config space layout of the device on 00:1f.0.
Tracked-On: #6340
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
ACRN could run without XSAVE Capability. So remove XSAVE dependence to support
more (hardware or virtual) platforms.
Tracked-On: #6287
Signed-off-by: Fei Li <fei1.li@intel.com>
Check whether condition is met before check whether time is out after iommu_read32.
This is because iommu_read32 would cause time out on some virtual platform in
spite of the current DMAR status meets the pre_condition.
Tracked-On: #6371
Signed-off-by: Fei Li <fei1.li@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
When a platform reboots or shuts down, the contents of RAM are not immediately
lost but begins to decay. During this period, there is a short timeframe during
which an attacker can turn the platform back on to boot into a program that
dumps the contents of memory (e.g., cold boot attacks). Encryption keys and
other secrets can be easily compromised through this method.
We already erasing the guest memory data when the guest is shut down normally.
However, if the guest is shut down abnormally, the contents of RAM may still
there. This patch mitigate this kind reset attack for a DM launched VM by
erasing the guest memory data by the guest has been created.
Tracked-On: #6061
Signed-off-by: Li Fei1 <fei1.li@intel.com>
If the MAX_MSIX_TABLE_NUM is specified in scenario.xml. Return the
largest number from count of MSI, table_size of MSIX or
MAX_MSIX_TABLE_NUM of scenario.xml.
Tracked-On: #6235
Signed-off-by: Yang,Yu-chu <yu-chu.yang@intel.com>
Replace Kconfig related scheduler configuration with scenario XML
file configuration in tutorials because the Kconfig related files
have been removed in the other PR 6358.
Signed-off-by: Kunhui-Li <kunhuix.li@intel.com>
The ACPI specification allows both assigning to buffers and indexing to a
certain byte of a buffer using the Index operator. This patch adds the
implementation of these two operations in the interpreter.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
The PackageElementList builder takes variadic arguments, each of which is
an element of the package to be created, not a single argument being the
list of the elements. This patch fix the call to PackageElementList in
build_value() where the wrong type of argument was passed.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
It is typical in AML resource descriptors to have 0-length region
descriptors which are typically templates of resources that are not
assigned on the current platform. For such regions, the `base + length - 1`
formula does not calculate the max of the region properly.
This patch updates the resource descriptor parsers to use max = min when
the length of the region is 0.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
When parsing an AML object representing a host bridge, the current board
inspector may encounter the following issues:
1. The host DSDT may contain multiple host bridge instances, with some of
them not being present. In this case the _BBN of these instances may
evaluate to the same value that coincide with the bus assigned to an
existing host bridge, leading to multiple PCI bus nodes with the same
bus number and thus confusion in later information extraction phases.
2. Methods of a host bridge may refer to the PCI configuration space of
itself (which is typically Device 0, Function 0 under that
bus). However, such objects may not have an _ADR object as the bus
number is encoded by the _BBN object instead.
This patch fixes the issues above.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
If HV enable trigger #GP for uc-lock, and is about to emulate guest uc-lock
instructions, should trap guest #GP. Guest uc-lock instrucction trigger #GP,
cause vmexit for #GP, HV handle this vmexit and emulate uc-lock
instruction.
Tracked-On: #6299
Signed-off-by: Tao Yuhong <yuhong.tao@intel.com>
This patch enables the interpretation of the following AML objects.
* OnesOp. A OnesOp object always evaluates to a 64-bit integer with all bits
set to 1. It is assumed that the host DSDT is always revision 2 or above,
which is typically the case on modern platforms.
* DefMatch. A DefMatch object evaluates to the index of the first
element (starting from a given index) in a package that matches the given
two predicates. If a match is not found, the constant Ones is returned.
* DefSizeOf. A DefSizeOf object evaluates to the byte length of a buffer, the
length of a string (without the terminating NUL character) or the number of
elements in a package.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
In commit e5ba06cbe8 ("board_inspector: a workaround to an incorrect
interpretation") a workaround is introduced to check the data type of
predicate operands. That commit assumes that both operands must be exactly
integers, which is not usually the case as operation fields or strings can
also be used in predicates.
This patch applies the following conversions on both operands when
evaluating a predicate:
1. Try converting both operands to integers
2. If either conversion in step 1 fails, try regarding both operands as
strings.
3. If either operand is not a string, return the default
result (i.e. False).
Fixes: e5ba06cbe8 ("board_inspector: a workaround to an incorrect interpretation")
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
The method `enter_scope` of class `Context` is a reduced implementation of
`change_scope` which assumes that the given name is simply a NameSeg. This
method is currently only used when a new named scope is opened by a
DefDevice object for historical reasons.
As the other named-scope-opening objects all use `change_scope` which can
handle arbitrary NameString, this patch unifies the code by removing
`enter_scope` and replacing the only occurrence with `change_scope`. This
also resolves the parsing of AML templates in board XMLs where device names
can be more than a simple NameSeg.
Tracked-On: #6287
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
User could use make targz-pkg command to generate tar package in
build directory,which could help user simplify the process
of installing acrn hypervisor in target board. user need to copy the
tarball package to target board,and extract it to "/" directory.
Tracked-On: #6355
Signed-off-by: liu hang1 <hang1.liu@intel.com>
Reviewed-by: VanCutsem, Geoffroy <geoffroy.vancutsem@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>