The new way to boot from TDX firmware (e.g. td-shim) is using the
combination of '--platform tdx=on' with '--firmware tdshim'.
Fixes: #5309
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 067e2b1e33)
This release has been tracked in our new [roadmap project ](https://github.com/orgs/cloud-hypervisor/projects/6) as iteration v27.0.
**Community Engagement**
A new mailing list has been created to support broader community discussions.
Please consider [subscribing](https://lists.cloudhypervisor.org/g/dev/); an announcement of a regular meeting will be
announced via this list shortly.
**Prebuilt Packages**
Prebuilt packages are now available. Please see this [document](https://github.com/cloud-hypervisor/obs-packaging/blob/main/README.md)
on how to install. These packages also include packages for the different
firmware options available.
**Network Device MTU Exposed to Guest**
The MTU for the TAP device associated with a virtio-net device is now exposed
to the guest. If the user provides a MTU with --net mtu=.. then that MTU is
applied to created TAP interfaces. This functionality is also exposed for
vhost-user-net devices including those created with the reference backend.
**Boot Tracing**
Support for generating a trace report for the boot time has been added
including a script for generating an SVG from that trace.
**Simplified Build Feature Flags**
The set of feature flags, for e.g. experimental features, have been simplified:
* msvh and kvm features provide support for those specific hypervisors
(with kvm enabled by default),
* tdx provides support for Intel TDX; and although there is no MSHV support
now it is now possible to compile with the mshv feature,
* tracing adds support for boot tracing,
* guest_debug now covers both support for gdbing a guest (formerly gdb
feature) and dumping guest memory.
The following feature flags were removed as the functionality was enabled by
default: amx, fwdebug, cmos and common.
**Asynchronous Kernel Loading**
AArch64 has gained support for loading the guest kernel asynchronously like
x86-64.
**GDB Support for AArch64**
GDB stub support (accessed through --gdb under guest_debug feature) is now
available on AArch64 as well as as x86-64.
**Notable Bug Fixes**
* This version incorporates a version of virtio-queue that addresses an issue
where a rogue guest can potentially DoS the VMM,
* Improvements around PTY handling for virtio-console and serial devices,
* Improved error handling in virtio devices.
**Deprecations**
Deprecated features will be removed in a subsequent release and users should
plan to use alternatives.
* Booting legacy firmware (compiled without a PVH header) has been deprecated.
All the firmware options (Cloud Hypervisor OVMF and Rust Hypervisor Firmware)
support booting with PVH so support for loading firmware in a legacy mode is no
longer needed. This functionality will be removed in the next release.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v27.0
Note: To have the new API of loading firmware for booting (e.g. boot
from td-shim), a specific commit revision after the v27.0 release is
used as the Cloud Hypervisor version from the 'versions.yaml'.
Fixes: #5309
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit fe61070426)
- tools: release: fix bogus version check
- Last backport for 2.5.2
- stable-2.5: fix cargo vendor
- stable-2.5 backports
309756db95 release: Adapt kata-deploy for 2.5.2
a818771750 tools: release: fix bogus version check
52993b91b7 runtime: store the user name in hypervisor config
30a8166f4a runtime: make StopVM thread-safe
7033c97cd2 runtime: add more debug logs for non-root user operation
e8ec0c402f stable-2.5: fix cargo vendor
d92ada72de kernel: upgrade guest kernel support to 5.19.2
565fdf8263 kernel: fix for set_kmem_limit error
f174fac0d6 sandbox_test: Add test to verify memory hotplug behavior
928654b5cd sandbox: don't hotplug too much memory at once
1c0e6b4356 hypervisor: Add GetTotalMemoryMB to interface
8f40927df8 kernel: Add CONFIG_CGROUP_HUGETLB=y as part of the cgroup fragments
Signed-off-by: Greg Kurz <groug@kaod.org>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Greg Kurz <groug@kaod.org>
Shell expands `*"rc"*` to the top-level `src` directory. This results
in comparing a version with a directory name. This doesn't make sense
and causes the script to choose the wrong branch of the `if`.
The intent of the check is actually to detect `rc` in the version.
Fixes: #5283
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 421729f991)
Signed-off-by: Greg Kurz <groug@kaod.org>
The user name will be used to delete the user instead of relying on
uid lookup because uid can be reused.
Fixes: #5155
Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit f914319874,
fixed minor conflict because stable-2.5 doesn't have commit
ed0f1d0b32)
Signed-off-by: Greg Kurz <groug@kaod.org>
StopVM can be invoked by multiple threads and needs to be thread-safe
Fixes: #5155
Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit 5cafe21770)
Signed-off-by: Greg Kurz <groug@kaod.org>
Previously the logging was insufficient and made debugging difficult
Fixes: #5155
Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit c3015927a3)
Signed-off-by: Greg Kurz <groug@kaod.org>
Augment the mock hypervisor so that we can validate that ACPI memory hotplug
is carried out as expected.
We'll augment the number of memory slots in the hypervisor config each
time the memory of the hypervisor is changed. In this way we can ensure
that large memory hotplugs are broken up into appropriately sized
pieces in the unit test.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
If we're using ACPI hotplug for memory, there's a limitation on the
amount of memory which can be hotplugged at a single time.
During hotplug, we'll allocate memory for the memmap for each page,
resulting in a 64 byte per 4KiB page allocation. As an example, hotplugging 12GiB
of memory requires ~192 MiB of *free* memory, which is about the limit
we should expect for an idle 256 MiB guest (conservative heuristic of 75%
of provided memory).
From experimentation, at pod creation time we can reliably add 48 times
what is provided to the guest. (a factor of 48 results in using 75% of
provided memory for hotplug). Using prior example of a guest with 256Mi
RAM, 256 Mi * 48 = 12 Gi; 12GiB is upper end of what we should expect
can be hotplugged successfully into the guest.
Note: It isn't expected that we'll need to hotplug large amounts of RAM
after workloads have already started -- container additions are expected
to occur first in pod lifecycle. Based on this, we expect that provided
memory should be freely available for hotplug.
If virtio-mem is being utilized, there isn't such a limitation - we can
hotplug the max allowed memory at a single time.
Fixes: #4847
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
It'll be useful to get the total memory provided to the guest
(hotplugged + coldplugged). We'll use this information when calcualting
how much memory we can add at a time when utilizing ACPI hotplug.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Kata guest os cgroup is not work properly kata guest kernel config option
CONFIG_CGROUP_HUGETLB is not set, leading to:
root@clr-b08d402cc29d44719bb582392b7b3466 ls /sys/fs/cgroup/hugetlb/
ls: cannot access '/sys/fs/cgroup/hugetlb/': No such file or directory
Fixes: #4953
Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Kata guest os cgroup is not work properly kata guest kernel config option
CONFIG_CGROUP_HUGETLB is not set, leading to:
root@clr-b08d402cc29d44719bb582392b7b3466 ls /sys/fs/cgroup/hugetlb/
ls: cannot access '/sys/fs/cgroup/hugetlb/': No such file or directory
Fixes: #4953
Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
(cherry picked from commit 731d39df45)
The root span should exist the duration of the trace. Defer ending span
until the end of the trace instead of end of function. Add the span to
the service struct to do so.
Fixes#4902
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
(cherry picked from commit fcc1e0c617)
In some cases do_create_container may return an error, mostly due to
`container.start(process)` call. This commit will do some rollback
works if this function failed.
Fixes: #4749
Signed-off-by: Bin Liu <bin@hyper.sh>
(cherry picked from commit 09672eb2da)
CI is failing with the issue:
"package `time v0.3.14` cannot be built because it requires rustc 1.59.0
or newer, while the currently active rustc version is 1.58.1"
Fixes: #1000
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Let's ensure we have the latest longterm maintained kernel as part of
our release.
This brings in `CONFIG_SPECULATION_MITIGATIONS=y`.
Fixes: #5051
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The new 'payload' interface now contains the 'kernel' and 'initramfs'
config.
Fixes: #4952
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 3a597c2742)
Highlights from the Cloud Hypervisor release v26.0:
**SMBIOS Improvements via `--platform`**
`--platform` and the appropriate API structure has gained support for supplying
OEM strings (primarily used to communicate metadata to systemd in the guest)
**Unified Binary MSHV and KVM Support**
Support for both the MSHV and KVM hypervisors can be compiled into the same
binary with the detection of the hypervisor to use made at runtime.
**Notable Bug Fixes**
* The prefetchable flag is preserved on BARs for VFIO devices
* PCI Express capabilties for functionality we do not support are now filtered
out
* GDB breakpoint support is more reliable
* SIGINT and SIGTERM signals are now handled before the VM has booted
* Multiple API event loop handling bug fixes
* Incorrect assumptions in virtio queue numbering were addressed, allowing
thevirtio-fs driver in OVMF to be used
* VHDX file format header fix
* The same VFIO device cannot be added twice
* SMBIOS tables were being incorrectly generated
**Deprecations**
Deprecated features will be removed in a subsequent release and users should
plan to use alternatives.
The top-level `kernel` and `initramfs` members on the `VmConfig` have been
moved inside a `PayloadConfig` as the `payload` member. The OpenAPI document
has been updated to reflect the change and the old API members continue to
function and are mapped to the new version. The expectation is that these old
versions will be removed in the v28.0 release.
**Removals**
The following functionality has been removed:
The unused poll_queue parameter has been removed from --disk and
equivalent. This was residual from the removal of the vhost-user-block
spawning feature.
Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v26.0Fixes: #4952
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 50ea071834)
These patches were backported from main branch:
05b2096c0 release: Adapt kata-deploy for 2.5.0
1b930156c build: Fix clh source build as normal user
01c889fb6 runtime: Fix DisableSelinux config
59bd5c2e0 container: kill all of the processes in this container
22c005f55 nydus: upgrade nydus/nydus-snapshotter version
8220e5478 runtime: add unlock before return in sendReq
4f0ca40e0 versions: Update Firecracker version to v1.1.0
da24fd88e clh: Don't crash if no network device is set by the upper layer
ed25d2cf5 versions: Update Cloud Hypervisor to v25.0
dfc1413e4 action: extend commit message line limit to 150 bytes
Depends-on: github.com/kata-containers/tests#5032
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
While running make as non-privileged user, the make errors out with
the following message:
"INFO: Build cloud-hypervisor enabling the following features: tdx
Got permission denied while trying to connect to the Docker daemon
socket at unix:///var/run/docker.sock: Post
"http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/create?fromImage=cloudhypervisor%2Fdev&tag=20220524-0":
dial unix /var/run/docker.sock: connect: permission denied"
Even though the user may be part of docker group, the clh build from
source does a docker in docker build. It is necessary for the user of
the nested container to be part of docker build for the build to
succeed.
Fixes#4594
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Enable Kata runtime to handle `disable_selinux` flag properly in order
to be able to change the status by the runtime configuration whether the
runtime applies the SELinux label to VMM process.
Fixes: #4599
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
When a container terminated, we should make sure there's no processes
left after destroying the container.
Before this commit, kata-agent depended on the kernel's pidns
to destroy all of the process in a container after the 1 process
exit in a container. This is true for those container using a
separated pidns, but for the case of shared pidns within the
sandbox, the container exit wouldn't trigger the pidns terminated,
and there would be some daemon process left in this container, this
wasn't expected.
Fixes: #4663
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>