This PR updates the firecracker version to 1.3.3 which includes the following
changes
Fixed passing through cache information from host in CPUID leaf 0x80000006.
A race condition that has been identified between the API thread and the VMM
thread due to a misconfiguration of the api_event_fd.
Fixes#7089
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Kubernetes and Containerd will help calculate the Sandbox Size and pass it to
Kata Containers through annotations.
In order to accommodate this favorable change and be compatible with the past,
we have implemented the handling of the number of vCPUs in runtime-rs. This is
This is slightly different from the original runtime-go design.
This doc introduce how we handle vCPU size in runtime-rs.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
In this commit, we refactored the logic of static resource management.
We defined the sandbox size calculated from PodSandbox's annotation and
SingleContainer's spec as initial size, which will always be the sandbox
size when booting the VM.
The configuration static_sandbox_resource_mgmt controls whether we will
modify the sandbox size in the following container operation.
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Some vmms, such as dragonball, will actively help us
perform online cpu operations when doing cpu hotplug.
Under the old onlineCpuMem interface, it is difficult
to adapt to this situation.
So we modify the semantics of nb_cpus in onlineCpuMemRequest.
In the original semantics, nb_cpus represents the number of
newly added CPUs that need to be online. The modified
semantics become that the number of online CPUs in the guest
needs to be guaranteed.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
The declaration of the cpu number in the cpuset is greater
than the actual number of vcpus, which will cause an error when
updating the cgroup in the guest.
This problem is difficult to solve, so we temporarily clean up
the cpuset in the container spec before passing in the agent.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Update the resource when delete container, which is in
stop_process in runtime-rs.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Updating vCPU resources and memory resources of the sandbox and
updating cgroups on the host will always happening together, and
they are all updated based on the linux resources declarations of
all the containers.
So we merge update_cgroups into the update_linux_resources, so we
can better manage the resources allocated to one pod in the host.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Support vcpu resizing on runtime side:
1. Calculate vcpu numbers in resource_manager using all the containers'
linux_resources in the spec.
2. Call the hypervisor(vmm) to do the vcpu resize.
3. Call the agent to online vcpus.
Fixes: #5030
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Nobody has volunteered to maintain the (currently broken) snap build, so
remove it.
Fixes: #6769.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
After we support memory resize in Dragonball, we need to update
Cargo.lock in runtime-rs.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Adding vhost and vhost-net to the kernel modules. These do not require
any kernel module parameters to be checked. Currently, kernel params is
a required field. Make this as optional. Could make this as <Option>,
but making this a slice instead, as a module could have multiple kernel
params. Refactor the function that checks are for kernel modules into
two with one specifically checking if the module is loaded and other
checking for module parameters.
Refactor some of the tests to take into account these changes.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This adds the glibc flavor of CLH to the list of assets as preparation
for #6839. Mariner Kata is only tested with glibc.
Fixes: #7026
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
We introduce virtio-balloon device to support memory resize.
virtio-balloon device could reclaim memory from guest to host.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
We introduce virtio-mem device to support memory resize. virtio-mem
device could hot-plug more memory blocks to guest and could also
hot-unplug them from guest.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
As block/direct volume use similar steps of device adding,
so making full use of block volume code is a better way to
handle direct volume.
the only different point is that direct volume will use
DirectVolume and get_volume_mount_info to parse mountinfo.json
from the direct volume path. That's to say, direct volume needs
the help of `kata-ctl direct-volume ...`.
Details seen at Advanced Topics:
[How to run Kata Containers with kinds of Block Volumes]
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.md
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
In order to support virtio-mem and virtio-balloon devices, we need to
extend DeviceOpContext with VmConfigInfo and InstanceInfo.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
The key aspects of the DM implementation refactoring as below:
1. reduce duplicated code
Many scenarios have similar steps when adding devices. so to reduce
duplicated code, we should create a common method abstracted and use
it in various scenarios.
do_handle_device:
(1) new_device with DeviceConfig and return device_id;
(2) try_add_device with device_id and do really add device;
(3) return device info of device's info;
2. return full info of Device Trait get_device_info
replace the original type DeviceConfig with full info DeviceType.
3. refactor find_device method.
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
After we have a guest kernel with builtin initramfs which
provide the rootfs measurement capability and Kata rootfs
image with hash device, we need set related root hash value
and measure config to the kernel params in kata configuration file.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.
Add userspace command line tools for disk encryption support
and ext4 file system utilities.
Fixes: #6674
Signed-off-by: Arron Wang <arron.wang@intel.com>
This PR updates the link to the correspondent Developer Guide at the
enabling full containerd debug that we have for kata 2.0 documentation.
Fixes#7034
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
When the version of libc is upgraded to 0.2.145, older getrandom could not adapt
to new API, and this will make agent-ctl fail to compile.
We upgrade the version of `rand`, so the low version of getrandom will no longer
need.
Fixes: #7032
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Fixes: #5401, #6654
- Switch kata-ctl from eprintln!()/println!() to structured logging via
the logging library which uses slog.
- Adds a new create_term_logger() library call which enables printing
log messages to the terminal via a less verbose / more human readable
terminal format with colors.
- Adds --log-level argument to select the minimum log level of printed messages.
- Adds --json-logging argument to switch to logging in JSON format.
Co-authored-by: Byron Marohn <byron.marohn@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
Github Actions reads and runs workflow files from the main branch,
rather than from the PR branch. This means that PRs that modify workflow
files aren't being tested with the updated workflows coming from the PR,
but rather with the old workflows from the main branch. AFAIK, this
behavior isn't avoidable for workflow files (but is for other scripts).
This makes it very hard to reliably test workflow changes before they're
actually merged into main and leads to issues that we have to hotifx
(see #6983, #6995).
This PR aims to mitigate that by extracting the commands used in
workflows to a separate script file. The way our CI is set up, those
script files are read from the PR branch and thus changes would be
reflected in the CI checks.
Fixes: #6971
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>