The pullimage handler behaves differently for the sandbox and container,
and it uses the annotations to differentiate between the two.
The annotation name used by crio for container-type being different than
the containerd one, we need to check both.
Fixes: #8399
Signed-off-by: Julien Ropé <jrope@redhat.com>
The release candidates for image-rs, AA, and td-shim have
now been tagged with v0.8.0. Point the respective toml
and yaml files to this tag.
Fixes: #8352
Signed-off-by: Chris Porter <porter@ibm.com>
Restricting access to agent endpoints using agent-config.toml is
expected to be deprecated in the main branch. Therefore, in
preparation of merging this script with its main branch version,
install default settings for main branch's kata-opa service.
coco-default.rego blocks access to the same kata agent endpoints
that are blocked by agent-config.toml. For additional information,
search for "default-policy.rego" in main branch's rootfs.sh.
Fixes: #8219
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Bump version of image-rs and reqwest.
Build image-rs with the signature-simple-xrss feature flag.
Fixes: #8221
Signed-off-by: Matthew Arnold <mattarno@uk.ibm.com>
This is to define `SEALED_SECRET` as yes for a make target `cc-rootfs-initrd-tarball`,
which makes a service `confidential-data-hub` available with an initrd-based VM creation.
Fixes: #8210
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
kata-shim CCv0 does not propagate dynamically
updated k8s secret values due to incorrect
file name matching. This patch fixes the the wrong file
name matching for k8s secret volume paths.
Note that this problem has already fixed in the main
branch.
Fixes: #8208
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
We're facing errors in the operator CI, which are related to the
measured rootfs.
For now, let's skip the cache builds (as those were dropped anyways for
this branch) and ensure we do a clean build, and then check if the
problem persists.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The guesthook config was missing which prevented handling
of GPUs with remote hypervisor
Fixes: #8172
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
This is a patch that should **NOT** be forward ported to main, as there
we want to take a cleaner approach on configuring specific snapshotters
for specific runtime handlers.
However, for CC, for the v0.8.0 release of CC, this is good enough as it
is, and it'll allow us to set one snapshotter for all the deployments
done with the CoCo Operator.
This is the Kata Containers counterpart of the work, and there's still
work to be done on the Confidential Containers in order to make it work
as expected, as:
* Confidential Containers Operator has to expose to the users which
snapshotter will be configured
* Confidential Containers Opereator, specifically the pre-install hook,
will have to take care of actually installing and configuring the
snapshotter, so it can be used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Fix setting the image service policy path when there
is a policy path in the agent config.
Fixes#8049
Signed-off-by: Matthew Arnold <mattarno@uk.ibm.com>
The name of the tarballs changed on main, but we didn't follow up
changing this on the CCv0 branch. :-/
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Fix the arch error when downloading the nydus tarball.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Signed-off-by: Steven Horsman <steven@uk.ibm.com>
(cherry picked from commit f6df3d6efb)
Guest rootfs will aligned to 128M, we may exceed the rootfs
with several megabytes but the rootfs will add 128M.
Fixes: #8009
Signed-off-by: Wang, Arron <arron.wang@intel.com>
As we'll need to test with both the vanilla and the forked versions of
containerd, let's make sure we'll specify both entries as part of our
versions.yaml file, and we can read whatever we need accordingly as part
of our tests jobs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we do not employ a forked containerd, we utilize the KataVirtualVolume
which storing the image url supplied by snapshotter as an integral part of `CreateContainer`.
Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`.
This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage.
To maintain clarity and avoid any need for modification to the `OverlayfsHandler`,we introduce the `ImagePullHandler`.
This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment.
This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`,
followed by a bind mount to `/run/kata-containers/{container_id}`.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Without using forked containerd, the kata-agent wouldn't receive the `PullImageRequest`.
To using nydus-snapshotter, kata-agent can pass the image url and container id to image-rs
to handle pulling image.So we need to redefine functions of pulling image in the guest to support
both PullImageRequest and remote snapshotter.
1) Extract codes for setting proxy environment variables into a separate function `set_proxy_env_vars`.
2) Create a separate function `handle_attestation_agent` to handle attestation agent
initialization.
3) Create a separate function `common_image_pull` for image pull logic.
4) Extract codes for unpacking pause image into a separate function `unpack_pause_image` and pass the necessary parameters to customize the behavior.
Fixes#7790
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
Co-authored-by: jordan9500 <jordan.jackson@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
This PR is to prevent rootfs.sh from running twice by fixing the typo `initrd-image`.
Fixes: #7980
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This PR is to remove the build redundancy of `kernel` and `cc-rootfs-initrd` by making `cc-se-image` built based on them at the second build stage.
Fixes: #7949
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
- Switch api-server-rest to use the Makefile rather than
directly calling cargo for multi-platform support and decoupling
Fixes: #7947
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We should configure the Rust environment when AGENT_SOURCE_BIN is empty or AA_KBC is not empty.
Fixes#7877
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
When creating a container with a raw disk image using virtio-blk,
the guest does not have the upper directory and worker directory present.
Therefore, it is necessary to create these directories before mounting the filesystem with overlay.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
We utilize the KataVirtualVolume which storing the dm-verity info
and the path of disk image on the host supplied by snapshotter as an integral part of `CreateContainer`.
Within this process, we copy the verity info and the disk image path to mount slice to create a block device by virtio-blk.
Then storing the `lowerdir` in rootfs.storage which is the mountpoint of the verity path through `CreateContainerRequest`.
To maintain clarity and avoid any need for modification to the `VirtioBlkPciHandler`,we introduce the `DmVerityHandler`.
This dedicated handler is responsible for calling image-rs to create verity device and mount the device to the `lowerdir` within the guest environment.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
1) Creating storage for each `extraoption` in rootFs.Options,
and then aggregates all storages into `containerStorages`.
2) Creating storage for other data volumes and push them into `volumeStorages`.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
To enhance the construction and administration of `Katavirtualvolume` storages,
this commit expands the 'sharedFile' structure to manage both
rootfs storages(`containerStorages`) including `Katavirtualvolume` and other data volumes storages(`volumeStorages`).
NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes.
Currently, `KataVirtualVolume` is exclusively employed for container rootfs, hence only `containerStorages` is actively utilized.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
The snapshotter will place `KataVirtualVolume` information
into 'rootfs.options' and commence with the prefix 'io.katacontainers.volume='.
The purpose of this commit is to transform the encapsulated KataVirtualVolume data into device information.
Fixes#7792
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
Co-authored-by: Feng Wang <feng.wang@databricks.com>
Co-authored-by: Samuel Ortiz <sameo@linux.intel.com>
Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>
Add the corresponding data structure in the runtime part according to
kata-containers/kata-containers/pull/7698.
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the any combinator.
Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Allow `clippy::redundant-closure-call`
which has issues with the guard function passed into
the `run_if_auto_values` macro
Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The bindgen generated code is triggering lots of
ambiguous-glob-reexports warnings in rust 1.70+
Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- In rust 1.72, clippy warned clippy::non-minimal-cfg
as the cfg has only one condition, so doesn't
need to be wrapped in the all combinators.
Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Allow `clippy::redundant-closure-call` in `from_cmdline`
which has issues with the guard function passed into
the `parse_cmdline_param` macro
Fixes: #7902
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
1. Directly support CgroupManager::freeze through systemd API.
2. Avoid always passing unit_name by storing it into DBusClient.
3. Realize CgroupManager::destroy more accurately by killing systemd unit rather than stop it.
4. Ignore no such unit error when destroying systemd unit.
5. Update zbus version and corresponding interface file.
Acknowledgement: error handling for no such systemd unit error refers to
Fixes: #7080, #7142, #7143, #7166
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(cherry picked from commit 470d065415)
All the patches have already been merged upstream and they've just been
cherry-picked to this branch.
Fixes: #7885
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fde34610cd)
Conflicts:
tools/packaging/kernel/kata_config_version
We're bumping here in order to make our lives easier backporting EROFS
patches needed for the CC related work.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit dc6a4588a2)
If 'rest_api' is configured, start api-server-rest after
attestation-agent and confidential-data-hub.
Fixes: #7555
Signed-off-by: Biao Lu <biao.lu@intel.com>
Add configuration for 'rest api server'.
Optional configurations are
'agent.rest_api=attestation' will enable attestation api
'agent.rest_api=resource' will enable resource api
'agent.rest_api=all' will enable all (attestation and resource) api
Fixes: #7555
Signed-off-by: Biao lu <biao.lu@intel.com>
confidential-data-hub depends attestation-agent, and
confidential-data-hab need to start before rpc server, so move the
function 'init_attestation_agent' from image_rpc.rs to main.rs and
launch confidential-data-hub after 'init_attestation_agent'.
Fixes: #7544
Signed-off-by: Biao Lu <biao.lu@intel.com>
This PR is to skip installing docker-compose-plugin while buiding a `build-kata-deploy` image for s390x|ppc64le.
It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`.
Fixes: #7848
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
(cherry picked from commit 2efda20c77)
This kernel update is needed in order to get the latest and greatest
commits related to EROFS, which will be used for allowing sharing the
container images between the guest and host for Confidential Containers
using the tarfs mode of EROFS.
We're removing a few options here, because:
* SECURITY_SELINUX_CHECKREQPROT_VALUE was deprecated as part of
a7e4676e8e2c.
* CONFIG_IP_NF_TARGET_CLUSTERIP was removed as part of 9db5d918e2c0.
* CONFIG_NET_SCH_CBQ was removed as part of 051d44209842.
Fixes: #7845
Backports: #7846
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When set SEALED_SECRET to "yes", the kata-agent is built with
sealed-secret capability, default value is "no".
Fixes: #7544
Signed-off-by: Biao Lu <biao.lu@intel.com>
When a storage device is used by more than one container, the second
and forth instances will cause storage device reference count leakage,
thus cause storage device leakage. The reason is:
add_storages() will increase reference count of existing storage device,
but forget to add the device to the `mount_list` array, thus leak the
reference count.
Fixes: #7820
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Today I learned, I must say.
When running a basic script, such as:
```bash
#/usr/bin/env bash
set -o errexit
set -o pipefail
set -o errtrace
cat junk && echo "hello"
echo "didn't fail"
cat junk
echo "hello"
echo "didn't fail"
```
One will get as a result:
```bash
cat: junk: No such file or directory
didn't fail
cat: junk: No such file or directory
```
Meaning that although there was an error on `cat junk && echo "hello"`,
and the `echo "hello"` part was not executed, an error was not reported
for that failure.
On the second part, though, it just breaks and returns an error as
expected.
Small scripts aside, this is exactly what was happening with the
attestation-agent, where a `make ... && make install ...` was being
called, make was failing but not actually breaking the script.
Let's change the logic and avoid such situations in the future, as it
caused our CI to be broken for quite some time without a simple way to
detect that line in the huge amount of logs left behind.
Here goes a reference to the documentation:
```
-e Exit immediately if a pipeline (which may consist
of a single simple command), a list, or a compound
command (see SHELL GRAMMAR above), exits with a
non-zero status. The shell does not exit if the
command that fails is part of the command list
immediately following a while or until keyword,
part of the test following the if or elif reserved
words, part of any command executed in a && or ||
list except the command following the final && or
||, any command in a pipeline but the last, or if
the command's return value is being inverted with
!. If a compound command other than a subshell
returns a non-zero status because a command failed
while -e was being ignored, the shell does not
exit. A trap on ERR, if set, is executed before
the shell exits. This option applies to the shell
environment and each subshell environment
separately (see COMMAND EXECUTION ENVIRONMENT
above), and may cause subshells to exit before
executing all the commands in the subshell.
If a compound command or shell function executes
in a context where -e is being ignored, none of
the commands executed within the compound command
or function body will be affected by the -e
setting, even if -e is set and a command returns a
failure status. If a compound command or shell
function sets -e while executing in a context
where -e is ignored, that setting will not have
any effect until the compound command or the
command containing the function call completes.
```
This comes from https://www.man7.org/linux/man-pages/man1/bash.1.htmlFixes: #7793
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Bump image-rs and attestation-agent to use the latest guest-components
with the rust clap version fix
Fixes: #7580
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Refine storage related code by:
- remove the STORAGE_HANDLER_LIST
- define type alias
- move code near to its caller
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Introduce StorageDevice and StorageHandlerManager, which will be used
to refine storage device management for kata-agent.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Simplify the way to manage storage objects, and introduce
StorageStateCommon structures for coming extensions.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
docker install now creates a group with gid 999 which happens to match what we
need to get docker-in-docker to work. Remove the group first as we don't need
it.
Fixes: #7726
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 3b881fbc0e)
The directory is a host path mount and cannot be removed from within the
container. What we actually want to remove is whatever is inside that
directory.
This may raise errors like:
```
rm: cannot remove '/opt/kata/': Device or resource busy
```
Fixes: #7746
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Introduce structure KataVirtualVolume to to encapsulate information
for extra mount options and direct volumes, so we could build a common
infrastructure to handle these cases.
Fixes: #7699
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
We can simply use `rm -f` all over the place and avoid the container
returning any error.
Fixes: #7733
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5cba38c175)
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.
Fixes: #7681
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Without this library the builds are failing with the following error:
```
...
error: failed to run custom build command for `devicemapper-sys v0.1.5`
Caused by: process didn't exit successfully:
`/kata-containers/src/agent/target/release/build/devicemapper-sys-d8eae524a127e049/build-script-build`
(exit status: 101) --- stderr thread 'main' panicked at 'Unable to
find libclang: "couldn't find any valid shared libraries matching:
['libclang.so', 'libclang-*.so', 'libclang.so.*', 'libclang-*.so.*'],
set the `LIBCLANG_PATH` environment variable to a path where one of
these files can be found (invalid: [])"',
/root/.cargo/registry/src/github.com-1ecc6299db9ec823/bindgen-0.63.0/./lib.rs:2338:31
```
Fixes: #7580
Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Add k0s support to kata-deploy, in the very same way kata-containers
already supports k3s, and rke2.
k0s support requires v1.27.1, which is noted as part of the kata-deploy
documentation, as it's the way to use dynamic configuration on
containerd CRI runtimes.
This support will only be part of the `main` branch, as it's not a bug
fix that can be backported to the `stable-3.2` branch, and this is also
noted as part of the documentation.
Fixes: #7548
Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>
The GHA runners are not exactly powerful, which makes the static-checks
take way too long (almost an hour).
Let's give a try and move those to the same size of Azure instances used
as part of our CI, and probably have this time reduced.
Fixes: #7446
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Specify the supported version of Firecracker using our `versions.yaml`
to improve the maintainability of the documentation.
Fixes: #7610
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
This patch upgrades Firecracker version from v1.1.0 to v1.4.0.
* Generate swagger models for v1.4.0 (from `firecracker.yaml`)
- The version of go-swagger used is v0.30.0
* The firecracker v1.4.0 includes the following changes.
- Added
* Added support for custom CPU templates allowing users to adjust vCPU features
exposed to the guest via CPUID, MSRs and ARM registers.
* Introduced V1N1 static CPU template for ARM to represent Neoverse V1 CPU
as Neoverse N1.
* Added support for the virtio-rng entropy device. The device is optional. A
single device can be enabled per VM using the /entropy endpoint.
* Added a cpu-template-helper tool for assisting with creating and managing
custom CPU templates.
- Changed
* Set FDP_EXCPTN_ONLY bit (CPUID.7h.0:EBX[6]) and ZERO_FCS_FDS bit
(CPUID.7h.0:EBX[13]) in Intel's CPUID normalization process.
- Fixed
* Fixed feature flags in T2S CPU template on Intel Ice Lake.
* Fixed CPUID leaf 0xb to be exposed to guests running on AMD host.
* Fixed a performance regression in the jailer logic for closing open file
descriptors.
* A race condition that has been identified between the API thread and the VMM
thread due to a misconfiguration of the api_event_fd.
* Fixed CPUID leaf 0x1 to disable perfmon and debug feature on x86 host.
* Fixed passing through cache information from host in CPUID leaf 0x80000006.
* Fixed the T2S CPU template to set the RRSBA bit of the IA32_ARCH_CAPABILITIES
MSR to 1 in accordance with an Intel microcode update.
* Fixed the T2CL CPU template to pass through the RSBA and RRSBA bits of the
IA32_ARCH_CAPABILITIES MSR from the host in accordance with an Intel microcode
update.
* Fixed passing through cache information from host in CPUID leaf 0x80000005.
* Fixed the T2A CPU template to disable SVM (nested virtualization).
* Fixed the T2A CPU template to set EferLmsleUnsupported bit
(CPUID.80000008h:EBX[20]), which indicates that EFER[LMSLE] is not supported.
Fixes: #7610
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
This PR computes average results for TF bench.
Additionally, it improves the data parsing from
all running containers.
Fixes: #7603
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
The build context should be the folder where the Dockerfile is present,
otherwise the files copied into the image won't be found.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Following the example on https://github.com/docker/build-push-action,
it's clear that the actions to "Set up QEMU" and "Set up Docker Buildx"
are missing.
Let's add them, and also take the advantage to bump the
build-push-action to its v4, which, by the way, had a typo on its name
(build-and-push-action does **NOT** exist, build-push-action does).
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Right now this is not being used, but it'll as the image generated for
the confidential tests have that as part of their tag.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
One of the joys to rely on the `pull_request_target` is to only be able
to catch those after those are merged.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.
Fixes: #7491
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add here the image we'll be using for unencrypted confidential
tests. Later on, we'll make sure to build and use this image as part of
our CI.
The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.
Fixes: #7595
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We don't have to do any sed to replace the runtimeclass being used by
the moment we start taking advantage of the `DEFAULT_SHIM` environment
variable exposed merged in the previous commits.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There are many places where the code currently creates new `Vec`
instances when it's not really needed. The result is a perf hit because
it allocates memory, copies all elements, then frees the memory; in some
cases, copying elements also involves extra allocations (e.g., when
elements are strings, or structs containing strings).
This patch addresses a number of these cases.
Fixes: #7203
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Instead of using package manager to install bats, building
this from source. This gives us the updated version of bats
which supports functions such as setup_file and
teardown_file.
We can use these functions into our current tests.
Fixes: #7597
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
This PR changes the metrics workflow in order to just install
kata once, and run the checks for multiple hypervisor variations.
In this way we save time avoiding installing kata for each
hypervisor to be tested.
Fixes: #7578
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This pull request is mainly for updating vm-memory and vmm-sys-util.
The affacted crates include:
- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4
Fixes: #0000
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
(cherry picked from commit b23c5ed155)
- Add support for setting the sandbox name and namespace
in the hypervisor config, which is needed in the remote hypervisor
implementation to get the pod name and namespace for the remote pod
create request
Fixes: #7588
Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Co-authored-by: Yohei Ueda <yohei@jp.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Refine implementation of mount by:
- log message with `path.display()` instead of `{:?}`
- add prefix "_" to unused variables
- pass by reference instead of by value to avoid creating redundant
array
- exactly matching prefix "fsgid=" instead of "fsgid"
- avoid redundant clone() operations
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
There's a bug in function update_ephemeral_mounts() which only handles
the first storage object and ignores all other storage objects.
Fixes: #7551
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Simplify function online_cpu_memory() by on calling update_cpuset_path()
for containers with cpuset configured.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Refine style of code related to sandbox by:
- remove unnecessary comments for caller to take lock, we have already taken
`&mut self`.
- change "*count < 1 " to "*count == 0", `count` is type of u32.
- make remove_sandbox_storage() to take `&mut self` instead of `&self`.
- group related function to each others
- avoid search the map twice in function find_process()
- avoid unwrap() in function run_oom_event_monitor()
- avoid unwrap() in online_resources()
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Avoid unwrap() in function do_remove_container(), and also make
implmementation symmetric for both timeout and non-timeout cases.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Optimize agent rpc implementation by:
- avoid clone objects when possible
- avoid unwrap() when possible
- explictly drop object to ensure order
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
This pull request is mainly for updating vm-memory and vmm-sys-util.
The affacted crates include:
- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4
Fixes: #0000
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
This PR renames the mobilenet tensorflow test to have a more specific
tensorflow name mainly because tensorflow has different configurations
and we will add more tensorflow tests so we want to distinguish each
tensorflow test.
Fixes#7571
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the iperf network metrics to the github actions
for kata metrics.
Fixes#7535
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
It's CCv0 specific for now, and it's needed as the Operator is now
delegating the runtimeclass creation to the kata-deploy daemonset.
Fixes: #7550
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It's CCv0 specific for now, and it's needed as the Operator is now
delegating the runtimeclass creation to the kata-deploy daemonset.
Fixes: #7550
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It's CCv0 specific for now, and it's needed as the Operator is now
delegating the runtimeclass creation to the kata-deploy daemonset.
Fixes: #7550
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Steve pointed this out, and I was able to get it fixed as part of
cc-payload-amd64.yaml but I missed the cc-payload-after-push-amd64.yaml
one.
Fixes: #7433
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
These calls cause two extra atomic instructions each time they're used,
one to increment and another one to decrement the refcount.
Since we don't need them because the referred value is guaranteed to
outlive the function, remove the calls.
Fixes: #7190
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
When the mounted block device isn't a layer, we want to mount it into
containers, but since it's already mounted with the correct fs (e.g.,
tar, ext4, etc.) in the pod, we just bind-mount it into the container.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
When at least one `io.katacontainers.fs-opt.layer` option is added to
the rootfs, it gets inserted into the VM as a layer, and the file system
is mounted as an overlay of all layers using the overlayfs driver.
Additionally, if the `io.katacontainers.fs-opt.block_device=file` option
is present in a layer, it is mounted as a block device backed by a file
on the host.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This causes the overlay-fs driver to add the `upperdir` and `workdir`
options to an overlay-fs mount so that the mount becomes writable using
a discardable directory under the container id.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This is so that file systems don't fail when we pass kata-specific
options from the snapshotter to kata.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Version 0.10.5, which was just released, breaks `nydus-storage`.
This is a workaround to fix the CI which is blocking other PRs.
Fixes: #7541
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Allow `clippy::redundant_clone` in the agent's unit tests
because rustc>=1.70 shows the errors as false-negatives.
These `clone()` are required because the following codes
refer to the variable, but the clippy analyzes them by mistake,
using the conservative and limited approach.
Ref. https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_cloneFixes: #7534
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
As 3.2.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Kata containers as VM-based containers are allowed to run in the host
netns. That is, the network is able to isolate in the L2. The network
performance will benefit from this architecture, which eliminates as many
hops as possible. We called it a Directly Attachable Network (DAN for
short).
The network devices are placed at the host netns by the CNI plugins. The
configs are saved at {dan_conf}/{sandbox_id}.json in the format of JSON,
including device name, type, and network info. At the very beginning stage,
the DAN only supports host tap devices. More devices, like the DPDK, will
be supported in later versions.
The format of file looks like as below:
```json
{
"netns": "/path/to/netns",
"devices": [{
"name": "eth0",
"guest_mac": "xx:xx:xx:xx:xx",
"device": {
"type": "vhost-user",
"path": "/tmp/test",
"queue_num": 1,
"queue_size": 1
},
"network_info": {
"interface": {
"ip_addresses": ["192.168.0.1/24"],
"mtu": 1500,
"ntype": "tuntap",
"flags": 0
},
"routes": [{
"dest": "172.18.0.0/16",
"source": "172.18.0.1",
"gateway": "172.18.31.1",
"scope": 0,
"flags": 0
}],
"neighbors": [{
"ip_address": "192.168.0.3/16",
"device": "",
"state": 0,
"flags": 0,
"hardware_addr": "xx:xx:xx:xx:xx"
}]
}
}]
}
```
Fixes: #1922
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
This commit provides a new way to name the containers used
in the launch-times-test in this form:
'kata_launch_times_RANDOM_NUMBER', where RANDOM_NUMBER is
in the 0-1000 range.
Fixes: #7529
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
- ci-on-push: Make the CI also run for the stable-* branches
- ci: k8s: Do not fail when gathering info on AKS nodes
- kata-deploy: enable cross build for non-x86
- runtime-rs: add support for gather metrics in runtime-rs
- kata-ctl: add monitor subcommand for runtime-rs
- release: release-note.sh: Fix typos and reference to images
- metrics: Add sysbench performance test
- Simplify implementation of runtime-rs/service
6ad16d497 release: Adapt kata-deploy for 3.2.0-rc0
025596b28 ci-on-push: Make the CI also run for the stable-* branches
7ffc0c122 static-build: enable cross build for qemu
35d6d86ab static-build: enable cross-build for image build
2205fb9d0 static-build: enable cross build for virtiofsd
11631c681 static-build: enable cross build for shim-v2
7923de899 static-build: cross build kernel
e2c31fce2 kata-deploy: enable cross build for kata deploy script
2fc5f0e2e kata-depoly: prepare env for cross build in lib.sh
f5e9985af release: release-note.sh: Fix typos and reference to images
f910c66d6 ci: k8s: Do not fail when gathering info on AKS nodes
632818176 metrics: Add k8s sysbench documentation
b3901c46d runtime-rs: ignore errors during clean up sandbox resources
5a1b5d367 metrics: Add sysbench pod yaml
ad413d164 metrics: Add sysbench dockerfile
151256011 metrics: Add sysbench performance test
62e328ca5 runtime-rs: refine implementation of TaskService
458e1bc71 runtime-rs: make send_message() as an method of ServiceManager
1cc1c81c9 runtime-rs: fix possibe bug in ServiceManager::run()
1a5f90dc3 runtime-rs: simplify implementation of service crate
731e7c763 kata-ctl: add monitor subcommand for runtime-rs The previous kata-monitor in golang could not communicate with runtime-rs to gather metrics due to different sandbox addresses. This PR adds the subcommand monitor in kata-ctl to gather metrics from runtime-rs and monitor itself.
d74639d8c kata-ctl: provide the global TIMEOUT for creating MgmtClient
02cc4fe9d runtime-rs: add support for gather metrics in runtime-rs
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
when interacting with systemd. We have occasionally faced issues with
compatibility between the systemctl version used inside the kata-deploy
container and the systemd version on the host. Instead of using a containerized
systemctl with bind mounted sockets, nsenter the host and run systemctl from
there. This provides less coupling between the kata-deploy container and the
host.
Fixes: #7511
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
As we only support one stable branch, it'll be used as part of the
stable-3.2 and onwards.
Fixes: #7518
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Depends on mutiarch feature of ubuntu, we can set up cross build
environment easily and achive as good build performance as native
build.
Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
It's too long a time to cross build agent based on docker buildx, thus
we cross build rootfs based on a container with cross compile toolchain
of gcc and rust with musl libc. Then we get fast build just like native
build.
rootfs initrd cross build is disabled as no cross compile tolchain for
rust with musl lib if found for alpine and based on docker buildx takes
too long a time.
Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Based on messense/rust-musl-cross which offer cross build musl lib
environment to cross compile virtiofsd.
Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
shim-v2 has go and rust code. For rust code, we use messense/rust-musl-cross
to build for speed up as it doesn't depends on qemu emulation. Build go
code based on docker buildx as it doesn't support cross build now.
Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
kata-deploy-binaries-in-docker.sh is the entry to build kata components.
set some environment to facilitate the following cross build work.
Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
We leverage three env, TARGET_ARCH means the buid target tuple;
ARCH nearly the same meaning with TARGET_ARCH but has been widely
used in kata; CROSS_BUILD means if you want to do cross compile.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
diferent -> different
And also let's make sure we escape the backticks around the kata-deploy
environment variables, otherwise bash will try to interpret those.
Fixes: #7497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise the VM deletion may not delete, leaving us with several
machines behind.
Fixes: #7509
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This requires the GITHUB_UPLOAD_TOKEN. While we're here, let's also fix
the name of the action and remove the "-tarball" suffix, as it's not
really a tarball.
Fixes: #7497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
`stage` has been added, but only hooked up to the amd64 logic, leaving
arm64 and s390x behind.
Let's fix this right now, and make sure no error occurs when passing
this down to the yaml files.
Fixes: #7497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 7c857d38c1.
I've misunderstood the error given by github action, let's fix this in
the next commit.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'll face the following error as part of our GHA:
```
The workflow is not valid.
kata-containers/kata-containers/.github/workflows/release-$foo.yaml
(Line: 13, Col: 14): Invalid input, stage is not defined in the
referenced workflow.
```
Fixes: #7497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we can rely on the tests passing down whether they want
to be tested using Node Feataure Discovery or not.
Right now, only the TDX job has this option set to "true", all the other
jobs have this option set to "false".
We can and have to merge this one before merging the NFD related patches
as:
1) It causes no harm in exporting this environment variable, but not
having it used
2) It will allow us to test the NFD after this one is merged, as changes
in the yaml file, in the case of the pull_request_target event, are
not taken into consideration before they're merged
Fixes: #7495
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If modeVFIO is enabled we need 1st to attach the VFIO control group
device /dev/vfio/vfio an 2nd the actuall device(s) afterwards.Sort the
devices starting with device #1 being the VFIO control group device and
the next the actuall device(s)
/dev/vfio/<group>
Fixes: #7493
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
- tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI
- metrics: Update boot time for kata metrics
- metrics: Add FIO report files for kata metrics
- kata-deploy: Allow runtimeclasses to be created by the daemonset
- runtime-rs: change block index to 0
- agent: fix typo in constant
- metrics: Add FIO benchmark for metrics tests
- gha: dragonball: Run only on the dragonball labeled machine
- tests: Fix `k8s-job` test
- agent,libs: Remove unused 'mut' keywords
- runtime-rs: remove unneeded 'mut' keywords
- tests: QoL improvements for running tests locally
- agent: exclude symlinks from recursive ownership change
- cache: kernel: Fix kernel caching
- runk: Add Docker guide to README
- metrics: General improvements to json.bash script
- kata-deploy: Allow shim creation based on what's passed to the daemonset
- gha: ci: Add skeleton of vfio job
- s390x: Fixing device.Bus assignment
- release: Mention the container images used to build the project
- kata-deploy-binaries: kernel_cache: Take module_dir into account
- ci: nydus: Fix typo in "source"
- gha: ci: Add no-op nydus tests to our CI
- Dragonball: migrate dragonball-sandbox crates to Kata
- ci: gha: Add cri-containerd tests (but still do not enable them)
- packaging/tools: Add kata-debug and use it as part of our CI
- cache: kernel: Consider changes in tools/packaging/kernel
- kata-deploy: Properly get the path of the versions.yaml file
- kata-deploy: Add VERSION and versions.yaml to the final tarball
- metrics: Add C-Ray performance test
- metrics: enable TensorFlow benchmark to be run on gha
- metrics: Add function to memory inside container script
- Revert "metrics: Replace backslashes used to escape double quoted key in jq expr"
- versions: Bump virtiofsd to v1.7.0
- metrics: stop hypervirsor and shim at init_env stage
- ci: k8s: Adapt "source ..." to the new location of gha-run.sh
- ci: Move `tests/integration/gha-run.sh` to `tests/integration/kuberentes/` ... and also remove KUBECONFIG from the tdx envs
- versions: Update kernel to version v6.1.x
- agent: Fix exec hang issues with a backgroud process
- agent: Ignore already mounted dev/fs/pseudo-fs
- ci: k8s: Bring TDX tests back
- metrics: Update machine learning documentation
- gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
- tests: Add MobileNet Tensorflow performance benchmark
- metrics: replace backslashes used to escape double quoted jq key expr.
- runtime-rs: enhancement of Device Manager for network endpoints.
- feat(Tracing): tracing in Rust runtime
- runtime-rs: ignore unconfigured network interfaces
- metrics: Stop running kata-env before kata is properly installed.
- metrics: use rm -f to remove the oldest continerd config file.
- kernel: Update kernel config name
- kata-deploy: Add a debug option to kata-deploy (and also use it as part of our CI)
- runtime-rs: add parameter for propagation of (u)mount events
- kata-ctl: Move GuestProtection code to kata-sys-util
- tests: Add function before function name in common.bash for metrics
- tests: Add metrics storage documentation
- metrics: Fix metrics ts generator to treat numbers as decimals
- gha: ci: Add cri-containerd tests skeleton -- follow up 1
- dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64
- metrics: Enable blogbench test
- tests: Add machine learning performance tests
- tests: gha: ci: Add cri-containerd tests skeleton
- metrics: Enable memory inside container metrics
- tools: Use a consistent target name when building mariner initrd
- gha: ci: Gather info about the node / pods
- runtime-rs: Do not scan network if network model is "none"
- gha: k8s: tdx: Temporarily disable TDX tests
- metrics: Update memory usage script
- gha: Cancel previous jobs if a PR is updated
- gha: nightly: Fix long name of AKS clusters issue and make the CI easier to test
- README: Add badge for our Nightly CI
- gha: Do not run all the tests if only docs are updated
- bugfix: plus default_memory when calculating mem size
- gha: ci: Use github.sha to get the last commit reference
- dragonball: Don't fail if a request asks for more CPUs than allowed
- gha: ci: Fix refernce passed to checkout@v3
- gha: ci: Avoid using env also in the ci-nightly and payload-after-push
- gha: k8s: Ensure cluster doesn't exist before creating it
- gha: ci: More follow up fixes after adding a nightly CI
- tests: Enable running k8s tests on Mariner
- gha: ci: Avoid using env unless it's really needed
- gha: ci: Follow up fixes for the nightly jobs
- tests: Enable memory usage metrics tests
- gha: Add nightly jobs
- metrics: storing metrics workflow artifacts
- gha: k8s: Ensure tests are running on a specific namespace
- metrics: Adds blogbench and webtool metrics tests
- gha: dragonball: Correctly propagate PATH update
- versions: Upgrade to Cloud Hypervisor v33.0
- Convert `is_allowed`, `ttrpc_error` and `sl` to functions
- gha: release: Use a specific release of hub
- metrics: Add checkmetrics to gha-run.sh for metrics CI
- packaging: Fix indentation of build.sh script at ovmf
- doc: Add documentation for the virtualization reference architecture
- gpu: Update kernel building to the latest changes
- runtime: fix PCIe topology for GPUDirect use-case
- metrics: Add memory footprint tests
- runtime: Add "none" as a shared_fs option
- metrics: Uniformity across function names in gha-run.sh
- runtime-rs: support physical endpoint using device manager
- runtime-rs: bugfix for direct volume path's validation.
- metrics: Fix retrieving hypervisor version on metrics
- runtime-rs: fix build error on AArch64
- checkmetrics: Add checkmetrics makefile and documentation
- docs: Add boot time metrics documentation
- runtime-rs: add support spdk/vhost-user based volume.
- static-build: Remove kata-version parameter
- dragonball: avoid obtaining lock twice in create_stdio_console
- metrics: Add checkmetrics for kata metrics CI
- metrics: enable launch-times test on gha-run metrics script
- docs: Add general metrics documentation
- add support vfio device manager
- gha: Don't automatically trigger CI
- kata-ctl: Check for vm capability
- docs: fix spelling of "crate"
- packaging: Fix indentation in init.sh script
- gha: Fix gha actions
- metrics: install kata and launch-times test
- tests: Move tests helper script to this repo
- tests: Add json script for metrics tests
- Cherry pick initramfs caching updates from CCv0
- gha: Fix format for run launchtimes metrics yaml
- tests: Add tests lib common script
- Fix deprecated virtiofsd args (go shim only)
- gha: Add base branch on SHA on pull requst
- gha: ci-on-push: Run metrics tests
- docs: Update Developer Guide
- runtime-rs: Enhance flexibility of virtio-fs config
- versions: Update firecracker version to 1.3.3
- tools: Fix no-op builds
- runtime-rs: update Cargo.lock
- gha: Fix `stage` definition in matrix
- feat(runtime): vcpu resize capability
- packaging: Remove snap package
- gha: Add new build targets for Mariner
- Dragonball: support resize memory
- Port Measured rootfs feature from CCv0 branch to main
- add support direct volume and refactor device manager
- gha: Fix gha-run.sh and unbreak CI
- kata-ctl: Switch to slog logging; add --log-level and --json-logging arguments
- log-parser: Update log parser link at README
- gha: aks: Extract `run` commands to a script
- runtime-rs: handle copy files when share_fs is not available
- agent-ctl: fix the compile error
- agent: fix the issue of exec hang with a backgroud process
- runtime-rs: bugfix: update Cargo.lock
- gha: aks: Use short SHA in cluster name
- README: Display badge for the "Publish Artefacts" job and update the Kata Containers logo
- kata-deploy: Change how we get the Ubuntu k8s key
- gha: aks: Ensure host_os is used everywhere needed
- kubernetes: add agnhost command in pod yaml
- main | release: Standardize kata static file name
- packaging: make BUILDER_REGISTRY configurable
- gha: aks: Add the host_os as part of the aks cluster's name
- kernel: Modify build-kernel.sh to accomodate for changes in version.yaml
- gha: Fix Mariner cluster creation
- gha: Unbreak CI and fix cluster creation step
- Dragonball: support vcpu hotplug on aarch64
- runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
- runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
- gha: Create Mariner host as part of k8s tests
- netlink: Fix the issue of update_interface
- gha: Increase timeout for AKS jobs and give more time to start running the tests
- runtime: sending SIGKILL to qemu
- dragonball: convert BlockDeviceMgr and VirtioNetDeviceMgr functions to methods
- dragonball: Remove virtio-net and vsock devices gracefully
- kata-deploy: Improve shim backup / restore
- doc: Update git commands
- kata-deploy: Fix indentation on kata deploy merge script
8353aae41 ci: k8s: Rework get_nodes_and_pods_info()
6ad5d7112 ci: k8s: Do not gather node info before running the tests
5261e3a60 ci: k8s: Group messages to improve readability
9cc6b5f46 ci: k8s: Get logs from kata-deploy
9d285c622 ci: k8s: Let kata-deploy take care of the runtimeclasses
87568ed98 gha: Test split out runtimeclasses are in sync with all-in-one file
39192c608 kata-deploy: Print variables passed to the script
0e157be6f kata-deploy: Allow runtimeclasses to be created by the daemonset
a27433324 kata-deploy: Change default values of DEBUG
69535b808 kata-deploy: runtimeclass: Split out entries
9e1710674 kata-runtimeClasses: Alphabetically sort the enrties
6222bd910 tests: Add k8s-file-volume test
187a72d38 tests: Add k8s-volume test
0c8427035 metrics: Add boot time value for qemu
6520dfee3 metrics: Update boot time for kata metrics
ff2279061 metrics: Update runtime and configuration paths
a5d4e3388 metrics: Add compare virtiofsd dax script
5e937fa62 metrics: Update general FIO tests
b0bea47c5 metrics: Add makefile to report generator
73c57b9a1 metrics: Add FIO report files for kata metrics
c8fcd29d9 runtime-rs: use device manager to handle virtio-pmem
901c19225 runtime-rs: support configure vm_rootfs_driver
5d6199f9b runtime-rs: use device manager to handle vm rootfs
20f1f62a2 runtime-rs: change block index to 0
662f87539 metrics: Add general FIO makefile
c5a87eed2 tests: gha: Add timeout to cluster creation
6daeb08e6 tests: k8s: Clean up node debuggers after running
3aa6c77a0 gha: dragonball: Run only on the dragonball labeled machine
37641a543 metrics: Add example config for fio jobs
314aec73d agent: fix typo in constant
4703434b1 tests: k8s: Allow using custom resource group
350f3f70b tests: Import `common.bash` in `run_kubernetes_tests.sh`
d7f04a64a tests: k8s: Leave `runtimeclass_workloads/` alone
bdde6aa94 tests: k8s: Split deployment and testing commands
91a0b3b40 tests: aks: Simply delete cluster when cleaning up
3c1044d9d metrics: Update FIO paths for k8s runner
6177a0db3 metrics: Add env files for FIO
a45900324 metrics: Add fio exec
ea198fddc metrics: Add FIO runner k8s
8f7ef41c1 metrics: Add FIO vendor code
6293c17bd metrics: Add FIO benchmark for metrics tests
ff4cfcd8a runk: Add Docker guide to README
c8ac56569 cache: kernel: Harmonize commit with fetching side
81775ab1b cache: kernel: Fix SEV kernel caching
717f775f3 gha: ci: Add skeleton of vfio job
b9f100b39 agent,libs: Remove unused 'mut' keywords
a56f96bb2 kata-deploy: Allow shim creation based on what's passed to the daemonset
4a5ab38f1 metrics: General improvements to json.bash script
d4eba3698 kata-deploy-binaries: kernel_cache: Take module_dir into account
b7c9867d6 release: Mention the container images used to build the project
7c4b59781 ci: nydus: Fix typo in "source"
6a680e241 gha: ci: Add placeholder for the nydus tests as part of the CI
fb4f7a002 gha: nydus: Add a no-op GHA for nydus
4a207a16f gha: nydus: Bring tests as they are from the tests repo
2c8f83424 runtime-rs: remove unneeded 'mut' keywords
1fc715bc6 s390x: Add AP Attach/Detach test
e91f5edba ci: cri-containerd: Fix default typo for testContainerStart()
8b8aef09a ci: cri-containerd: Temporarily disable TestContainerSwap
56767001c ci: cri-containerd: Add namespace / uid to the pods
a84773652 ci: cri-containerd: Always use sudo to call crictl
99ba86a1b ci: cri-containerd: Add /usr/local/go/bin to the PATH
7f3b30999 ci: cri-containerd: Add `function` before each function
fde22d6bc ci: cri-containerd: Assume podman is always used
9465a0496 ci: cri-containerd: Adapt "source ..." to this repo
df8d14411 ci: cri-containerd: Remove CI variable
f90570aef ci: cri-containerd: Remove unused runc_runtime_bin
c3637039f ci: cri-containerd: Remove KILL_VMM_TEST env var
bc4919f9b ci: cri-containerd: Always run shim-v2 tests
f9e332c6d ci: cri-containerd: Stop cloning containerd
cfd662fee ci: cri-containerd: Remove ununsed SNAP_CI var
d36c3395c ci: cri-containerd: Update copyright
b5be8a4a8 ci: cri-containerd: Move integration-tests.sh as it was
f2e00c95c ci: cri-containerd: Populate install_dependencies()
897955252 versions: Add "latest" field for cri-tools
1bbcbafa6 ci: Add clone_cri_container()
f66c68a2b ci: Add install_cri_tools()
4dd828414 ci: Add install_cri_containerd()
ad47d1b9f ci: Add download_github_project_tarball()
788c562a9 ci: Add get_latest_patch_release_from_a_github_project()
6742f3a89 ci: Use `function` before each install_go.sh function
5eacecffc ci: Adjust paths for install_go.sh
8ed1595f9 ci: Update copyright for install_go.sh
6123d0db2 ci: Move install_go.sh as it was
8653be71b ci: Do not take cross-build into consideration for kata-arch.sh
6a76bf92c ci: Fix style / identation if kata-arch.sh
72743851c ci: Add `function` before each kata-arch.sh function
9f6d4892c ci: Update copyright for kata-arch.sh
6f73a7283 ci: Move kata-arch.sh as it was
3615d7343 ci: Add get_from_kata_deps()
34779491e gha: kubernetes: Avoid declaring repo_root_dir
f3738beac tests: Use $HOME/go as fallback for $GOPATH
b87ed2741 tests: Move `ensure_yq` to common.bash
124e39033 tests: common: Fix quoting when globbing
db77c9a43 tests: Make install_kata take care of the links
13715db1f tests: Do not call `install_check_metrics` when installing kata
630634c5d ci: k8s: Group logs to make them easier to read
228b30f31 ci: k8s: Gather node info during the cleanup
81f99543e ci: k8s: Cleanup cluster before deleting it
38a7b5325 packaging/tools: Add kata-debug
ae6e8d2b3 kata-deploy: Properly get the path of the versions.yaml file
309e23255 cache: kernel: Consider changes in tools/packaging/kernel
59fdd69b8 kata-deploy: Add VERSION and versions.yaml to the final tarball
5dddd7c5d release: Upload versions.yaml as part of the release
bad3ac84b metrics: Rename C-Ray to cpu performance tests
87d99a71e versions: Remove "kernel-experimental"
545de5042 vfio: Fix tests
62aa6750e vfio: Added better handling of VFIO Control Devices
dd422ccb6 vfio: Remove obsolete HotplugVFIOonRootBus
114542e2b s390x: Fixing device.Bus assignment
371a118ad agent: exclude symlinks from recursive ownership change
e64edf41e metrics: Add tensorflow function in gha-run script
67a6fff4f metrics: Enable tensorflow benchmark on gha
01450deb6 Revert "metrics: Replace backslashes used to escape double quoted key in jq expr."
843006805 metrics: Add function to memory inside container script
bbd3c1b6a Dragonball: migrate dragonball-sandbox crates to Kata
fad801d0f ci: k8s: Adapt "source ..." to the new location of gha-run.sh
55e2f0955 metrics: stop hypervirsor and shim at init_env stage
556e663fc metrics: Add disk link to general metrics README
98c121709 metrics: Add C-Ray README
8e7d9926e metrics: Add C-Ray Dockerfile
e2ee76978 metrics: Add C-Ray performance test
2ee2cd307 ci: k8s: Move gha-run.sh to the kubernetes dir
88eaff533 ci: tdx: Adjust KUBECONFIG
c09e268a1 versions: Downgrade SEV(-SNP) kernel back to v5.19.x
6a7a32365 versions: Bump virtiofsd to v1.7.0
ac5f5353b ci: k8s: Bring TDX tests back
950b89ffa versions: Update kernel to version v6.1.38
8ccc1e5c9 metrics: Update machine learning documentation
f50d2b066 gha: ci: cri-containerd: Fix KATA_HYPERVSIOR typo
620b94597 metrics: Add Tensorflow Mobilenet documentation
6c91af0a2 agent: Fix exec hang issues with a backgroud process
59f4731bb metrics: Stop running kata-env before kata is properly installed.
468f017e2 metrics: Replace backslashes used to escape double quoted key in jq expr.
64f013f3b ci: k8s: Enable debug when running the tests
8f4b1df9c kata-deploy: Give users the ability to run it on DEBUG mode
2c8dfde16 kernel: Update kernel config name
150e54d02 runtime-rs: ignore unconfigured network interfaces
3ae02f920 metrics: use rm -f to remove older continerd config file.
a864d0e34 tests: Add tensorflow mobilenet dockerfile
788d2a254 tests: Add tensorflow mobilenet performance test
3fed61e7a tests: Add storage link to general metrics documentation
b34dda4ca tests: Add storage blogbench metrics documentation
6787c6390 runtime-rs: add parameter for propagation of (u)mount events
6e5679bc4 tests: Add function before function name in common.bash for metrics
62080f83c kata-sys-util: Fix compilation errors
02d99caf6 static-checks: Make cargo clippy pass.
982420682 agent: Make the static checks pass for agent
61e4032b0 kata-ctl: Remove all utility functions to get platform protection
a24dbdc78 kata-sys-util: Move utilities to get platform protection
dacdf7c28 kata-ctl: Remove cpu related functions from kata-ctl
f5d195717 kata-sys-util: Move additional functionality to cpu.rs
304b9d914 kata-sys-util: Move CPU info functions
7319cff77 ci: cri-containerd: Add LTS / Active versions for containerd
2a957d41c ci: cri-containerd: Export GOPATH
75a294b74 ci: cri-containerd: Ensure deps are installed
6924d14df metrics: Fix metrics ts generator to treat numbers as decimals
9e048c8ee checkmetrics: Add blogbench read value for qemu
2935aeb7d checkmetrics: Add blogbench write value for qemu
02031e29a checkmetrics: Add blogbench read value for clh
107fae033 checkmetrics: Add blogbench write value for clh
8c75c2f4b metrics: Update blogbench Dockerfile
49723a9ec metrics: Add double quotes to variables
dc67d902e metrics: Enable blogbench test
438fe3b82 gha: ci: Add cri-containerd tests skeleton
bd08d745f tests: metrics: Move metrics specific function to metrics gha-run.sh
3ffd48bc1 tests: common: Move a few utility functions to common.bash
7f961461b tests: Add machine learning README
bb2ef4ca3 tests: Add `function` before each function
063f7aa7c tests: Add Pytorch Dockerfile
1af03b9b3 tests: Add Pytorch performance test
4cecd6237 tests: Add tensorflow Dockerfile
c4094f62c tests: Add metrics machine learning performance tests
89b622dcb gha: k8s: tdx: Temporarily disable TDX tests
8c9d08e87 gha: ci: Gather info about the node / pods
283f809dd runtime-rs: Enhancing Device Manager for network endpoints.
a65291ad7 agent: rustjail: update test_mknod_dev
46b81dd7d agent: clippy: fix cargo clippy warnings
c4771d9e8 agent: Makefile: enable set SECCOMP dynamically
a88212e2c utils.mk: update BUILD_TYPE argument
883b4db38 dragonball: fix cargo test on aarch64
6822029c8 runtime-rs: Do not scan network if network model is "none"
ce54e43eb metrics: Update memory usage script
fbc2a91ab gha: Cancel previous jobs if a PR is updated
307cfc8f7 tools: Use a consistent target name when building mariner initrd
d780cc08f gha: nightly: Also use `workflow_dispatch` to trigger it
b99ff3026 gha: nightly: Fix name size limit for AKS
aedc586e1 dragonball: Makefile: add coverage target
310e069f7 checkmetrics: Enable checkmetrics for memory inside test
1363fbbf1 README: Add badge for our Nightly CI
1776b18fa gha: Do not run all the tests if only docs are updated
28c29b248 bugfix: plus default_memory when calculating mem size
0c1cbd01d gha: ci: after-push: Use github.sha to get the last commit reference
37a955678 gha: ci: nightly: Use github.sha to get the last commit reference
ed23b47c7 tracing: Add tracing to runtime-rs
96e9374d4 dragonball: Don't fail if a request asks for more CPUs than allowed
38f0aaa51 Revert "gha: k8s: dragonball: Skip k8s-number-cpus"
828a72183 gha: k8s: dragonball: Skip k8s-oom
a79505b66 gha: k8s: dragonball: Skip k8s-number-cpus
275c84e7b Revert "agent: fix the issue of exec hang with a backgroud process"
2be342023 checkmetrics: Add memory usage inside container value for qemu
6ca34f949 checkmetrics: Add memory inside container value for clh
6c6892423 metrics: Enable memory inside container metrics
0ad298895 gha: ci: Fix refernce passed to checkout@v3
86904909a gha: ci: Avoid using env also in the ci-nightly and payload-after-push
f72cb2fc1 agent: Remove shadowed function, add slog-term
1d05b9cc7 gha: ci: Pass down secrets to ci-on-push / ci-nightly
c5b4164cb gha: ci: Fix tarball-suffix passed to the metrics tests
07810bf71 agent: Ignore already mounted dev/fs/pseudo-fs
11e3ccfa4 gha: ci: Avoid using env unless it's really needed
c45f646b9 gha: k8s: Ensure cluster doesn't exist before creating it
1a7bbcd39 gha: ci: Fix typo pull_requesst -> pull_request
ddf4afb96 gha: ci: Fix set-fake-pr-number job
8a0a66655 gha: ci: schedule expects a list, not a map
5c0269dc5 gha: ci: Add pr-number input to the correct job
de83cd9de gha: ci: Use $VAR instead of ${{ env.VAR }}
6acce83e1 metrics: Fix the call to check_metrics function
e067d1833 gha: Add a nightly CI job
7c0de8703 gha: k8s: Ensure tests are running on a specific namespace
106e30571 gha: Create a re-usable `ci.yaml` file
cc3993d86 gha: Pass event specific info from the caller workflow
4e396e728 metrics: Add function keyword to to helper metrics functions
1ca17c2f7 metrics: storing metrics workflow artifacts
5a61065ab checkmetrics: Add checkmetrics value for memory usage in qemu
78086ed1f checkmetrics: Add memory usage value for clh
1c3dbafbf metrics: Fix function of how to retrieve multiple values
18968f428 metrics: Add function to have uniformity
35d096b60 metrics: Adds blogbench and webtool metrics tests
d8f90e89d metrics: Rename function at memory usage script
b9d66e0d5 metrics: Fix double quotes variables in memory usage script
476a11194 tests: Enable memory usage metrics tests
b568c7f7d tests/integration: Provide default value for KATA_HOST_OS
d6e96ea06 tests/integration: Use AzureLinux instead of Mariner
40c46c75e tests/integration: Perform yq install in run_tests()
d8b8f7e94 metrics: Enable launch tests time metrics
72fd562bd gha: release: Use a specific release of hub
0502354b4 checkmetrics: Add checkmetrics json for qemu
b481ef188 makefile: Add -buildvcs=false flag to go build
e94aaed3c ci_worker: Add checkmetrics ci worker for cloud hypervisor
917576e6f metrics: Add double quotes in all variables
cc8f0a24e metrics: Add checkmetrics to gha-run.sh for metrics CI
477856c1e gha: dragonball: Correctly propagate PATH update
1c211cd73 gha: Swap asset/release in build matrix
0152c9aba tools: Introduce `USE_CACHE` environment variable
2b5975689 tests: Build CLH with glibc for Mariner
80c78eadc tests: Use baked-in kernel with Mariner
532755ce3 tests: Build Mariner rootfs initrd
6a21e20c6 runtime: Add "none" as a shared_fs option
5681caad5 versions: Upgrade to Cloud Hypervisor v33.0
b2ce8b4d6 metrics: Add memory footprint tests to the CI
d035955ef doc: Add documentation for the virtualization reference architecture
0f454d0c0 gpu: Fixing typos for PCIe topology changes
6bb2ea819 packaging: Fix indentation of build.sh script at ovmf
0504bd725 agent: convert the `sl` macros to functions
0860fbd41 agent: convert the `ttrpc_error` macro to a function
0e5d6ce6d agent: convert the `is_allowed` macro to a function
f680fc52b agent: change `AGENT_CONFIG`'s lazy type to just `AgentConfig`
beb706368 metrics: Uniformity across function names
1f3e837e4 runtime-rs: fix build error on AArch64
6fd25968c runtime-rs: bugfix for direct volume path's validation.
415578cf3 docs: Add general README
bff4672f7 runtime-rs: support physical endpoint using device manager
32cba7e44 metrics: Fix retrieving hypervisor version on metrics
aa7946de4 checkmetrics: Add general checkmetrics documentation
2fac2b72f checkmetrics: Add checkmetrics makefile
e45899ae0 docs: Add time tests documentation reference
28130d3ce docs: Add boot time metrics documentation
0df2fc270 runtime-rs: add support spdk/vhost-user based volume.
17198089e vendor: Add vendor checkmetrics dependencies
f1dfea6e8 docs: Add metrics documentation reference
8330fb8ee gpu: Update unit tests
859359424 metrics: enable launch-times test on gha-run metrics script
c4ee601bf metrics: Add checkmetrics for kata metrics CI
e0d6475b4 gha: Don't automatically trigger CI
b535c7cbd tests: Enable running k8s tests on Mariner
71071bdb6 docs: Add general metrics documentation
610f7986e check: Relax the unrestricted_guest check when running in a VM
1b406b9d0 kata-ctl:Implement functionality to check host is capable of running VM
adf88eaa8 static-build: Remove kata-version parameter
09720babc docs: fix spelling of "crate"
7185afc50 gha: Fix gha actions
21294b868 packaging: Fix indentation in init.sh script
fad3ac9f5 metrics: install kata and launch-times test
4bbfcfaf1 tests: Move tests helper script to this repo
f152f0e8c metrics: Add launch-times to metrics tests
59510cfee runtime-rs: add support vfio device based volume
1e3b372bb runtime-rs: add support vfio device manager
6b0848930 gha: Fix format for run launchtimes metrics yaml
3cefa43e7 tests: Add json script for metrics tests
6a3710055 initramfs: Build dependencies as part of the Dockerfile
aa2380fdd packaging: Add infra to push the initramfs builder image
1c7fcc6cb packaging: Use existing image to build the initramfs
a43ea24df virtiofsd: Convert legacy `-o` sub-options to their `--` replacement
8e00dc694 virtiofsd: Drop `-o no_posix_lock`
2a15ad978 virtiofsd: Stop using deprecated `-f` option
c3043a6c6 tests: Add tests lib common script
b16e0de73 gha: Add base branch on SHA on pull requst
72f2cb84e gpu: Reset cold or hot plug after overriding
fbacc0964 gpu: PCIe topology, consider vhost-user-block in Virt
bc152b114 gha: ci-on-push: Run metrics tests
dad731d5c docs: Update Developer Guide
b11246c3a gpu: Various fixes for virt machine type
40101ea7d vfio: Added annotation for hot(cold) plug
8f0d4e261 vfio: Cleanup of Cold and Hot Plug
b5c4677e0 vfio: Rearrange the bus assignemnt
b1aa8c8a2 gpu: Moved the PCIe configs to drivers
55a66eb7f gpu: Add config to TOML
da42801c3 gpu: Add config settings tests for hot-plug
de39fb7d3 runtime: Add support for GPUDirect and GPUDirect RDMA PCIe topology
9318e022a gpu: Add CC relates configs
b7932be4b gpu: Add Arm64 Kernel Settings
211b0ab26 gpu: Update Kernel Config
5f103003d gpu: Update kernel building to the latest changes
35e4938e8 tools: Fix no-op builds
347385b4e runtime-rs: Enhance flexibility of virtio-fs config
21d227853 versions: Update firecracker version to 1.3.3
0e2379909 gha: Fix `stage` definition in matrix
ae2cfa826 doc: add vcpu handlint doc for runtime-rs
7b1e67819 fix(clippy): fix clippy error
67972ec48 feat(runtime-rs): calculate initial size
aaa96c749 feat(runtime-rs): modify onlineCpuMemRequest
d66f7572d feat(runtime-rs): clear cpuset in runtime side
a0385e138 feat(runtime-rs): update linux resource when stop_process
a39e1e6cd feat(runtime-rs): merge the update_cgroups in update_linux_resources
fa6dff9f7 feat(runtime-rs): support vcpu resizing on runtime side
8cb4238b4 packaging: Remove snap package
213773998 runtime-rs: update Cargo.lock
56d2ea9b7 kata-ctl: Refactor kernel module check
9f7a45996 gha: Add `rootfs-initrd-mariner` build target
f28a62164 gha: Add `cloud-hypervisor-glibc` build target
8fb7ab751 dragonball: introduce virtio-balloon device
7ed949497 dragonball: introduce virtio-mem device
776a15e09 runtime-rs: add support direct volume.
a8e0f51c5 dragonball: extend DeviceOpContext
abae11404 runtime-rs: refactor device manager implementation
210a15794 dragonball: avoid obtaining lock twice in create_stdio_console
69668ce87 tests: gha-run: Use correct env variable for repo
f487199ed gha: aks: Fix argument in call to gha-run.sh
f6afae9c7 packaging: Add rootfs-image-tdx-tarball target
f62b2670c config: Add root hash value and measure config to kernel params
008058807 kernel: Integrate initramfs into Guest kernel
28b264562 initramfs: Add build script to generate initramfs
5cb02a806 image-build: generate root hash as an separate partition for rootfs
31c0ad207 packaging: Add cryptsetup support in Guest kernel and rootfs
980d084f4 log-parser: Update log parser link at README
410bc1814 agent-ctl: fix the compile error
77519fd12 kata-ctl: Switch to slog logging; add --log-level, --json-logging args
aab603096 gha: aks: Extract `run` commands to a script
e4eb664d2 runtime-rs: update rust to 1.69.0
ed37715e0 runtime-rs: handle copy files when share_fs is not available
5f6fc3ed7 runtime-rs: bugfix: update Cargo.lock
1c6d22c80 gha: aks: Use short SHA in cluster name
3c1f6d36d readme: Update Kata Containers logo
388684113 readme: Add status badge for the "Publish Artefacts" job
26f752038 kata-deploy: Change how we get the Ubuntu k8s key
aebd3b47d gha: aks: Ensure host_os is used everywhere needed
0c8282c22 gha: aks: Add the host_os as part of the aks cluster's name
4b89a6bda release: Standardize kata static file name
9228815ad kernel: Modify build-kernel.sh to accomodate for changes in version.yaml
03027a739 gha: Fix Mariner cluster creation
43e73bdef packaging: make BUILDER_REGISTRY configurable
ffe3157a4 dragonball: add arm64 patches for upcall
560442e6e dragonball: add vcpu_boot_onlined vector
e31772cfe dragonball: add support resize_vcpu on aarch64
64c764c14 dragonball: update dbs-boot to v0.4.0
fd9b41464 dragonball: update comment for init_microvm
af16d3fca gha: Unbreak CI and fix cluster creation step
5ddc4f94c runtime-rs/kata-ctl: Enhancement of DirectVolumeMount.
25d2fb0fd agent: fix the issue of exec hang with a backgroud process
4af4ced1a gha: Create Mariner host as part of k8s tests
eee7aae71 runtime-rs/sandbox_bindmounts: add support for sandbox bindmounts
557b84081 gha: aks: Wait longer to start running the tests
c04c872c4 gha: aks: Increase the timeout time
428041624 kata-deploy: Improve shim backup / restore
14c3f1e9f kata-deploy: Fix indentation on kata deploy merge script
0e47cfc4c runtime: sending SIGKILL to qemu
6a0035e41 doc: Update git commands
433b5add4 kubernetes: add agnhost command in pod yaml
c477ac551 dragonball: Convert VirtioNetDeviceMgr function to method
4659facb7 dragonball: Convert BlockDeviceMgr function to method
ee6deef09 dragonball: Remove virtio-net and vsock devices gracefully
2bda92fac netlink: Fix the issue of update_interface
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The links for pods and kubelets no longer work so update to new links
with relevant info.
Fixes#7487
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Multiple instances of task service may get registered by
ServiceManager::run(), fix it by making operation symmetric.
Fixes: #7479
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
This PR rounds the axelnet and resnet results in order to extract
properly the result.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR will avoid to have the strconv.atoi parsing error when we
are retrieving the results from the json.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR moves the checkmetrics to gha-run script to gathered
tensorflow information.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The previous kata-monitor in golang could not communicate with runtime-rs
to gather metrics due to different sandbox addresses.
This PR adds the subcommand monitor in kata-ctl to gather metrics from
runtime-rs and monitor itself.
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Several functions in kata-ctl need to establish a connection with runtime-rs through MgmtClient.
This PR provides a global TIMEOUT to avoid multiple definitions.
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor.
2. Described the current supported metrics in runtime-rs.(docs/design/kata-metrics-in-runtime-rs.md)
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
We were doing "if - else if - else", while bash expects "if - elif -
else", and that should never have happened in the first place, but it
happend as part of b8b73939eaFixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The amount of info we've added seemed unnecessary, and ends up making
our lives even harder when trying to find errors.
Let's just rely on the kata-debug container to collect the needed info
for us.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It's been proven to not be useful, and ends up making things more
confusing due to the amount of logs printed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we can debug kata-deploy in case something goes wrong
during its execution.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is needed in order to not lose track of what's been created and
what's been added here and there.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help folks to debug / understand what's been passed to the
kata-deploy.sh script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's allow the daemonset to create the runtimeclasses, which will
decrease one manual step a user of kata-deploy should take, and also
help us in the Confidential Containers land as the Operator can just
delegate it to this script.
Fixes: #7409
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This can be easily done as there was no official release with the
previous values.
The reason we're doing so is because when using `yq` to replace the
value, even when forcing `--tag '!!str' "yes"`, the content is placed
without quotes, causing errors in our CI.
While here, we're also removing the fallback value for DEBUG, as it is
**always** set in the kata-deploy.yaml file.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will make things simpler to only create the handlers defined by the
kata-deploy user.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will become handy in the near future, as we want to have separate
enrties for each file, while still keeping this one.
Having the entries sorted will make our lives easier to test those are
always in sync.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This imports the k8s-file-volume test from the tests repo and modifies
it slightly to set up the host volume on the AKS host.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This imports the k8s-volume test from the tests repo and modifies it
slightly to set up the host volume on the AKS host.
Fixes: #6566
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
On main we will not have this problem as we can easily configure which
shims will be installed according to an environment variable passed to
the kata-deploy.yaml file.
However, on CCV0, at least for now, we better keep the list of shims
separated by architecture, as we've found out that s390x CoCo Operator
CI is breaking because we try to install a shim that's not even built
for that architecture (dragonball).
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add ImageRequestTimeout field in the config struct, set RequestTimeout
by configured image request timeout, add image_request_timeout to
default configuration files, add image request timeout to annotations
and add image timeout annotation to sandbox config documentation.
exp:
configure the image request timout in the configuration:
[image]
image_request_timeout = 300
configure the image request timeout in the yaml:
annotations:
"io.katacontainers.config.runtime.image_request_timeout": "300"
Fixes: #7389
Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>
This is another piece that got dropped as part of
6f552b010c and is causing regressions on
the operator tests.
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This deletes node debugger pods after execution since their presence may
affect tests that assume only test workloads pods are present.
For example, in `k8s-job` we wait for *any* pod to be in the `Succeeded`
state before proceeding, which causes failures.
Fixes: #7452
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Static checks for dragonball are landing on any of the self-hosted
runners, and the reason for that is because "self-hosted" was the label
selector used.
Let's use "dragonball" instead, as the machine has that label as well.
Fixes: #7464
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need the executable bit set because it is preserved into the
runtime-payload-ci image.
Fixes: #7460
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This simply allows setting a custom resource group when debugging
locally, so as to prevent name collisions and not pollute the namespace.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Makes it so that `setup.sh` doesn't make changes in
`runtimeclass_workloads/` directly. Instead we treat that as a template
directory and we use the new directory `runtimeclass_workloads_work/` as
a work dir.
This has two advantages:
* Allows rerunning tests without the assumption that `setup.sh` must be
idempotent. E.g. the `set_runtime_class()` step would break.
* Doesn't pollute your git environment with a bunch of changes when
developing.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This splits deploying Kata and running the tests into separate commands
to make it possible to rerun tests locally without having to redeploy
Kata each time.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This is to change a prefix from `confidential-containers` to `kata` for IBM SE image build.
Fixes: #7444
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Now that the shim-v2 for CCv0 has been rebuilt with the correct path,
let's re-enable the cache.
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds the FIO benchmark scripts and resources for the metrics
tests section.
Fixes#7441
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's re-enabled caching for the following components, as those were
rebuilt with the new prefix:
* cc-rootfs-image
* cc-rootfs-initrd
* cc-tdx-rootfs-image
* cc-tdx-td-shim
* cc-sev-rootfs-initrd
"cc-se-image" was part of the list, but we never had a target for it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should not return, in case cache is not used, before actually
downloading the root_hash_*.txt provided by the other components,
otherwise the job used to do the caching will always fail.
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'd have to build the component every single time as the main
version is different from the CC one.
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'd have to build the component every single time as the main
version is different from the CC one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'd have to build the component every single time as the main
version is different from the CC one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'd have to build the component every single time as the main
version is different from the CC one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
kata-deploy-binaries.sh uses the last commit in
tools/packaging/static-build/kernel for its version check, while the cache
generation uses tools/packaging/kernel. Use tools/packaging/static-build/kernel
as $kata_config_version is already part of the version string and covers any
changes to tools/packaging/kernel.
Fixes: #7403
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The SEV kernel cache calls create_cache_asset() twice, once for the kernel and
once for modules. Both calls need to use the same version string, otherwise the
second call overwrites the "latest" file of the first one and the cache is not
used.
Fixes: #7403
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This job will run on a nested virt capable Azure VM (improving test
concurrency). This is just a placeholder while we adapt the test to GHA.
Fixes: #6555
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Remove unused `mut` because the agent compilation fails
when the rust compiler is >= 1.71. This is related to #7425Fixes: #7438
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Kudos to Magnus who has a better vision than I do.
exprimental -> experimental
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of hardcoding shims as part of the script, let's ensure we can
allow them to be created based on environment variables passed to the
daemonset.
This change brings no functionality change as the default values in the
daemonset are exactly what has been used as part of the scripts.
Fixes: #7407
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
`module_dir` has been passed to the function but was never assigned to a
var, leading to errors when trying to use it.
Fixes: #7416
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit d4eba36980)
We must use "edk2-staging-tdx" instead of "edk2-tdx". The reason for
that is versions diverging between main and CCv0.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're building SEV kernel from the main branch, we can stop relying
on the path produced by the one from the CCv0 branch (which is now
removed).
Fixes: #7422
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's just rely on whatever we have on main. The big execption here is
TDVF, but we have a big note saying to not update the version n this
branch.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We can just rely on the hypervisors builds from `main`, with the TDX one
being the only discrepancy here.
However, we have a big note in the versions.yaml to **not** update the
TDX hypervisor versions on this branch, so we should be good.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds general improvements like putting function before function
name and consistency in how we declare variables and so on to have
uniformity across the metrics scripts.
Fixes#7429
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
`module_dir` has been passed to the function but was never assigned to a
var, leading to errors when trying to use it.
Fixes: #7416
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should source from `nydus_dir`, instead of `cri_containerd_dir`, and
that was a leftover from fb4f7a002c.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will triger the nydus tests, but as they currently are they'll just
return "okay" without actually executing.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This newly added GHA does nothing, is not even triggered, and it's just
a placeholder that we'll grow in the next commits / PRs, so we can
actually start running the nydus tests as part of our CI.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Now that we have propper AP device support add a
unit test for testing the correct Attach/Detach of AP devices.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Otherwise crictl will fail to remove them with:
```
getting sandbox status of pod "$pod": metadata.Name, metadata.Namespace
or metadata.Uid is not in metadata "..."
```
A huge shout out to Steven Horsman for helping to debug this one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For this set of tests, we'll always be using podman in order to avoid
having containerd pulled in by docker.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We don't need the env var, we just need to restrict the test according
to the KATA_HYPERVISOR used, as right now it's very specifict to QEMU.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We only have shim-v2 as the runtime type, so we always need to run tests
using it. :-)
We had to adjust the script in order to properly run the tests with the
current logic.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's move the `integration/containerd/cri/integration-tests.sh` file
from the tests repo to this one.
The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's install all the dependencies needed for running the
`cri-containerd` tests.
The list of dependencies we have are:
* From the system
- build-essential
- jq
- podman-docker
* From our own repo
- yq
- go
* From GitHub projects
- containerd
- cri-tools
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we don't want to disrupt what we have on the `tests` repo, let's
create a "latest" entry and use that for the GitHub actions tests.
Once we deprecate the `tests` repo we can decide whether we want to
stick to using "latest" or switch back to "version".
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will simply clone containerd repo, specifically on a tag
we want to use to test.
This can be expanded for different projects, and it will be the case as
soon as we grow the tests. But, for now, let's keep it simple.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will install cri-tools in the host, and soon enough (as
part of this PR) we'll be using it to install cri-tools as part of the
cri-containerd tests.
I've decided to have this as part of the `common.bash` as other tests
that will be added in the future will require cri-tools to be installed
as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will install cri-containerd in the host, and soon enough
(as part of this PR) we'll be using it to install cri-containerd as part
of the cri-containerd tests.
I've decided to have this as part of the `common.bash` as other tests
that will be added in the future will require cri-containerd to be
installed as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will hel us to get the tarball, from a github project,
that we're going to use as part of our tests.
Right now this is not used anywhere, but it'll soon enough (as part of
this series) be used to download the cri-containerd / cri-tools / cni
tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will help us to get the latest patch release from a
GitHub project.
The idea behind this function is that we don't have to keep updating
versions.yaml that frequently (or worse, have it outdated as it
currently is), and always test against the latest patch release of a
given project's version that we care about.
Although right now this is not used anywhere, this will be used with the
coming cri-containerd tests, which will be part of this series.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's adjust paths for what we source and the scripts we call, after
moving from the tests repo to this one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's move `.ci/install_go.sh` file from the tests repo to this one.
The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Right now we'd need to import lib.sh just in order to get cross-build
information for rust, and it seems a little bit premature to do so at
this stage and only for rust.
Let's skip it and keep this transition simple.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's move `.ci/kata-arch.sh` file from the tests repo to this one.
The file has been moved as it is, it's not used, and in the following
commits we'll clean it up before actually using it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
First of all, I'm 100% aware that I'm duplicating this function here as
I've copied it from the packaging stuff, and I'm not exactly proud of
that.
However, right now it seems a little bit premature to combine that set
of scripts with this set of scripts in a single one and make them used
by both pieces of our project.
Anyways, this functions helps to get information from the
`versions.yaml` file, and it'll be used as part of the cri-containerd
tests and a few others in the future.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is already declared as part of the `common.bash` file, so let's
just make sure we use it from there.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Considering that someone may want to run the tests locally, we shouldn't
rely on having GITHUB_WORKSPACE exported, and fallback to $HOME/go if
needed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When the glob star is inside quotes, there is only one iteration of the loop
and b holds all matches at once. Move the glob out of the quotes so that we
actually iterate over matched paths.
Fixes: #6543
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The `install_kata` function was moved from the metrics' `gha-run.sh`
file to the `common.bash` in the commit 3ffd48bc16, but I didn't notice
that it brought with it a call to `install_check_metrics`, which is
totally unrelated to installing Kata Containers.
Let's remove the call so the function is a little bit less specific, and
move the call to install_check_metrics to the metrics `gha-run.sh` file.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help us to in two fronts:
* catching possible issues related to kata-deploy cleanup
* do more (like, in the future, collect logs) after the tests run
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
kata-debug is a tool that is used as part of the Kata Containers CI to gather
information from the node, in order to help debugging issues with Kata
Containers.
As one can imagine, this can be expanded and used outside of the CI context,
and any contribution back to the script is very much welcome.
The resulting container is stored at the [Kata Containers quay.io
space](https://quay.io/repository/kata-containers/kata-debug) and can
be used as shown below:
```sh
kubectl debug $NODE_NAME -it --image=quay.io/kata-containers/kata-debug:latest
```
Fixes: #7397
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to correctly get the full path of the versions.yaml file as part
of the merge-builds.sh script, as we do a `pushd` there and that leads
to a fail merging the artefacts as the `versions.yaml` file does not
exists in that path.
Fixes: #7405
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Any change in the script used to build the kernel should invalidate the
cache.
Fixes: #7403
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
v0.7.0 of guest-components has been released, so let's
use the new tag for the attestation-agent
Fixes: #7400
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Let's make things simpler to figure out which version of Kata
Containers has been deployed, and also which artefacts come with it.
This will help us immensely in the future, for the TEEs use case, so we
can easily know whether we can deploy a specific guest kernel for a
specific host kernel.
Fixes: #7394
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although this file is far away from being a SBOM, it'll help folks to
easily visualise which components are part of a release, and even have
SBOMs generated from that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've not been using nor shipping this kernel for a very long time.
Regardless, we're leaving behind the logic in the kernel scripts to
build it, in case it becomes necessary in the future.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Allow triggering this action manually, since we noticed it being skipped on
push exactly when we needed it the most.
Fixes: #7353
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Removing HotplugVFIOonRootBus which is obsolete with the latest PCI
topology changes, users can set cold_plug_vfio or hot_plug_vfio either
in the configuration.toml or via annotations.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The device.Bus was reset if a specific combination of
configuration parameters were not met. With the new
PCIe topology this should not happen anymore
Fixes: #7381
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
currently when fsGroup is used with direct-assign, kata agent
recursively changes ownership and permission for each file including
symlinks. However the problem with symlinks is, the permission of
the symlink itself may not be same as the underlying file. So while
doing recursive ownership and permission changes we should skip
symlinks.
Fixes: #7364
Signed-off-by: Alakesh Haloi <a_haloi@apple.com>
This PR adds the tensorflow function in gha-run script in order to
be triggered in the gha.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds function before function of the variables at the memory
inside container script in order to have uniformity across the script.
Fixes#7386
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
In order to make it easier for developers to contribute to Dragonball,
we decide to migrate all dragonball-sandbox crates to Kata.
fixes: #7262
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
We need to do proper sandbox sizing when we're doing cold-plug introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass-through annotations for accumulated CPU,Memory and now CDI
devices. With that information sandbox sizing can be derived correctly.
Fixes: #7331
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This PR kills the hypervisor and the kata shim in the
init_env stage prior to launch any metric test.
Additionally this PR adds info messages in the main blocks
of the blogbench test to help in debugging.
Fixes: #7366
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR adds C-Ray performance test in order to be part of the kata
metrics CI.
Fixes#7375
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We don't need to export KUBECONFIG there. Let's just make sure we have
the server correctly setup and avoid doing that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
CC-GPU seems to have issues with v6.1, so downgrade the kernels used for
SEV-SNP to a known-working version. It is worth mentioning that TDX is also
still on 5.19.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The k8s.gcr.io is deprecated for a while now and has been redirected to
registry.k8s.io. However on some bare-metal machines in our testing
pools that redirection is not working, so let's just replace the
registries.
Fixes#6461
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Update image-rs, which is part of the guest-components repo, to the commit that
will become the v0.7.0 tag.
Fixes: #7353
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Now that we have a new TDX machine plugged into our CI, let's re-enable
the TDX tests.
Fixes: #7368
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Kernel v6.1.38 is the current latest LTS version, switch to it. No
patches should be necessary. Some CONFIG options have been removed:
- CONFIG_MEMCG_SWAP is covered by CONFIG_SWAP and CONFIG_MEMCG
- CONFIG_ARCH_RANDOM is unconditionally compiled in
- CONFIG_ARM64_CRYPTO is covered by CONFIG_CRYPTO and ARCH=arm64
Fixes: #6086
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This PR updates the machine learning documentation related with
Tensorflow and Pytorch benchmarks.
Fixes#7359
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the Tensorflow mobilinet documentation for the machine
learning README.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Issue #4747 and pull request #4748 fix exec hang issues where the exec
command hangs when a process's stdout is not closed. However, the PR might
cause the exec command not to work as expected, leading to CI failure. The
PR was reverted in #7042. This PR resolves the exec hang issues and has
undergone 1000 rounds of testing to verify that it would not cause any CI
failures.
Fixes: #4747
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
DEFSERVICEOFFLOAD controls whether images are pulled inside
the guest. This should always be set for CoCo, not just
when we use MEASURED_ROOTFS.
Fixes: #7350
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
This PR makes kata-env is called only after some metrics have
completed his workload. This fixes a bug that occurs when
kata-env was being called before kata is already installed on the
testing platform.
Fixes: #7348
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR uses squared brackets in a jq expression to access
key values corresponding to metric results in json format.
The values are the data inputs into the checkmetrics tool.
Fixes: #7319
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This will help us to gather more information about Kata Containers in
case of failure.
Fixes: #7343
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The DEBUG env var introduced to the kata-deploy / kata-cleanup yaml file
will be responsible for:
* Setting up the CRI Engine to run with the debug log level set to debug
* The default is usually info
* Setting up Kata Containers to enable:
* debug logs
* debug console
* agent logs
This will help a lot folks trying to debug Kata Containers while using
kata-deploy, and also help us to always run with DEBUG=yes as part of
our CI.
Fixes: #7342
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Fixes: #7294
When installing the kernel config adjust the name like
the vmlinuz and vmlinux files so that any added suffixes
are also reflected in the kernel config name.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
In order to run kata metrics we need to check that the containerd
config file is properly set. When this is not the case, we
need to remove that file, and generate a valid one.
This PR runs rm -f in order to ignore errors in case the
file to delete does not exist.
Fixes: #7336
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR adds the storage metrics documentation for blogbench for kata
metrics.
Fixes#7329
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Add an extra parameter in `bind_mount_unchecked` to specify
the propagation type: "shared" or "slave".
Fixes: #7017
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
This PR adds function before the function name in common.bash script
in order to have uniformity across all the script.
Fixes#7327
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Since these have been added to kata-sys-util, remove these from
kata-ctl. Change all invocations to get platform protection to make use
of kata-sys-util.
Fixes: #7144
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Remove cpu related functions which have been moved to kata-sys-util.
Change invocations in kata-ctl to make use of functions now moved to
kata-sys-util.
Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Make certain imports architecture specific as these are not used on all
architectures.
Move additional constants and functionality to cpu.rs.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Move get_single_cpu_info and get_cpu_flags into kata-sys-util.
Add new functions that get a list of flags and check if a flag
exists in that list.
Fixes#6383
Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
As we'll be testing against the LTS and the Active versions of
containers, let's add those entries to the versions.yaml file and make
sure we export what we want to use for the tests as an env var.
The approach taken should not break the current way of getting the
containerd version.
LTS and Active versions of containerd can be found at:
https://containerd.io/releases/#support-horizonFixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure this is exported, as it'll be needed in order to install
`yq`, which will be used to get the versions of the dependencies to be
installed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we install the needed dependencies for running the
`cri-containerd` tests.
Right now this commit is basically adding a placeholder, and later on,
when we'll actually be able to test the job, we'll add the logic of
installing the needed dependencies.
The obvious dependencies we've spotted so far are:
* From the OS
* jq
* curl (already present)
* From our repo
* yq (using the install_yq script)
* From GitHub
* cri-containerd
* cri-tools
* cni plugins
We may need a few more packages, but we will only figure this out as
part of the actual work.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Use bc tool to perform math operations even when variables contain
values with leading zero.
Fixes: #7317
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR adds double quotes to variables in the blogbench script to
have uniformity across all the tests.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR builds the foundation for us to start migrating the
cri-containerd tests from Jenkins to GitHub Actions.
Right now the test does nothing and should always finish successfully.
The coming PRs will actually introduce logic to the `gha-run.sh` script
where we'll be able to run the tests and make sure those pass before
having them actually merged.
Fixes: #6543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Those functions were originally introduced as part of the
`metrics/gha-run.sh` file, but those will be very hand at the time we
start adding more tests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDX tests need to be temporarily disabled as the current machine
allocated for this will be off for some time, and a new machine only
will be added next week.
Fixes: #7307
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently, network endpoints are separate from the device manager
and need to be included for proper management. In order to do so,
we need to refactor the implementation of the network endpoints.
The first step is to restructure the NetworkConfig and NetworkDevice
structures.
Next, we will implement the virtio-net driver and add the Network
device to the Device Manager.
Finally, we'll unify entries with do_handle_device for each endpoint.
Fixes: #7215
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
When running cargo test in container, test_mknod_dev may fail sometimes
because of "Operation not permitted". Change the device path to
"/dev/fifo-test" to avoid this case.
Fixes: #7284
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
1. Update memory end assert because address space layout differs between
x86 and arm.
2. Set guest_addr for aarch64 in test_handler_insert_region case.
Fixes: #7284
TODO: #7290
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Ignore cyclomatic complexity failure. I have fixed this in my PR waiting
to forward port remote-hypervisor support into main
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This PR updates memory usage script by applying the clean_env_ctr at the main
in order to avoid failures of leaving certain processes not removed.
Fixes#7302
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
- Bump kernel version to reflect that they are changes
- We've some how gone out of sync with main, so just add a +
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Currently a mixture of cbl-mariner and mariner is used when creating the
mariner initrd. The kata-static tarball has mariner in the name, but the
jenkins url uses cbl-mariner. This breaks cache usage.
Use mariner as the target name throughout the build, so that caching works.
Fixes: #7292
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This is a very nice suggestion from Steve Horsman, as with that we can
manually trigger the workflow anytime we need to test it, instead of
waiting for a full day for it to be retriggered via the `schedule`
event.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Passing the commit hash as the "pr-number" has shown problematic as it
would make the AKS cluster name longer than what's accepted by AKS.
One easy way to solve this is just passing "nightly" as the PR number,
as that's only used to create the cluster.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help folks to monitor the history of the failing tests, as
we've done in Jenkins with the "Green Effort CI".
Fixes: #7279
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should not go through the trouble of running all our tests on AKS /
Azure / baremetal machines in case a PR only changes our documentation.
Fixes: #7258
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.
Moreover, this step back is also more reasonable in terms of the
controlling logic.
And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.
Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
As we need to pass down the commit sha to the jobs that will be
triggered from the `push` event, we must be careful on what exactly
we're using there.
At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash. In order to actually get
the commit hash, Let's use ${{ github.sha }} instead.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we need to pass down the commit sha to the jobs that will be
triggered from the `schedule` event, we must be careful on what exactly
we're using there.
At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash. In order to actually get
the commit hash, Let's use ${{ github.sha }} instead, as described by
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's take the same approach of the go runtime, instead, and allocate
the maximum allowed number of vcpus instead.
Fixes: #7270
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 25d2fb0fde.
The reason we're reverting the commit is because it to check whether
it's the cause for the regression on devmapper tests.
Fixes: #7253
Depends-on: github.com/kata-containers/tests#5705
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On cc3993d860 we introduced a regression,
where we started passing inputs.commit-hash, instead of
github.event.pull_request.head.sha. However, we have been setting
commit-hash to github.event.pull_request.sha, meaning that we're mssing
a `.head.` there.
github.event.pull_request.sha is empty for the pull_request_target
event, leading the CI to pull the content from `main` instead of the
content from the PR.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Remove the logic that made the kata-remote containerd config not support
io.katacontainers annotations
Fixes: #7265
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The latter workflow is breaking as it doesn't recognise ${GITHUB_REF},
the former would most likely break as well, but it didn't get triggered
yet.
The error we're facing is:
```
Determining the checkout info
/usr/bin/git branch --list --remote origin/${GITHUB_REF}
/usr/bin/git tag --list ${GITHUB_REF}
Error: A branch or tag with the name '${GITHUB_REF}' could not be found
```
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Remove shadowed get_mounts(), added slog-term as a new crate,
slog can directly log to stdout and we can capture output
in the test-cases that are created in the function to be tested.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We have to do this, otherwise we cannot log into azure.
This is a regression introduced by
106e305717.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of passing "-${{ inputs.tag }}-amd64", we must only pass
"-${{ inputs.tag }}".
This is a regression introduced by
106e305717.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Add annotation enablement for machine_type, default_memory and
default_vcpus
- Remove note that says that cpu and memory settings are ignored.
Fixes: #7256
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Using an initrd and setting KATA_INIT=yes meaning we're using the kata-agent
as the init process we need to make sure that the agent is not segfaulting
if mounts are already happened. Some workloads need to configure several
things in the initrd before the kata-agent starts which involves having
/proc or /sys already mounted.
Fixes: #6992
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
de83cd9de7 tried to solve an issue, but it
clearly seems that I'm using env wrongly, as what ended up being passed
as input was "$VAR", instead of the content of the VAR variable.
As we can simply avoid using those here, let's do it and save us a
headache.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It has to have steps declared, and we need to make it a dependency for
the nightly kata-containers-ci-on-push job.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'll get the following error from the workflow:
```
The workflow is not valid. .github/workflows/ci-on-push.yaml (Line: 24,
Col: 20): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.COMMIT_HASH .github/workflows/ci-on-push.yaml (Line: 25,
Col: 18): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.PR_NUMBER
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR fixes the call to check_metrics function as KATA_HYPERVISOR
is not needed to be passed.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The idea is to mimic what's been done with Jenkins and the "Green CI"
effort, but now using our GHA and the GHA infrastructure.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we run our tests in a specific namespace, as in case of
any kind of issue, we will just get rid of the namespace itself, which
will take care of cleaning up any leftover from failing tests.
One important thing to mention is why we can get rid of the `namespace:
${namespace}` on the tests that are already using it, and let's do it in
parts:
* namespace: default
We can easily get rid of this as that's the default namespace where
pods are created, so it was a no-op so far.
* namespace: test-quota-ns
My understanding is that we'd need this in order to get a clean
namespace where we'd be setting a quota for. Doing this in the
namespace that's only used for tests should **not** cause any
side-effect on the tests, as we're running those in serial and there's
no other pods running on the `kata-containers-k8s-tests` namespace
Last but not least, we're not dynamically creating namespaces as the
tests are not running in parallel, **never**, not in the case of having
2 tests being ran at same time, neither in the case of having 2 jobs
being scheduled to the same machine.
Fixes: #6864
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is based on the `ci-on-push.yaml` file, and it's called from ther
The reason to split on a new file is that we can easily introduce a
`ci-nightly.yaml` file and re-use the `ci.yaml` file there as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Those components are being built twice, one as part of the normal matrix
assets, and the second one as part of the include.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're not relying, on any of the called workflows, on event
specific information.
Right now, the two information we've been relying on are:
* PR number, coming from github.event.pull_request.number
* Commit hash, coming from github.event.pull_request.head.sha
As we want to, in the future, add nightly jobs, which will be triggered
by a different event (thus, having different fields populated), we
should ensure that those are not used unless it's in the "top action"
that's trigerred by the event.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Use the 'function' keyword to prevent bash aliases from colliding
with other function's name.
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR enables storing metrics workflow artifacts in two
separated flavours: clh and qemu.
Fixes: #7239
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
TIL that when using "include" for a matrix we must duplicate the value
we're overriding for each element we want that to happen.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds the function name before the function to have uniformity
across all the test.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Since the measured rootfs work has been merged to main, and then
brought in to the CCv0 via the weekly merge, we have introduced a few
regressions related to how we build it / use it.
This PR attempts to make sure the artefacts are properly built, using
GitHub Actions, so the feature can be used with the operator.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
SNP's QEMU has changed its name some time ago and, due to that, we have
been leaving the new binary behind during the uninstall process, which
lead to the Operator hanging when uninstalling.
Fixes: #7233
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds blogbench and webtooling metrics checks to this repo.
The function running the test intentionally returns zero, so
the test will be enabled in another PR once the workflow is
green.
Fixes: #7069
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR usses double quotes in all the variables as well as general fixes
to the memory usage script in order to have uniformity.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Non AKS k8s tests (SEV/SNP/TDX) don't currently set KATA_HOST_OS, so provide a
default empty value for the variable so that those tests can run.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
as OSSKU value, to get rid of this warning when creating the AKS cluster:
WARNING: The osSKU "AzureLinux" should be used going forward instead of
"CBLMariner" or "Mariner". The osSKUs "CBLMariner" and "Mariner" will
eventually be deprecated.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
We only need to install in run_tests() so that the yq install is picked up by
kubernets/setup.sh as well. We also need to either use (sudo &&
INSTALL_IN_GOPATH=false) || (INSTALL_IN_GOPATH=true).
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This PR adds the checkmetrics ci worker file for cloud hypervisor in
order to check the boot times limit.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds double quotes in all variables to have uniformity across
all the gha-run.sh script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds checkmetrics installation for gha-run.sh in order to compare
results limits as part of the metrics CI.
Fixes#7198
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
cargo/rust is installed in one step, we need to write the PATH update to
GITHUBENV so that it becomes visible in the next steps.
Fixes: #7221
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This simply displays the asset name first in GH's UI, so that the
release name (always "test") is truncated rather than the asset name.
Makes things slightly easier to read.
e.g.
build-asset (cloud-hypervisor-glibc, te...
instead of
build-asset (test, cloud-hypervisor-gli...
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This allows setting `USE_CACHE=no` to test building e2e during
developmet without having to comment code blocks and so forth.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This enables building CLH with glibc and the mshv feature as required
for Mariner. At test time, it also configures Kata to use that CLH
flavor when running Mariner.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Mariner ships a bleeding-edge kernel that might be ahead of upstream, so
we use that to guarantee compatibility with the host.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
* Adds a new `rootfs-initrd-mariner` build target.
* Sets the custom initrd path via annotation in `setup.sh` at test
time.
* Adapts versions.yaml to specify a `cbl-mariner` initrd variant.
* Introduces env variable `HOST_OS` at deploy time to enable using a
custom initrd.
* Refactors the image builder so that its caller specifies the desired
guest OS.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.
This *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.
However, there are still use-cases users can live with just copying
those files into the guest at the pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.
For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.
Fixes: #7207
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds memory foot print metrics to tests/metrics/density
folder.
Intentionally, each test exits w/ zero in all test cases to ensure
that tests would be green when added, and will be enabled in a
subsequent PR.
A workflow matrix was added to define hypervisor variation on
each job, in order to run them sequentially.
The launch-times test was updated to make use of the matrix
environment variables.
Fixes: #7066
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
There is nothing in them that requires them to be macros. Converting
them to functions allows for better error messages.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
There is nothing in it that requires it to be a macro. Converting it to
a function allows for better error messages.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Having a function allows for better error messages from the type checker
and it makes it clearer to callers what can happen. For example:
is_allowed!(req);
Gives no indication that it may result in an early return, and no simple
way for callers to modify the behaviour. It also makes it look like
ownership of `req` is being transferred.
On the other hand,
is_allowed(&req)?;
Indicates that `req` is being borrowed (immutably) and may fail. The
question mark indicates that the caller wants an early return on
failure.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Since it is never modified, it doesn't really need a lock of any kind.
Removing the `RwLock` wrapper allows us to remove all `.read().await`
calls when accessing it.
Additionally, `AGENT_CONFIG` already has a static lifetime, so there is
no need to wrap it in a ref-counted heap allocation.
Fixes: #5409
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This PR adds the word function before the function names in order to have
uniformity across the script as some are using this and some are not.
Fixes#7196
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Vfio support introduce build error on AArch64. Remove arch related
annotation can avoid this error.
Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The failure mainly caused by the encoded volume path and
the mount/src. As the src will be validated with stat,but
it's not a full path and encoded, which causes the stat
mount source failed.
Fixes: #7186
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR adds link to the unreference docs in the cmd path to make
them more discoverable.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds checkmetrics makefile which is used to process the
metrics json results files.
Fixes#7172
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds time tests documentation reference in the general README
for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.
And a user guide about using the spdk volume when run
a kata-containers. it can be found in docs/how-to.
Fixes: #6526
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR adds the metrics documentation as a general reference in the
main README for kata containers.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the checkmetrics scripts that will be used for the kata metrics CI.
Fixes#7160
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We have GH configured so that manual approval is required for CI runs
triggered by outside contributors. However, because CI is triggered by
the `pull_request_target` event, this setting isn't being honored
(see [1]). This means that an attacker could trivially extracts secrets
by submitting a PR.
This change aims to mititgate this issue by preventing PRs from
triggering CI unless the `ok-to-test` label is set.
Note: For further context, we use the `pull_request_target` event and
manually check out the PR branch because it is the only way to both
access secrets and test incoming code changes.
Fixes: #7163
[1]: https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
In order to support different pod VM instance type via
remote hypervisor implementation (cloud-api-adaptor),
we need to pass machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.
The cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.
Reference PR for cloud-api-adaptor
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
When running on a VM, the kernel parameter "unrestricted_guest" for
kernel module "kvm_intel" is not required. So, return success when running
on a VM without checking value of this kernel parameter.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Implement functionality to add to the env output if the host is capable
of running a VM.
Fixes: #6727
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This PR removes an unrecognized value located in one of the yamls for the
gha in order to make it work the CI again.
Fixes#7149
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR replaces single spaces for tabs in order to fix the indentation
in the init.sh script.
Fixes#7147
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR installs kata static tarball on metrics runner
and run launch-times tests.
Fixes: #7049
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
The common.sh script includes helper functions used in
our metrics tests, so we are gradually adding more
metrics used in kata.
Fixes: #7108
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This test measures the duration of a workload that starts, and then
immediately stops the contianer. Also measures the workload period,
the time to quit period, and the time to kernel period.
Fixes: #7049
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
A new choice of using vfio devic based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
specified device which is BDF or IOMMU group ID.
To help users to use it smoothly, A doc about howto added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.
Fixes: #6525
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Limitations:
As no ready rust vmm's vfio manager is ready, it only supports
part of vfio in runtime-rs. And the left part is to call vmm
interfaces related to vfio add/remove.
So when vmm/vfio manager ready, a new PR will be pushed to
narrow the gap.
Fixes: #6525
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR fixes the format for the run launchtimes metrics yaml which
is causing to the workflow to fail.
Fixes#7130
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the json script which allow us to save the metrics results
into a json file which will be used in the kata containers metrics.
Fixes#7128
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This is to set a default value for `AA_KBC` for the make target `cc_rootfs_initrd_tarball`.
Fixes: #7121
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This will help to not have to build those on every CI run, and rather
take advantage of the cached image.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c720869eef)
Let's add the needed infra for only building and pushing the initramfs
builder image to the Kata Containers' quay.io registry.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 111ad87828)
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder for the initramds.
This will save us some CI time.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ebf6c83839)
The `-o` option is the legacy way to configure virtiofsd, inherited
from the C implementation. The rust implementation honours it for
compatibility but it logs deprecation warnings.
Let's use the replacement options in the go shim code. Also drop
references to `-o` from the configuration TOML file.
Fixes#7111
Signed-off-by: Greg Kurz <groug@kaod.org>
The C implementation of virtiofsd had some kind of limited support
for remote POSIX locks that was causing some workflows to fail with
kata. Commit 432f9bea6e hard coded `-o no_posix_lock` in order
to enforce guest local POSIX locks and avoid the issues.
We've switched to the rust implementation of virtiofsd since then,
but it emits a warning about `-o` being deprecated.
According to https://gitlab.com/virtio-fs/virtiofsd/-/issues/53 :
The C implementation of the daemon has limited support for
remote POSIX locks, restricted exclusively to non-blocking
operations. We tried to implement the same level of
functionality in #2, but we finally decided against it because,
in practice most applications will fail if non-blocking
operations aren't supported.
Implementing support for non-blocking isn't trivial and will
probably require extending the kernel interface before we can
even start working on the daemon side.
There is thus no justification to pass `-o no_posix_lock` anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
The rust implementation of virtiofsd always runs foreground and
spits a deprecation warning when `-f` is passed.
Signed-off-by: Greg Kurz <groug@kaod.org>
This PR adds the test lib common script that is going to be used
for kata containers metrics.
Fixes#7113
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The run-launchtimes-metrics workflow needs to get the commit ID
for the last commit to the head branch of the PR.
Fixes: #7116
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
If we override the cold, hot plug with an annotation
we need to reset the other plugging mechanism to NoPort
otherwise both will be enabled.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
In Virt the vhost-user-block is an PCIe device so
we need to make sure to consider it as well. We're keeping
track of vhost-user-block devices and deduce the correct
amount of PCIe root ports.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This gh-workflow prints a simple msg, but is the base for future
PRs that will gradually add the jobs corresponding to the kata
metrics test.
Fixes: #7100
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR updates the developer guide at the connect to the debug console
section.
Fixes#7094
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Now it is possible to configure the PCIe topology via annotations
and addded a simple test, checking for Invalid and RootPort
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Removed the configuration of PCIeRootPort and PCIeSwitchPort, those
values can be deduced in createPCIeTopology
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Refactor the bus assignment so that the call to GetAllVFIODevicesFromIOMMUGroup
can be used by any module without affecting the topology.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The hypervisor_state file was the wrong location for the PCIe Port
settings, moved everything under device umbrella, where it can be
consumed more easily and we do not get into circular deps.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
For the GPU CC use case we need to set several crypto algorithms.
The driver relies on them in the CC case.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Use now the sev.conf rather then the snp.conf.
Devices can be prestend in two different way in the
container (1) as vfio devices /dev/vfio/<num>
(2) the device is managed by whataever driver in
the VM kernel claims it.
Fixes: #6844
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This fixes the builds of `cloud-hypervisor-glibc` and
`rootfs-initrd-mariner` to properly create the `build/` directory.
Fixes: #7098
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This PR updates the firecracker version to 1.3.3 which includes the following
changes
Fixed passing through cache information from host in CPUID leaf 0x80000006.
A race condition that has been identified between the API thread and the VMM
thread due to a misconfiguration of the api_event_fd.
Fixes#7089
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Main merge back to CCv0 caused snp qemu build to move from install_qemu to install_qemu_experimental.
Thus, reflecting this change into the qemu snp command.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Qemu for SNP is experimental. Thus, when building QEMU for SNP we need to create a builder that builds experimental qemu for CC.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Kubernetes and Containerd will help calculate the Sandbox Size and pass it to
Kata Containers through annotations.
In order to accommodate this favorable change and be compatible with the past,
we have implemented the handling of the number of vCPUs in runtime-rs. This is
This is slightly different from the original runtime-go design.
This doc introduce how we handle vCPU size in runtime-rs.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
In this commit, we refactored the logic of static resource management.
We defined the sandbox size calculated from PodSandbox's annotation and
SingleContainer's spec as initial size, which will always be the sandbox
size when booting the VM.
The configuration static_sandbox_resource_mgmt controls whether we will
modify the sandbox size in the following container operation.
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Some vmms, such as dragonball, will actively help us
perform online cpu operations when doing cpu hotplug.
Under the old onlineCpuMem interface, it is difficult
to adapt to this situation.
So we modify the semantics of nb_cpus in onlineCpuMemRequest.
In the original semantics, nb_cpus represents the number of
newly added CPUs that need to be online. The modified
semantics become that the number of online CPUs in the guest
needs to be guaranteed.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
The declaration of the cpu number in the cpuset is greater
than the actual number of vcpus, which will cause an error when
updating the cgroup in the guest.
This problem is difficult to solve, so we temporarily clean up
the cpuset in the container spec before passing in the agent.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Updating vCPU resources and memory resources of the sandbox and
updating cgroups on the host will always happening together, and
they are all updated based on the linux resources declarations of
all the containers.
So we merge update_cgroups into the update_linux_resources, so we
can better manage the resources allocated to one pod in the host.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Support vcpu resizing on runtime side:
1. Calculate vcpu numbers in resource_manager using all the containers'
linux_resources in the spec.
2. Call the hypervisor(vmm) to do the vcpu resize.
3. Call the agent to online vcpus.
Fixes: #5030
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Nobody has volunteered to maintain the (currently broken) snap build, so
remove it.
Fixes: #6769.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
After we support memory resize in Dragonball, we need to update
Cargo.lock in runtime-rs.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Adding vhost and vhost-net to the kernel modules. These do not require
any kernel module parameters to be checked. Currently, kernel params is
a required field. Make this as optional. Could make this as <Option>,
but making this a slice instead, as a module could have multiple kernel
params. Refactor the function that checks are for kernel modules into
two with one specifically checking if the module is loaded and other
checking for module parameters.
Refactor some of the tests to take into account these changes.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This adds the glibc flavor of CLH to the list of assets as preparation
for #6839. Mariner Kata is only tested with glibc.
Fixes: #7026
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
We introduce virtio-balloon device to support memory resize.
virtio-balloon device could reclaim memory from guest to host.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
We introduce virtio-mem device to support memory resize. virtio-mem
device could hot-plug more memory blocks to guest and could also
hot-unplug them from guest.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
As block/direct volume use similar steps of device adding,
so making full use of block volume code is a better way to
handle direct volume.
the only different point is that direct volume will use
DirectVolume and get_volume_mount_info to parse mountinfo.json
from the direct volume path. That's to say, direct volume needs
the help of `kata-ctl direct-volume ...`.
Details seen at Advanced Topics:
[How to run Kata Containers with kinds of Block Volumes]
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.md
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
In order to support virtio-mem and virtio-balloon devices, we need to
extend DeviceOpContext with VmConfigInfo and InstanceInfo.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
The key aspects of the DM implementation refactoring as below:
1. reduce duplicated code
Many scenarios have similar steps when adding devices. so to reduce
duplicated code, we should create a common method abstracted and use
it in various scenarios.
do_handle_device:
(1) new_device with DeviceConfig and return device_id;
(2) try_add_device with device_id and do really add device;
(3) return device info of device's info;
2. return full info of Device Trait get_device_info
replace the original type DeviceConfig with full info DeviceType.
3. refactor find_device method.
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Qemu entry for SNP was changed in the versions.yaml resulting into the incorrect qemu build for SNP.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
After we have a guest kernel with builtin initramfs which
provide the rootfs measurement capability and Kata rootfs
image with hash device, we need set related root hash value
and measure config to the kernel params in kata configuration file.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.
Add userspace command line tools for disk encryption support
and ext4 file system utilities.
Fixes: #6674
Signed-off-by: Arron Wang <arron.wang@intel.com>
This PR updates the link to the correspondent Developer Guide at the
enabling full containerd debug that we have for kata 2.0 documentation.
Fixes#7034
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
When the version of libc is upgraded to 0.2.145, older getrandom could not adapt
to new API, and this will make agent-ctl fail to compile.
We upgrade the version of `rand`, so the low version of getrandom will no longer
need.
Fixes: #7032
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Fixes: #5401, #6654
- Switch kata-ctl from eprintln!()/println!() to structured logging via
the logging library which uses slog.
- Adds a new create_term_logger() library call which enables printing
log messages to the terminal via a less verbose / more human readable
terminal format with colors.
- Adds --log-level argument to select the minimum log level of printed messages.
- Adds --json-logging argument to switch to logging in JSON format.
Co-authored-by: Byron Marohn <byron.marohn@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
Github Actions reads and runs workflow files from the main branch,
rather than from the PR branch. This means that PRs that modify workflow
files aren't being tested with the updated workflows coming from the PR,
but rather with the old workflows from the main branch. AFAIK, this
behavior isn't avoidable for workflow files (but is for other scripts).
This makes it very hard to reliably test workflow changes before they're
actually merged into main and leads to issues that we have to hotifx
(see #6983, #6995).
This PR aims to mitigate that by extracting the commands used in
workflows to a separate script file. The way our CI is set up, those
script files are read from the PR branch and thus changes would be
reflected in the CI checks.
Fixes: #6971
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
In preparation for CoCo 0.6.0 release, updated td-shim to commit
3252047213b2c580c21bdc52f67e8515ca1e374a
Fixes#7022
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
In preparation for CoCo 0.6.0 release, updated attestation-agent to
commit aa1d3c510350cd2f2668aca374abba19e2b73b3f
Fixes#7022
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
In hypervisors that do not support virtiofs we have to copy files in
the VM sandbox to properly setup the network (resolv.conf, hosts, and hostname).
To do that, we construct the volume as before, with the addition of an extra
variable that designates the path where the file will reside in the sandbox.
In this case, we issue a `copy_file` agent request *and* we patch the spec
to account for this change.
Fixes: #6978
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
When dragonball update dbs-boot crate in commit
64c764c147, the Cargo.lock in runtime-rs
should also be updated.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Full SHA is 40 characters, while AKS cluster name has a limit of 63. Trim the
SHA to 12 characters, which is widely considered to be unique enough and is
short enough to be used in the cluster name
Fixes: #7010
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Let's start adding the status of our jobs as part of our main page, so
folks monitoring those can easily check whether they're okay, or if
someone has to be pinged about those.
Fixes: #7008
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Content of commit
Update Cargo.toml of kata-agent
Change the features to use new naming convention
Run make vendor, to fix the static checks
Update image-rs, step4 of release checklist
Fixes: #6635
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
We added that to create the cluster name, but I forgot to add that to
the part we get the k8s config file, or to the part where we delete the
AKS cluster.
Fixes: #6999
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to do so, otherwise we'll create two clusters for testing Cloud
Hypervisor with exactly the same name, one using Ubuntu, and one using
Mariner.
Fixes: #6999
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The string representing the architecture aarch64 and x86_64 need to be changed to arm64 and amd64 for the release.
Fixes: #6986
Signed-off-by: SinghWang <wangxin_0611@126.com>
There were recent changes for the tdx kernel in the version.yaml that are
not currently accounted for in the build-kernel.sh script.
Attempts to setup a tdx kernel to build local changes seemed to not download
the tdx kernel. Instead the mainline kernel is downloaded which has no
tdx-related changes.
The version.yaml has a new entry for tdx kernel. Use that instead for
setting up and downloading the tdx kernel.
Fixes: #6984
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
While the Mariner Kata host is in preview, we need the `aks-preview`
extension to enable the `--workload-runtime KataMshvVmIsolation` flag.
Fixes: #6994
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This PR is to make an environment variable `BUILDER_REGISTRY` configurable
so that those who want to use their own registry for build can set up
the registry.
Fixes: #6988
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
The vcpu hotplug/hotunplug feature is implemented with upcall. This commit
add three patches to support the feature on aarch64. Patches:
> 0005: add support of upcall on aarch64
> 0006: skip activate offline cpus' MSI interrupt
> 0007: set the correct boot cpu number
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
This commit implements the vcpu_boot_onlined vector in get_fdt_vm_info.
"boot_enabled" means whether this vcpu should be onlined at first boot.
It will be used by fdt, which write an attribute called boot_enabled,
and will be handled by guest kernel to pass the correct cpu number to
function "bringup_nonboot_cpus".
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
This commit add support of resize_vcpu on aarch64. As kvm will check
whether vgic is initialized when calling KVM_CREATE_VCPU ioctl, all the
vcpu fds should be created before vm is booted.
To support resizing vcpu scenario, we use max_vcpu_count for
create_vcpus and setup_interrupt_controller interfaces. The
SetVmConfiguration API will ensure max_vcpu_count >= boot_vcpu_count.
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
dbs-boot-v0.4.0 refectors the create_fdt interface. It simplifies the
parameters needed to be passed and abstracts them into three structs.
By the way, it also reserves some interfaces for future feature: numa
passthrough and cache passthrough.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Rewrite the comment of Vm::init_microvm method for aarch64.
Fixes cargo test warnings on aarch64.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Pod annotations (io.katacontainers.*) are not meaningful
for the remote hypervisor. This patch disables pod annotations
in the kata-remote settings of the containerd configuration.
Fixes: #6345
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Move the get_volume_mount_info to kata-types/src/mount.rs.
If so, it becomes a common method of DirectVolumeMountInfo
and reduces duplicated code.
Fixes: #6701
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
When run a exec process in backgroud without tty, the
exec will hang and didn't terminated.
For example:
crictl -i <container id> sh -c 'nohup tail -f /dev/null &'
Fixes: #4747
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
The current testing setup only supports running Kata on top of an Ubuntu
host. This adds Mariner to the matrix of testable hosts for k8s
tests, with Cloud Hypervisor as a VMM.
As preparation for the upcoming PR that will change only the actual test
code (rather than workflow YAMLs), this also introduces a new file
`setup.sh` that will be used to set host-specific parameters at test
run-time.
Fixes: #6961
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
sandbox_bind_mounts supports kinds of mount patterns, for example:
(1) "/path/to", default readonly mode.
(2) "/path/to:ro", same as (1).
(3) "/path/to:rw", readwrite mode.
Both support configuration and annotation:
(1)[runtime]
sandbox_bind_mounts=["/path/to", "/path/to:rw", "/mnt/to:ro"]
(2) annotation will alse be supported, restricted as below:
io.katacontainers.config.runtime.sandbox_bind_mounts
= "/path/to /path/to:rw /mnt/to:ro"
Fixes: #6597
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
We're still facing issues related to the time taken to deploy the
kata-deplot daemonset and starting to run the tests.
Ideally, we should solve this with a readiness probe, and that's the
approach we want to take in the future. However, for now, let's just
make sure those tests are not on the way of the community.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've seen tests being aborted close to the end of the run due to the
timeout. Let's increase it, avoiding to hit such cases again..
Fixes: #6964
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR removes unwanted white spaces in order to fix the format
of the kata-deploy-binaries script.
Fixes: #6962
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Instead of setting:
```
firmware = "/path/to/OVMF.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
We should either be setting:
```
firmware = "/path/to/OVMF.fd"
```
Or:
```
firmware = "/path/to/OVMF_CODE.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
I'm taking the approach to setting up the latter, as that's what's been
tested as part of our TDX CI.
Fixes: #4926
This patch is the same as #4927, but it ended up reverted somewhere in
the CCv0 -> main process, or in the attempts to fix TDX after that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently backing up and restoring all the possible shim files,
but the default one ("containerd-shim-kata-v2").
Let's ensure this is also backed up and restored.
Fixes: #6957
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR fixes the indentation on the kata deploy merge script
that instead of single spaces uses a tap.
Fixes#6925
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
There is a race condition when virtiofsd is killed without finishing all
the clients. Because of that, when a pod is stopped, QEMU detects
virtiofsd is gone, which is legitimate.
Sending a SIGTERM first before killing could introduce some latency
during the shutdown.
Fixes#6757.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
We previously were doing:
* Create a new image on kata-deploy-ci using the commit hash of the
latest tag
* This was used to test on AKS, which is no longer needed as we test
on AKS on every PR
* Create a new image on kata-deploy using the release tag and "latest"
or "stable", by tagging the kata-deploy-ci image accordingly
As part of cfe63527c5, we broke the
workflow described above, as in the first step we would save the PKG_SHA
to be used in the second step, but that part ended up being removed.
Anyways, this back and forth is not needed anymore and we can simplify
the process by doing:
* Create a new image on kata-deploy, using:
- The tag received as ref from the event that triggered this worklow
- "latest" or "stable" tag, depending on whether it's a stable release
or not
Fixes: #6946
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The rebase from `main` to `CCv0` ended up overwriting the image path
that should be used for QEMU, in the CCv0 branch.
Fixes: #6932
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For some bizarre reason, the login-action will simply fail to
authenticate to docker.io in it's specified as a registry. The way to
proceed, instead, is to *not* specify any registry as it'd be used by
default.
Fixes: #6943
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR updates the nydus version to 2.2.1. This change includes:
nydus-image: fix a underflow issue in get_compressed_size()
backport fix/feature to stable 2.2
[backport] contrib: upgrade runc to v1.1.5
service: add README for nydus-service
nydus: fix a possible panic caused by SubCmdArgs::is_present
Backports two bugfixes from master into stable/v2.2
[backport stable/v2.2] action: upgrade golangci-lint to v1.51.2
[backport] action: fix smoke test for branch pattern
Fixes#6938
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
`docker/login-action@v3` does *not* exist and `docker/login-action@v2`
should be used instead.
Fixes: #6934
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've a discrepancy on what's set along the scripts used to build the
Kata Cotainers artefacts locally.
Some of those were missing a way to easily debug them in case of a
failure happens, but one specific one (build-and-upload-payload.sh)
could actually silently fail.
All of those have been changed as part of this commut.
Fixes: #6908
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ae24dc73c1)
We're facing some issues to download / use the public key provided by
google for installing kubernetes as part of the kata-deploy image.
```
The following signatures couldn't be verified because the public key is
not available: NO_PUBKEY B53DC80D13EDEF05
Reading package lists... Done
W: GPG error: https://packages.cloud.google.com/apt kubernetes-xenial
InRelease: The following signatures couldn't be verified because the
public key is not available: NO_PUBKEY B53DC80D13EDEF05 E: The
repository 'https://apt.kubernetes.io kubernetes-xenial InRelease' is
not signed.
N: Updating from such a repository can't be done securely, and is
therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user
configuration details.
```
Let's work this around following the suggestion made by @dims, at:
https://github.com/kubernetes/k8s.io/pull/4837#issuecomment-1446426585
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 636539bf0c)
- Fix cache for OVMF and rootfs-initrd (both x86_64)
- Upgrade to Cloud Hypervisor v32.0
- osbuilder: Bump fedora image version
- local-build: Standardise what's set for the local build scripts
- gha: aks: Wait a little bit more before run the tests
- docs: Update container network model url
- gha: release: Fix s390x worklow
- cache: Fix OVMF caching
- gha: payload-after-push: Pass secrets down
- tools: Fix arch bug
22154e0a3 cache: Fix OVMF tarball name for different flavours
b7341cd96 cache: Use "initrd" as `initrd_type` to build rootfs-initrd
b8ffcd1b9 osbuilder: Bump fedora image version
636539bf0 kata-deploy: Use apt-key.gpg from k8s.io
ae24dc73c local-build: Standardise what's set for the local build scripts
35c3d7b4b runtime: clh: Re-generate the client code
cfee99c57 versions: Upgrade to Cloud Hypervisor v32.0
ad324adf1 gha: aks: Wait a little bit more before run the tests
191b6dd9d gha: release: Fix s390x worklow
cfd8f4ff7 gha: payload-after-push: Pass secrets down
75330ab3f cache: Fix OVMF caching
a89b44aab tools: Fix arch bug
11a34a72e docs: Update container network model url
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
75330ab3f9 tried to fix OVMF caching, but
didn't consider that the "vanilla" OVMF tarball name is not
"kata-static-ovmf-x86_64.tar.xz", but rather "kata-static-ovmf.tar.xz".
The fact we missed that, led to the cache builds of OVMF failing, and
the need to build the component on every single PR.
Fixes: #6917 (hopefully for good this time).
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've been defaulting to "", which would lead to a mismatch with the
latest version from the cache, causing a miss, and finally having to
build the rootfs-initrd as part of the tests, every single time.
Fixes: #6917
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This commit should've been part of the series that reverted a bunch of
TDX changes that are not compatible with the TDX stack we're using in
the Jenkins CI machine.
The change made here is in order to match what's been undone here:
c29e5036a6Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're facing some issues to download / use the public key provided by
google for installing kubernetes as part of the kata-deploy image.
```
The following signatures couldn't be verified because the public key is
not available: NO_PUBKEY B53DC80D13EDEF05
Reading package lists... Done
W: GPG error: https://packages.cloud.google.com/apt kubernetes-xenial
InRelease: The following signatures couldn't be verified because the
public key is not available: NO_PUBKEY B53DC80D13EDEF05 E: The
repository 'https://apt.kubernetes.io kubernetes-xenial InRelease' is
not signed.
N: Updating from such a repository can't be done securely, and is
therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user
configuration details.
```
Let's work this around following the suggestion made by @dims, at:
https://github.com/kubernetes/k8s.io/pull/4837#issuecomment-1446426585
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've a discrepancy on what's set along the scripts used to build the
Kata Cotainers artefacts locally.
Some of those were missing a way to easily debug them in case of a
failure happens, but one specific one (build-and-upload-payload.sh)
could actually silently fail.
All of those have been changed as part of this commut.
Fixes: #6908
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This patch re-generates the client code for Cloud Hypervisor v32.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.
Fixes: #6632
Signed-off-by: Bo Chen <chen.bo@intel.com>
fa832f4709 increased the timeout, which
helped a lot, mainly in the TEE machines. However, we're still seeing
some failures here and there with the AKS tests.
Let's bump it yet again and, hopefully, those errors to start the tests
will go away.
Fixes: #6905
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
GitHub is warning us that:
"""
The workflow is not valid. In .github/workflows/release.yaml (Line: 21,
Col: 11): Error from called workflow
kata-containers/kata-containers/.github/workflows/release-s390x.yaml@d2e92c9ec993f56537044950a4673e50707369b5
(Line: 14, Col: 12): Job 'kata-deploy' depends on unknown job
'create-kata-tarball'.
"""
This is happening as we need to reference
"build-kata-static-tarball-s390x" instead of "create-kata-tarball".
Fixes: #6903
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
I cannot easily pin-point which commit dropped it, but my gut feeling is
that it's the result of an erroneous conflict resolution when merging
content from main to the CCv0 branch.
Regardless of when / why it happened, as the root_hash logic ended up
being dropped, workflows that depend on that are now failing.
With everything said in mind, let's bring the logic back.
Fixes: #6901
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The "build-assets-${arch}" jobs need to have access to the secrets in
order to log into the container registry in the cases where
"push-to-registry", which is used to push the builder containers to
quay.io, is set to "yes".
Now that "build-assets-${arch}" pass the secrets down, we need to log
into the container registry in the "build-kata-static-tarball-${arch}"
files, in case "push-to-registry" is set to "yes".
Fixes: #6899
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
OVMF has been cached, but it's not been used from cache as the `version`
set in the cached builds has always been empty.
The reason for that is because we've been trying to look for
`externals.ovmf.ovmf.version`, while we should be actually looking for
`externals.ovmf.x86_64.version`.
Setting `x86_64` as the OVMF_FLAVOUR would cause another bug, as the
expected tarball name would then be `kata-static-x86_64.tar.xz`, instead
of `kata-static-ovmf-x86_64.tar.xz`.
With everything said, let's simplify the OVMF_FLAVOUR usage, by using it
as it's passed, and only adapting the tarball name for the TDVF case,
which is the abnormal one.
Fixes: #6897
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- runtime: Use static_sandbox_resource_mgmt=true for TEEs
- update tokio dependency
- resource-control: fix setting CPU affinities on Linux
- runtime: use enable_vcpus_pinning from toml
- gha: k8s: Make the tests more reliable
- gha: Enable SEV-SNP tests on main
- gha: tdx: Use the k3s overlay for kata-cleanup
- runtime: Port sev package to main
- gpu: Rename the last bits from `gpu` to `nvidia-gpu`
- deploy: fix shell script error
- ppc64le: switch virtiofsd from C to rust version
- osbuilder: Fix indentation in rootfs.sh
- virtcontainers/qemu_test.go: Improve coverage
- agent: Add context to errors that may occur when AgentConfig file is …
- virtcontainers/pkg/compatoci/: Improved coverage for for Kata 2.0
- kata-manager: Fix '-o' syntax and logic error
- kata-ctl: Add the option to install kata-ctl to a user specified directory
- runtime-rs: fix building instructions to use correct required Rust ve…
- Dragonball: use LinuxBootConfigurator::write_bootparams
- kata-deploy: Add http_proxy as part of the docker build
- kata-deploy: Do not ship the kata tarball
- kata-deploy: Build improvements
- deploy: Fix arch in image tag
- Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
- virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
- main | release: Fix multi-arch publishing is not supported
- cache: More fixes to nvidia-gpu kernels caching
- runtime: remove overriding ARCH value by default for ppc64le
- gha: Fix Body Line Length action flagging empty body commit messages
- gha: Fix snap creation workflow
- cache: Fix nvidia-gpu version
- cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
- packaging: Add SEV-SNP artifacts to main
- docs: Mark snap installation method as unmaintained
- packaging: Add sev artifacts to main
- kata-ctl: add generic kvm check & unit test
- Log-parser-rs
- warning_fix: fix warnings when build with cargo-1.68.0
- cross-compile: Include documentation and configuration for cross-compile
- runtime: Fix virtiofs fd leak
- gpu: cold plug VFIO devices
- pkg/signals: Improved test coverage 60% to 100%
- virtcontainers/persist: Improved test coverage 65% to 87.5%
- virtcontainers/clh_test.go: improve unit test coverage
- virtcontainers/factory: Improved test coverage
- gha: Also run k8s tests on qemu-snp
- gha: sev: fix for kata-deploy error
- gha: Also run k8s tests on qemu-sev
- Implement the "kata-ctl env" command
- runtime-rs: support keep_abnormal in toml config
- gpu: Build and Ship an GPU enabled Kernel
- kata-ctl: checks for kvm, kvm_intel modules loaded
- osbuilder: Fix D-Bus enabling in the dracut case
- snap: fix docker start fail issue
- kata-manager: Fix containerd download
- agent: Fix ut issue caused by fd double closed
- Bump ttrpc to 0.7.2 and protobuf to 3.2.0
- gpu: Add GPU enabled confguration and runtime
- gpu: Do not pass-through PCI (Host) Bridges
- cache-components: Fix caching of TDVF and QEMU for TDX
- gha: tdx: Ensure kata-deploy is removed after the tests run
- versions: Upgrade to Cloud Hypervisor v31.0
- osbuilder: Enable dbus in the dracut case
- runtime: Don't create socket file in /run/kata
- nydus_rootfs/prefetch_files: add prefetch_files for RAFS
- runtime-rs/virtio-fs: add support extra handler for cache mode.
- runtime-rs: enable nerdctl to setup cni plugin
- tdx: Add artefacts from the latest TDX tools release into main
- runtime: support non-root for clh
- gha: ci-on-push: Run k8s tests with dragonball
- rustjail: Use CPUWeight with systemd and CgroupsV2
- gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
- docs: update the rust version from version.yaml
- gha: k8s-on-aks: Set {create,delete}_aks as steps
- gha: k8s-on-aks: Fix cluster name
- gha: Also run k8s tests on AKS with dragonball
- gha: Only push images to registry after merging a PR
- gha: aks: Use D4s_v5 instance
- tools: Avoid building the kernel twice
- rustjail: Fix panic when cgroup manager fails
- runtime: add filter metrics with specific names
- gha: Use ghcr.io for the k8s CI
- GHA |Switch "kubernetes tests" from jenkins to GitHub actions
- docs: Update CNM url in networking document
- kata-ctl: add function to get platform protection.
f6e1b1152 agent: update tokio dependency
4cb83dc21 kata-ctl: update tokio dependency
df615ff25 runk: update tokio dependency
ca6892ddb runtime-rs: update tokio dependency
ca1531fe9 runtime: Use static_sandbox_resource_mgmt=true for TEEs
fa832f470 gha: k8s: Make the tests more reliable
cbb9fe8b8 config: Use standard OVMF with SEV
724437efb kata-deploy: add kata-qemu-sev runtimeclass
521dad2a4 Tests: skip CPU constraints test on SEV and SNP
72308ddb0 gha: ci-on-push: Don't skip tests for SEV
da0f92cef gha: ci-on-push: Don't skip tests for SEV-SNP
12f43bea0 gha: tdx: Use the k3s overlay for kata-cleanup
1a3f8fc1a deploy: fix shell script error
87cb98c01 osbuilder: Fix indentation in rootfs.sh
c5a59caca ppc64le: switch virtiofsd from C to rust version
bfdf0144a versions: Bump virtiofsd to 1.6.1
dd7562522 runtime: pkg/sev: Add kbs utility package for SEV pre-attestation
05de7b260 runtime: Add sev package
3a9d3c72a gpu: Rename the last bits from `gpu` to `nvidia-gpu`
4cde844f7 local-build: Fix kernel-nvidia-gpu target name
593840e07 kata-ctl: Allow INSTALL_PATH= to be specified
bdb75fb21 runtime: use enable_vcpus_pinning from toml
20cb87508 virtcontainers/qemu_test.go: Improve test coverage
b9a1db260 kata-deploy: Add http_proxy as part of the docker build
3e85bf5b1 resource-control: fix setting CPU affinities on Linux
5f3f844a1 runtime-rs: fix building instructions with respect to required Rust version
777c3dc8d kata-deploy: Do not ship the kata tarball
50cc9c582 tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0
136e2415d static-build: Download firecracker instead of building it
3bf767cfc static-build: Adjust ARCH for nydus
ac88d34e0 static-build: Use relased binary for CLH (aarch64)
73913c8eb kata-manager: Fix '-o' syntax and logic error
2856d3f23 deploy: Fix arch in image tag
e8f81ee93 Revert "kata-deploy: Use readinessProbe to ensure everything is ready"
cfe63527c release: Fix multi-arch publishing is not supported
197c33651 Dragonball: use LinuxBootConfigurator::write_bootparams to writes the boot parameters into guest memory.
4d17ea4a0 cache: Fix nvidia-snp caching version
a133fadbf cache: Fix nvidia-gpu-tdx-experimental cache URL
b9990c201 cache: Fix nvidia-gpu version
c9bf7808b cache: Update the KERNEL_FLAVOUR list to include nvidia-gpu
3665b4204 gpu: Rename `gpu` targets to `nvidia-gpu`
2c90cac75 local-build: fixup alphabetization
4da6eb588 kata-deploy: Add qemu-snp shim
14dd05375 kata-deploy: add kata-qemu-snp runtimeclass
0bb37bff7 config: Add SNP configuration
af7f2519b versions: update SEV kernel description
dbcc3b5cc local-build: fix default values for OVMF build
b8bbe6325 gha: build OVMF for tests and release
cf0ca265f local-build: Add x86_64 OVMF target
db095ddeb cache: add SNP flavor to comments
f4ee00576 gha: Build and ship QEMU for SNP
7a58a91fa docs: update SNP guide
879333bfc versions: update SNP QEMU version
38ce4a32a local-build: add support to build QEMU for SEV-SNP
5f8008b69 kata-ctl: add unit test for kvm check
a085a6d7b kata-ctl: add generic kvm check
772d4db26 gha: Build and ship SEV initrd
45fa36692 gha: Build and ship SEV OVMF
4770d3064 gha: Build and ship SEV kernel.
fb9c1fc36 runtime: Add qemu-sev config
813e4c576 runtimeClasses: add sev runtime class
af18806a8 static-build: Add caching support to sev ovmf
76ae7a3ab packaging: adding caching capability for kernel
12c5ef902 packaging: add support to build OVMF for SEV
b87820ee8 packaging: add support to build initrd for sev
e1f3b871c docs: Mark snap installation method as unmaintained
022a33de9 agent: Add context to errors when AgentConfig file is missing
b0e6a094b packaging: Add sev kernel build capability
a4c0303d8 virtcontainers: Fixed static checks for improved test coverage for fc.go
8495f830b cross-compile: Include documentation and configuration for cross-compile
13d7f39c7 gpu: Check for VFIO port assignments
6594a9329 tools: made log-parser-rs
03a8cd69c virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
9e2b7ff17 gha: sev: fix for kata-deploy error
5c9246db1 gha: Also run k8s tests on qemu-snp
c57a44436 gha: Add the ability to test qemu-snp
406419289 env: Utilize arch specific functionality to get cpu details
fb40c71a2 env: Check for root privileges
1016bc17b config: Add api to fetch config from default config path
b908a780a kata-env: Pass cmd option for file path
b1920198b config: Workaround the way agent and hypervisor configs are fetched
f2b2621de kata-env: Implement the kata-env command.
c849bdb0a gha: Also run k8s tests on qemu-sev
6bf1fc605 virtcontainers/factory: Improved test coverage
0d49ceee0 gha: Fix snap creation workflow warnings
138ada049 gpu: Cold Plug VFIO toml setting
defb64334 runtime: remove overriding ARCH value by default for ppc64le
f7ad75cb1 gpu: Cold-plug extend the api.md
0fec2e698 gpu: Add cold-plug test
f2ebdd81c utils: Get rid of spurious print statement left behind.
9a94f1f14 make: Export VERSION and COMMIT
2f81f48da config: Add file under /opt as another location to look for the config
07f7d17db config: Make the pipe_size field optional
68f635773 config: Make function to get the default conf file public
7565b3356 kata-ctl: Implement Display trait for GuestProtection enum
94a00f934 utils: Make certain constants in utils.rs public
572b338b3 gitignore: Ignore .swp and .swo editor backup files
376884b8a cargo: Update version of clap to 4.1.13
17daeb9dd warning_fix: fix warnings when build with cargo-1.68.0
521519d74 gha: Add the ability to test qemu-sev
205909fbe runtime: Fix virtiofs fd leak
5226f15c8 gha: Fix Body Line Length action flagging empty body commit messages
0f45b0faa virtcontainers/clh_test.go: improve unit test coverage
dded731db gpu: Add OVMF setting for MMIO aperture
2a830177c gpu: Add fwcfg helper function
131f056a1 gpu: Extract VFIO Functions to drivers
c8cf7ed3b gpu: Add ColdPlug of VFIO devices with devManager
e2b5e7f73 gpu: Add Rawdevices to hypervisor
6107c32d7 gpu: Assign default value to cold-plug
377ebc2ad gpu: Add configuration option for cold-plug VFIO
c18ceae10 gpu: Add new struct PCIePort
9c38204f1 virtcontainers/persist: Improved test coverage 65% to 87.5%
1c1ee8057 pkg/signals: Improved test coverage 60% to 100%
cc8ea3232 runtime-rs: support keep_abnormal in toml config
96e8470db kata-manager: Fix containerd download
432d40744 kata-ctl: checks for kvm, kvm_intel modules loaded
b1730e4a6 gpu: Add new kernel build option to usage()
3e7b90226 osbuilder: Fix D-Bus enabling in the dracut case
53c749a9d agent: Fix ut issue caused by fd double closed
2e3f19af9 agent: fix clippy warnings caused by protobuf3
4849c56fa agent: Fix unit test issue cuased by protobuf upgrade
0a582f781 trace-forwarder: remove unused crate protobuf
73253850e kata-ctl: remove unused crate ttrpc
76d2e3054 agent-ctl: Bump ttrpc from 0.6.0 to 0.7.1
eb3d20dcc protocols: Add ut for Serde
59568c79d protocols: add support for Serde
a6b4d92c8 runtime-rs: Bump ttrpc from 0.6.0 to 0.7.1
ac7c63bc6 gpu: Add containerd shim for qemu-gpu
a0cc8a75f gpu: Add a kube runtime class
a81fff706 gpu: Adding a GPU enabled configuration
8af6fc77c agent: Bump ttrpc from 0.6.0 to 0.7.1
009b42dbf protocols: Fix unit test
392732e21 protocols: Bump ttrpc from 0.6.0 to 0.7.1
f4f958d53 gpu: Do not pass-through PCI (Host) Bridges
825e76948 gpu: Add GPU support to default kernel without any TEE
e4ee07f7d gpu: Add GPU TDX experimental kernel
a1272bcf1 gha: tdx: Fix typo overlay -> overlays
3fa0890e5 cache-components: Fix TDVF caching
80e3a2d40 cache-components: Fix TDX QEMU caching
87ea43cd4 gpu: Add configuration fragment
aca6ff728 gpu: Build and Ship an GPU enabled Kernel
dc662333d runtime: Increase the dial_timeout
eb1762e81 osbuilder: Enable dbus in the dracut case
f478b9115 clh: tdx: Update timeouts for confidential guest
3b76abb36 kata-deploy: Ensure node is ready after CRI Engine restart
5ec9ae0f0 kata-deploy: Use readinessProbe to ensure everything is ready
ea386700f kata-deploy: Update podOverhead for TDX
e31efc861 gha: tdx: Use the k3s overlay
542bb0f3f gha: tdx: Set KUBECONFIG env at the job level
d7fdf19e9 gha: tdx: Delete kata-deploy after the tests finish
da35241a9 tests: k8s: Skip k8s-cpu-ns when testing TDX
db2cac34d runtime: Don't create socket file in /run/kata
6d315719f snap: fix docker start fail issue
e4b3b0887 gpu: Add proper CONFIG_LOCALVERSION depending on TEE
69ba2098f runtime-rs: remove network entities and netns
b31f103d1 runtime-rs: enable nerdctl cni plugin
69d7a959c gha: ci-on-push: Run tests on TDX
5a0727ecb kata-deploy: Ship kata-qemu-tdx runtimeClass
98682805b config: Add configuration for QEMU TDX
3e1580019 govmm: Directly pass the firmware using -bios with TDX
3c5ffb0c8 govmm: Set "sept-ve-disable=on"
ed145365e runtime/qemu: Drop "kvm-type=tdx"
25b3cdd38 virtcontainers: Drop check for the `tdx` CPU flag
01bdacb4e virtcontainers: Also check /sys/firmwares/tdx for TDX
9feec533c cache: Add ability to cache OVMF
ce8d98251 gha: Build and ship the OVMF for TDX
39c3fab7b local-build: Add support to build OVMF for TDX
054174d3e versions: Bump OVMF for TDX
800fb49da packaging: Add get_ovmf_image_name() helper
fbf03d7ac cache: Document kernel-tdx-experimental
5d79e9696 cache: Add a space to ease the reading of the kernel flavours
6e4726e45 cache: Fix typos
fc22ed0a8 gha: Build and ship the Kernel for TDX
502844ced local-build: Add support to build Kernel for TDX
b2585eecf local-build: Avoid code duplication building the kernel
f33345c31 versions: Update Kernel TDX version
20ab2c242 versions: Move Kernel TDX to its own experimental entry
3d9ce3982 cache: Allow specifying the QEMU_FLAVOUR
33dc6c65a gha: Build and ship QEMU for TDX
eceaae30a local-build: Add support to build QEMU for TDX
f7b7c187e static-build: Improve qemu-experimental build script
3018c9ad5 versions: Update QEMU TDX version
800ee5cd8 versions: Move QEMU TDX to its own experimental entry
1315bb45f local-build: Add dragonball kernel to the `all` target
73e108136 local-build: Rename non vanilla kernel build functions
1d851b4be local-build: Cosmetic changes in build targets
49ce685eb gha: k8s-on-aks: Always delete the AKS cluster
e2a770df5 gha: ci-on-push: Run k8s tests with dragonball
d1f550bd1 docs: update the rust version from versions.yaml
f3595e48b nydus_rootfs/prefetch_files: add prefetch_files for RAFS
3bfaafbf4 fix: oci hook
c1fbaae8d rustjail: Use CPUWeight with systemd and CgroupsV2
375187e04 versions: Upgrade to Cloud Hypervisor v31.0
79f3047f0 gha: k8s-on-aks: {create,delete} AKS must be a coded-in step
2f35b4d4e gha: ci-on-push: Only run on `main` branch
e7bd2545e Revert "gha: ci-on-push: Depend on Commit Message Check"
0d96d4963 Revert "gha: ci-on-push: Adjust to using workflow_run"
c7ee45f7e Revert "gha: ci-on-push: Adapt chained jobs to workflow_run"
5d4d72064 Revert "gha: k8s-on-aks: Fix cluster name"
13d857a56 gha: k8s-on-aks: Set {create,delete}_aks as steps
dc6569dbb runtime-rs/virtio-fs: add support extra handler for cache mode.
85cc5bb53 gha: k8s-on-aks: Fix cluster name
1688e4f3f gha: aks: Use D4s_v5 instance
108d80a86 gha: Add the ability to also test Dragonball
2550d4462 gha: build-kata-static-tarball: Only push to registry after merge
e81b8b8ee local-build: build-and-upload-payload is not quay.io specific
13929fc61 gha: publish-kata-deploy-payload: Improve registry login
41026f003 gha: payload-after-push: Pass registry / repo as inputs
7855b4306 gha: ci-on-push: Adapt chained jobs to workflow_run
3a760a157 gha: ci-on-push: Adjust to using workflow_run
a159ffdba gha: ci-on-push: Depend on Commit Message Check
8086c75f6 gha: Also run k8s tests on AKS with dragonball
fe86c08a6 tools: Avoid building the kernel twice
3215860a4 gha: Set ci-on-push to run on `pull_request_target`
d17dfe4cd gha: Use ghcr.io for the k8s CI
b661e0cf3 rustjail: Add anyhow context for D-Bus connections
60c62c3b6 gha: Remove kata-deploy-test.yaml
43894e945 gha: Remove kata-deploy-push.yaml
cab9ca043 gha: Add a CI pipeline for Kata Containers
53b526b6b gha: k8s: Add snippet to run k8s tests on aks clusters
c444c24bc gha: aks: Add snippets to create / delete aks clusters
11e0099fb tests: Move k8s tests to this repo
73be4bd3f gha: Update actions for release.yaml
d38d7fbf1 gha: Remove code duplication from release.yaml
56331bd7b gha: Split payload-after-push-*.yaml
a552a1953 docs: Update CNM url in networking document
7796e6ccc rustjail: Fix minor grammatical error in function name
41fdda1d8 rustjail: Do not unwrap potential error with cgroup manager
a914283ce kata-ctl: add function to get platform protection.
0f7351556 runtime: add filter metrics with specific names
cbe6ad903 runtime: support non-root for clh
d3bb25418 utils: Add function to check vhost-vsock
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 800ee5cd88.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3018c9ad51.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If a hypervisor debug console is enabled and sandbox_cgroup_only is set,
the hypervisor can fail to open /dev/ptmx, which prevents the sandbox
from launching.
This is caused by the absence of a device cgroup entry to allow access
to /dev/ptmx. When sandbox_cgroup_only is not set, the hypervisor
inherits the default unrestrcited device cgroup, but with it enabled it
runs into allow / deny list restrictions.
Fix by adding an allowlist entry for /dev/ptmx when debug is enabled,
sandbox_cgroup_only is true, and no /dev/ptmx is already in the list of
devices.
Fixes: #6870
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
This PR updates the container network model url that is part of the
virtcontainers documentation.
Fixes#6889
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This reverts commit 20ab2c2420.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit f33345c311.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This partially reverts commit 054174d3e6
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 25b3cdd38c.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit ed145365ec.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3c5ffb0c85.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3e15800199.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When this option is enabled the runtime will attempt to determine the
appropriate sandbox size (memory, CPU) before booting the virtual
machine.
As TEEs do not support memory and CPU hotplug, this approach must be
used.
Fixes: #6818
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When this option is enabled the runtime will attempt to determine the
appropriate sandbox size (memory, CPU) before booting the virtual
machine.
As TEEs do not support memory and CPU hotplug, this approach must be
used.
Fixes: #6818
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We like it or not, every now and then we'll have to deal with flaky
tests, and our tests using GHA are not exempt from that fact.
With this simple commit, we're trying to improve the reliability of the
tests in a few different fronts:
* Giving enough time for the script used by kata-deploy to be executed
* We've hit issues as the kata-deploy pod is considered "Ready" at the
moment it starts running, not when it finishes the needed setup. We
should also be looking on how to solve this on the kata-deploy side
but, for now, let's ensure our tests do not break with the current
kata-deploy behavior.
* Merging the "Deploy kata-deploy" and "Run tests" steps
* We've hit issues re-running tests and seeing even more failures than
the ones we're trying to debug, as a step will simply be taken as
succeeded as part of the re-run, in case it was successful executed
as part of the first run. This causes issues with the kata-deploy
deployment, as the tests would start running before even having the
node set up for running Kata Containers.
Fixes: #6865#6649
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The AmdSev firmware package should be used with
measured direct boot. If the expected hashes are not
injected into the firmware binary by the VMM, the
guest will not boot. This is required for security.
Currently the main branch does not have the extended
shim support for SEV, which tells the VMM to inject
the expected hashes.
We ship the standard OVMF package to use with SNP,
so let's switch SEV to that for now. This will need
to be changed back when shim support for SEV(-ES)
is added to main.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
In order to populate containerd config file with
support for SEV, we need to add the qemu-sev shim
to the kata-deploy script.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Currently Kata does not support memory / CPU hotplug for SEV or
SEV-SNP so we need to skip tests that rely on it.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Now that SEV artifacts are built by GHA, remove
conditional that skips tests when using qemu-sev.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Now that we have SNP artifacts in place and they are built via gha,
remove the condition that skips the tests for SNP.
Fixes: #6809
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
When updating an interface, there's maybe an existed
interface whose name would be the same with the updated
required name, thus it would update failed with interface
name existed error. Thus we should rename the existed interface
with an temporary name and swap it with the previouse interface
name last.
Fixes: #6842
Signed-off-by: fupan <fupan.lfp@antgroup.com>
As the TDX CI runs on k3s, we must ensure the cleanup, as already done
for the deploy, used the k3s overlay.
Fixes: #6857
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR replaces single spaces to tabs in order to fix the
indentation of the rootfs script.
Fixes#6848
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We have been using the C version of virtiofsd on ppc64le. Now that the issue with
rust virtiofsd have been fixed, let's switch to it.
Fixes: #4259
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
virtiofsd v1.6.1 has been released with the fixes required for running
successfully on ppc64le.
Fixes: #4259
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
Supports both online and offline modes of interaction with simple-kbs
for SEV/SEV-ES confidential guests.
Fixes: #6795
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
The sev package provides utilities for launching AMD SEV and SEV-ES
confidential guests.
Fixes: #6795
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Let's specifically name the `gpu` runtime class as `nvidia-gpu`. By
doing this we keep the door open and ease the life of the next vendor
adding GPU support for Kata Containers.
Fixes: #6553
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Update the kata-ctl install rule to allow it to be installed to a given directory
The Makefile was updated to use an INSTALL_PATH variable to track where the
kata-ctl binary should be installed. If the user doesn't specify anything,
then it uses the default path that cargo uses. Otherwise, it will install it
in the directory that the user specified. The README.md file was also updated
to show how to use the new option.
Fixes#5403
Co-authored-by: Cesar Tamayo <cesar.tamayo@intel.com>
Co-authored-by: Kevin Mora Jimenez <kevin.mora.jimenez@intel.com>
Co-authored-by: Narendra Patel <narendra.g.patel@intel.com>
Co-authored-by: Ray Karrenbauer <ray.karrenbauer@intel.com>
Co-authored-by: Srinath Duraisamy <srinath.duraisamy@intel.com>
Signed-off-by: Narendra Patel <narendra.g.patel@intel.com>
Rework TestQemuCreateVM routine to be a table driven test with
various config variations passed to it. After CreateVM a handful
of additional functions are exercised to improve code-coverage.
Also add partial coverage for StartVM routine.
Currently improving from 19.7% to 35.7%
Credit PR to Hackathon Team3
Fixes: #267
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Add http_proxy and https_proxy as part of the docker build arguments
in order to build properly when we are behind a proxy.
Fixes#6834
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
With this fix the vCPU pinning feature chooses the correct
physical cores to pin the vCPU threads on rather than always using core 0.
Fixes#6831
Signed-off-by: Peteris Rudzusiks <rye@stripe.com>
`uname -m` produces `x86_64`, but container image convention
is to use `amd64`, so update this in the tag
Fixes: #6820
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 2856d3f23d)
There's absolutely no reason to ship the kata-static tarball as part of
the payload image, as:
* The tarball is already part of the release process
* The payload image already has uncompressed content of the tarball
* The tarball itself is not used anywhere by the kata-deploy scripts
Fixes: #6828
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There's no reason for us to build firecracker instead of simply
downloading the official released tarball, as tarballs are provided for
the architectures we want to use them.
Fixes: #6770
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When building from aarch64, just use "arm64" as that's what's used in
the name of the released nydus tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There's no need to build Cloud Hypervisor aarch64 as, for a few releases
already, Cloud Hypervisor provides an official release binary for the
architecture.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Fix the syntax and logic error that is only displayed if the user runs
the script with `-o`. This option requests that "only" Kata Containers
is installed and stops containerd from being installed.
Fixes: #6822.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
`uname -m` produces `x86_64`, but container image convention
is to use `amd64`, so update this in the tag
Fixes: #6820
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This reverts commit 5ec9ae0f04, for two
main reasons:
* The readinessProbe was misintepreted by myself when working on the
original PR
* It's actually causing issues, as the pod ends up marked as not
healthy.
When release is published, kata-deploy payload and kata-static package
can support multi-arch publishing.
Fixes: #6449
Signed-off-by: SinghWang <wangxin_0611@126.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Create multi-arch manifests for the ci and release runtime-payload
that are tagged with the commit, for use in the operator
Fixes: #6814
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
All the kernel-foo instances, such as "kernel-sev" or "kernel-snp",
should be transformed into "kernel.foo" when looking at the
versions.yaml file.
This was already done for SEV, but missed on the SNP case.
Fixes: #6777
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We were passing "kernel-nvidia-gpu-tdx", missing the "-experimental"
part, leading to a non-valid URL.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
c9bf7808b6 introduced the logic to
properly get the version of nvidia-gpu kernels, but one important part
was dropped during the rebase into main, which is actually getting the
correct version of the kernel.
Fixing this now, and using the old issue as reference.
Fixes: #6777
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to make sure that, when caching a `-nvidia-gpu` kernel, we still
look at the version of the base kernel used to build the nvidia-gpu
drivers, as the ${vendor}-gpu kernels are based on already existing
entries in the versions.yaml file and do not require a new entry to be
added.
Fixes: #6777
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
A few pieces of the local-build tooling are supposed to be
alphabetized. Fixup a couple minor issues that have accumulated.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Now that we have the SNP components in place, make sure that
kata-deploy knows about the qemu-snp shim so that it will be
added to containerd config.
Fixes: #6575
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Since SEV-SNP has limited hotplug support, increase
the pod overhead to account for fixed resource usage.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
SNP requires many specific configurations, so let's make
a new SNP configuration file that we can use with the
kata-qemu-snp runtime class.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
SNP and SEV will share a (guest) kernel. Update the description
in versions.yaml to mention this.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
The x86_64 package of OVMF is required for deployments
that don't use kernel hashes, which includes SEV-SNP
in the short term. We should keep this in the bundle
in the long term in case someone wants to disable
kernel hashes.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Add targets to build the "plain" x86_64 OVMF.
This will be used by anyone who is using SEV or SNP
without kernel hashes. The SNP QEMU does not yet
support kernel hashes so the OvmfPkg will be used
by default.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Since we reshuffled versions.yaml, update the guide so that
we can find the SNP QEMU info.
Once runtime support is merged we should overhaul or remove
this guide, but let's keep it for now.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Refactor SNP QEMU entry in versions.yaml to match
qemu-experimental and qemu-tdx-experimental.
Also, update the version of QEMU to what we are using
in CCv0. This is the non-UPM QEMU and it does not
have kernel hashes support.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Add Make targets and helper functions to build the QEMU
needed for SEV-SNP.
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Check that kvm test fails when run as non-root and when device specified
is not /dev/kvm.
Fixes#5338
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Add kvm check using ioctl macro to create a syscall that checks the kvm
api version and if creation of a vm is successful.
Fixes#5338
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
We have code that builds initrd for SEV.
thus, adding that to the test and release process.
Fixes: #6572
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
SEV requires custom kernel arguments when building.
Thus, adding it to the test and release process.
Fixes: #6572
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
Adding config file that can be used with qemu-sev runtime class.
Since SEV has limited hotplug support, increase
the pod overhead to account for fixed resource usage.
Fixes: #6572
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
SEV requires special OVMF.
Now that we have ability to build this custom OVMF, let's optimize
it by caching so that we don't have to build it for every run.
Fixes: sev: #6572
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
The SEV initrd build requires kernel modules.
So, for SEV case, we need to cache kernel modules tarball in
addition to kernel tarball.
Fixes: #6572
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
SEV requires special OVMF to work with kernel hashes.
Thus, adding changes that builds this custom OVMF for SEV.
Fixes: #6572
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
We need special initrd for SEV. The work on SEV initrd is based on
Ubuntu. Thus, adding another entry in versions.yaml
This binary will have '-sev' suffix to distinguish it from the generic
binary.
Fixes: #6572
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
The snap package is no longer being maintained so update the docs to
warn readers.
We'll remove the snap installation docs in a few weeks.
See: #6769.
Fixes: #6793.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
When the agent config file is missing, the panic message says "no such file or
directory" but doesn't inform the user about which file was missing. Add
context to the parsing (with filename) and to the from_config_file() calls
(with information where the path is coming from).
Fixes: #6771
Depends-on: github.com/kata-containers/tests#5627
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.
Fixed very simple static check fail on line 202.
Fixes: #266
Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
`cross` is an open source tool that provides zero-setup cross compile
for rust binaries. Add documentation on this tool for compiling
kata-ctl tool and Cross.toml file that provides required configuration
for installing dependencies for various targets.
This is pretty useful for a developer to make sure code compiles and
passes checks for various architectures.
Fixes: #6765
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add configurable context timeout for StopVM in remote hypervisor
similar to StartVM
Fixes: #6730
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Eventual replacement of kata-log-parser, but for now replicates its
functionaility for the new runtime-rs syntax. Takes in log files,
parses, sorts by timestamp, spits them out in json, csv, xml, toml, and
a few others.
Fixes#5350
Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.
Fixes: #266
Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
With the changes proposed as part of this PR, a qemu-snp cluster
will be created but no tests will be performed.
GitHub Actions will only run the tests using the workflows that are
part of the **target** branch, instead of the using the ones coming
from the PR. No way to work around this for now.
After this commit is merged, the tests (not the yaml files for the
actions) will be altered in order for the checkout action to help in
this case.
Fixes: #6722
Signed-off-by: Ryan Savino <ryan.savino@amd.com>
Have kata-env call architecture specific function to get cpu details
instead of generic function to get cpu details that works only for
certain architectures. The functionality for cpu details has been fully
implemented for x86_64 and arm architectures, but needs to be
implemented for s390 and powerpc.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add ability to write the environment information to a file
or stdout if file path is absent.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This is essentially a workaround for the issue:
https://github.com/kata-containers/kata-containers/issues/5954
runtime-rs chnages the Kata config format adding agent_name and
hypervisor_name which are then used as keys to fetch the agent and
hypervisor configs. This will not work for older configs.
So use the first entry in the hashmaps to fetch the configs as a
workaround while the config change issue is resolved.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Expanded tests on factory_test.go to cover more lines of code. Coverage went from 34% to 41.5% in the case of user-mode run tests,
and from 77.7% to 84% in the case of priviledge-mode run tests.
Fixes: #260
Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
Fix recurring issues of failing to install dependencies due to stale apt cache.
Uprev actions/checkout to v3 to resolve issue "Node.js 12 actions are deprecated."
Fixes: #5659
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Currently, ARCH value is being set to powerpc64le by default.
powerpc64le is only right in context of rust and any operation
which might use this variable for a different purpose would fail on ppc64le.
Fixes: #6741
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
These will be consumed by kata-ctl, so export these so that
they can be used to replace variables available to the rust binary.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Most of kata installation tools use this path for installation, so
add this to the paths to look for the configuration.toml file.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add the serde default attribute to the field so that parsing
can continue if this field is not present.
The agent assumes a default value for this, so it is not required
by the user to provide a value here.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
With the changes proposed as part of this PR, a qemu-sev cluster will
be created but no tests will be performed.
GitHub Actions will only run the tests using the workflows that are
part of the **target** branch, instead of the using the ones coming
from the PR. No way to work around this for now.
After this commit is merged, the tests (not the yaml files for the
actions) will be altered in order for the checkout action to help in this
case.
Fixes: #6711
Signed-off-by: Ryan Savino <ryan.savino@amd.com>
The kata runtime invokes removeStaleVirtiofsShareMounts after
a container is stopped to clean up the stale virtiofs file caches.
Fixes: #6455
Signed-off-by: Feng Wang <fwang@confluent.io>
Change the Body Line Length workflow to not trigger when the commit
message contains only a message without a body. Other workflows will
flag the missing body sections, and it was confusing to have an error
message that said 'Body line too long (max 150)' when this was not
actually the case.
Fixes: #5561
Co-authored-by: Jayant Singh <jayant.singh@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
Added driver util function for easier handling of VFIO
devices outside of the VFIO module. At the sandbox level
we may need to set options depending if we have a VFIO/PCIe
device, like the fwCfg for confiential guests.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Some functions may be used in other modules then only in
the VFIO module, extract them and make them available to
other layers like sandbox.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
If we have a VFIO device and cold-plug is enabled
we mark each device as ColdPlug=true and let the VFIO
module do the attaching.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
RawDevics are used to get PCIe device info early before the sandbox
is started to make better PCIe topology decisions
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
For the hypervisor to distinguish between PCIe components, adding
a new enum that can be used for hot-plug and cold-plug of PCIe devices
Fixes: #6687
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Expanded tests on signals_test.go to cover more lines of code. 'go test' won't show 100% coverage (only 66.7%), because one test need to spawn a new
process (since it is testing a function that calls os.Exit(1)).
Fixes: #256
Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
This patch adds keep_abnormal in runtime config. If keep_abnormal =
true, it means that 1) if the runtime exits abnormally, the cleanup
process will be skipped, and 2) the runtime will not exit even if the
health check fails.
This option is typically used to retain abnormal information for
debugging and should NOT be enabled by default.
Fixes: #6717
Signed-off-by: mengze <mengze@linux.alibaba.com>
Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
`image_policy_file` is the URI of the image security policy file which
is used to control the allowlist and denylist of the containers to be
run inside the guest.
`image_registry_auth_file` is the URI of the auth file which is needed
to access an private registry.
`simple_signing_sigstore_config` is a configuration file that shows
where the signature files are when simple signing is enabled.
All the config items support both `kbs://..` scheme, `file://..` scheme,
or a direct absolute path `/...`, which means either the file is to be
fetched from a remote KBS that `aa_kbc_params` points to, or from the
local filesystem of the guest.
Fixes: #6640
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Newer containerd releases have an additional static package published.
Because of this, download_url contains two urls causing curl to fail.
To resolve this, pick the first url from the containerd releases to
download containerd.
Fixes: #6695
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Ensure that kvm and kvm_intel modules are loaded.
Renames the get_cpu_info() function to read_file_contents()
Fixes#5332
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
- D-Bus enabling now occurs only in setup_rootfs (instead of
prepare_overlay and setup_rootfs)
- Adjust permissions of / so dbus-broker will be able to traverse FS
These changes enables kata-agent to successfully communicate with D-Bus.
Fixes#6677
Signed-off-by: Vladimir <amigo.elite@gmail.com>
Never ever try to close the same fd double times, even in a unit test.
A file descriptor is a number which will be reused, so when you close
the same number twice you may close another file descriptor in the second
time and then there will be an error 'Bad file descriptor (os error 9)'
while the wrongly closed fd is being used.
Fixes: #6679
Signed-off-by: Tim Zhang <tim@hyper.sh>
Last but not least add the continerd shim configuration
pointing to the correct configuration-<shim>.toml
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
We need to set hotplug on pci root port and enable at least one
root port. Also set the guest-hooks-dir to the correct path
Fixes: #6675
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
On some systems a GPU is in a IOMMU group with a PCI Bridge and
PCI Host Bridge. Per default no PCI Bridge needs to be passed-through.
When scanning the IOMMU group, ignore devices with a 0x60 class ID prefix.
Fixes: #6663
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
With each release make sure we ship a GPU and TEE enabled kernel
This adds tdx-experimental kernel support
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The beauty of GHA not allowing us to easily test changes in the yaml
files as part of the PR has hit us again. :-/
The correct path for the k3s deployment is
tools/packaging/kata-deploy/kata-deploy/overlays/k3s instead of
tools/packaging/kata-deploy/kata-deploy/overlay/k3s.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDVF caching is not working as the tarball name is incorrect. The result
expected is kata-static-tdvf.tar.xz, but it's looking for
kata-static-tdx.tar.xz.
This happens as a logic to convert tdx -> tdvf has been added as part of
the building scripts, but I missed doing this as part of the caching
scripts.
Fixes: #6669
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDX QEMU caching is not working as expected, as we're checking for its
version looking at "assets.hypervisor.${QEMU_FLAVOUR}.version", which is
correct for standard QEMU. However, for TDX QEMU we should be checking
for "assets.hypervisor.${QEMU_FLAVOUR}.tag"
Fixes: #6668
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When testing on AKS, we've been hitting the dial_timeout every now and
then. Let's increase it to 45 seconds (instead of 30) for all the VMMs,
and to 60 seconfs in case of TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The agent now offloads cgroup configuration to systemd when
possible. This requires to enable D-Bus in order to communicate
with systemd.
Fixes#6657
Signed-off-by: Greg Kurz <groug@kaod.org>
Booting up TDX takes more time than booting up a normal VM. Those
values are being already used as part of the CCv0 branch, and we're just
bringing them to the `main` branch as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure the node is ready after the CRI Engine restart, otherwise
we may proceed and scripts may simply fail if they try to deploy a pod
while the CRI Engine is not yet restarted (and, consequently, the node
is not Ready).
Related: #6649
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
readinessProbe will help us to only have the kata-deploy pod marked as
Ready when it finishes all the needed configurations in the node.
Related: #6649
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As TEEs cannot hotplug memory / CPU, we *must* consider the default
values for those as part of the podOverhead.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the TDX machine is using k3s, let's make sure we're deploying
kat-deploy using the k3s overlay.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We must ensure that no kata-deploy is left behind after the tests
finish, otherwise it may interfere with the next run.
Fixes: #6647
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The socket file for shim management is created in /run/kata
and it isn't deleted after the container is stopped. After
running and stopping thousands of containers /run folder
will run out of space.
Fixes#6622
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Co-authored-by: Greg Kurz <groug@kaod.org>
In preparation for CoCo 0.5 release, updated td-shim to
commit 10568bab569bc40034cc973f26fbb0a768dcc3e3
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
In preparation for CoCo 0.5 release, updated attestation-agent to
commit c939d211fe5ac497715008e36161aff20cabb6e6
Fixes#6650
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
If conf_guest is set we need to update the CONFIG_LOCALVERSION
to match the suffix created in install_kata
-nvidia-gpu-{snp|tdx}, the linux headers will be named the very
same if build with make deb-pkg for TDX or SNP.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
1. when we use nerdctl to setup network for kata, no netns is created by
nerdctl, kata need to create netns by its own
2. after start VM, nerdctl will call cni plugin via oci hook, we need to
rescan the netns after the interfaces have been created, and hotplug
the network device into the VM
Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Now that we've added a TDX capable external runner, let's make sure we
also run the basic tests using TDX.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we configure containerd for the kata-qemu-tdx handler
and ship the kata-qemu-tdx runtime class for kubernetes.
Fixes: #6537
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the QEMU configuration for TDX differs quite a lot from the normal
QEMU configuration, let's add a new configuration file for the QEMU TDX.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Since TDX doesn't support readonly memslot, TDVF cannot be mapped as
pflash device and it actually works as RAM. "-bios" option is chosen to
load TDVF.
OVMF is the opensource firmware that implements the TDVF support. Thus
the command line to specify and load TDVF is ``-bios OVMF.fd``
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we also check /sys/firmwares/tdx for TDX guest
protection, as the location may depend on whether TDX Seam is being used
or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the ability to cache OVMF, which right now we're only building
and shipping it for TDX.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's build the OVMF with TDX support as part of our tests, and let's
ship it as part of our releases.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed targets and modifications to be able to build
OVMF for TDX as part of the local-build scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's update the OVMF for TDX version to what's the latest tested
release of the Intel TDX tools with Kata Containers.
This change requires a newer version of `nasm` than the one provided by
the container used to build the project. This change will also be
needed for SEV-SNP and was originally done by Alex Carter (thanks!).
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
As we'll be using this from different places in the near future, let's
create a helper function as part of the libs.sh.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make users aware of the cache_components_main.sh that they can
also cache the kernel-tdx-experimental builds.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's build the kernel with TDX support as part of our tests, and let's
ship it as part of our releases.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed targets and modifications to be able to build
kernel-tdx-experimental as part of the local-build scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create a `install_kernel_helper()` function, as it was already
done for QEMU, and rely on that when calling `install_kernel` and
`install_kernel_dragonball_experimental`.
This helps us to reduce the code duplication by a fair amount.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's update the Kernel TDX version to what's the latest tested release
of the Intel TDX tools with Kata Containers.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although we've been providing users a way to build kernel with TDX
support, this must be moved to its own experimental entry instead of how
it currently is.
The reason for that is because the patches are not yet merged into
kernel, and this is still an experimental build of the project.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's do what we already did when caching the kernel, and allow passing
a FLAVOUR of the project to build.
By doing this we can re-use the same function used to cache QEMU to also
cache any kind of experimental QEMU that we may happen to have.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed targets and modifications to be able to build
qemu-tdx-experimental as part of the local-build scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure the `qemu_suffix` and `qemu_tarball_name` can be
specified. With this we make it really easy to reuse this script for
any addition flavour of an experimental QEMU that ends up having to be
built (specifically looking at the ones for Confidential Containers
here).
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's update the QEMU TDX version to what's the latest tested release of
the Intel TDX tools with Kata Containers.
In order to do such update, we had to relax the checks on the QEMU
version for some of the configuration options, as those were removed
right after the window was open for the 7.1.0 development (thus the
7.0.50 check).
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although we've been providing users a way to build QEMU with TDX
support, this must be moved to its own experimental entry instead of how
it currently is.
The reason for that is because the patches are not yet merged into QEMU,
and this is still an experimental build of the project.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the dragonball kernel is shipped as part of our releases, it must be
added to the `all` target.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In order to make it easier to read, let's just rename the
install_dragonball_experimental_kernel and install_experimental_kernel
to install_kernel_dragonball_experimental and
install_kernel_experimental, respectively.
This allows us to quickly get to those functions when looking for
`install_kernel`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Now that the infra for running dragonball tests has been enabled, let's
actually make sure to have them running on each PR.
The tests skipped are:
* `k8s-cpu-ns.bats`, as CPU resize doesn't seem to be yet properly
supported on runtime-rs
* https://github.com/kata-containers/kata-containers/issues/6621Fixes: #6605
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
A sandbox annotation used to specify prefetch_files.list
path the container image being used, and runtime will pass
it to Hypervisor to search for corresponding prefetch file:
format looks like:
"io.katacontainers.config.hypervisor.prefetch_files.list"
= /path/to/<uid>/xyz.com/fedora:36/prefetch_file.list
Fixes: #6582
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
1. when do the deserialization for the oci hook, we should use camel
case for createRuntime
2. we should pass the dir of bundle path instead of the path of
config.json
Fixes:#4693
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
I should have seen this coming, but currently the "create" and "delete"
AKS workflows cannot be imported and uses as a job's step, resulting on
an error trying to find the correspondent action.yaml file for those.
Fixes: #6630
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're only running this workflow when PRs are opened
against the main branch.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit a159ffdba7.
Unfortunately we have to revert the PRs related to the switch done to
using `workflow_run` instead of `pull_request_target`. The reason for
that being that we can only mark jobs as required if they are targetting
PRs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3a760a157a.
Unfortunately we have to revert the PRs related to the switch done to
using `workflow_run` instead of `pull_request_target`. The reason for
that being that we can only mark jobs as required if they are targetting
PRs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 7855b43062.
Unfortunately we have to revert the PRs related to the switch done to
using `workflow_run` instead of `pull_request_target`. The reason for
that being that we can only mark jobs as required if they are targetting
PRs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 85cc5bb534.
Unfortunately we have to revert the PRs related to the switch done to
using `workflow_run` instead of `pull_request_target`. The reason for
that being that we can only mark jobs as required if they are targetting
PRs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've been currently using {create,delete}_aks as jobs. However, it
means that if the tests fail we'll end up deleting the AKS cluster (as
expected), but not having a way to recreate the cluster without
re-running all jobs, which is a waste of resources.
Fixes: #6628
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add support for virtiofsd when virtio_fs_extra_args with
"-o cache auto, ..." users specified.
Fixes: #6615
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This patch updates the template configuration file for
the remote hypervisor to set static_sandbox_resource_mgmt
to be true. The remote hypervisor uses the peer pod config
to determine the sandbox size, so requires this to be set to
true by default.
Fixes: #6616
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
This was missed from the last series, as GHA will use the "target
branch" yaml file to start the workflow.
Basically we changed the name of the cluster created to stop relying on
the PR number, as that's not easily accessible on `workflow_run`.
Fixes: #6611
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Adding SNP components needed to the x86 payload push and release payloads.
QEMU is needed in both the after-push payload and release payload, while OVMF is only
missing from the release workflow.
Fixes: #6600
Signed-Off-By: Alex Carter <AlexCarter@ibm.com>
It's been pointed out that D4s_v5 instances are more powerful than the
D4s_v3 ones, and have the very same price. With this in mind, let's
switch to the newer machines.
Fixes: #6606
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With the changes proposed as part of this PR, an AKS cluster will be
created but no tests will be performed.
The reason we have to do this is because GitHub Actions will only run
the tests using the workflows that are part of the **target** branch,
instead of the using the ones coming from the PR, and we didn't find yet
a way to work this around.
Once this commit is in, we'll actually change the tests themselves (not
the yaml files for the actions), as those will be the ones we want as
the checkout action helps us on this case.
Fixes: #6583
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
56331bd7bc oversaw the fact that we
mistakenly tried to push the build containers to the registry for a PR,
rather than doing so only when the code is merged.
As the workflow is now shared between different actions, let's introduce
an input variable to specify which are the cases we actually need to
perform a push to the registry.
Fixes: #6592
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's just print "to the registry" instead of printing "to quay.io", as
the registry used is not tied to quay.io.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We made registry / repo mandatory, but we only adapted that to the amd64
job. Let's fix it now and make sure this is also passed to the arm64
and s390x jobs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're using the `workflow_run` event, the checkout action would
pull the **current target branch** instead of the PR one.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The way previously used to get the PR's commit sha can only be used with
`pull_request*` kind of events.
Let's adapt it to the `workflow_run` now that we're using it.
With this change we ended up dropping the PR number from the tarball
suffix, as that's not straightforward to get and, to be honest, not a
unique differentiator that would justify the effort.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make this workflow dependent of the commit message check, and only
start it if the commit message check one passes.
As a side effect, this allows us to run this specific workflow using
secrets, without having to rely on `pull_request_target`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is to add an artifact named `cc-rootfs-initrd` to a payload image
because it is identified that the artifact is required to run a cc-operator
e2e test.
Fixes: #6544
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As already done for Cloud Hypervisor and QEMU, let's make sure we can
run the AKS tests using dragonball.
Fixes: #6583
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
attestation-agent depends on tdx-attest-rs when cc_kbc is enabled, which
depends on libtdx-attest.so. Include the dev package in build container,
and the runtime package in the built rootfs.
The build of tdx-attest-sys (which is a dep of tdx-attest-rs) uses
bindgen, which requires libclang so install that in the build container
as well.
We specify the tdx stack DCAP v1.15
Fixes: #6519
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Two different kernel build targets (build,install) have both instructions to
build the kernel, hence it was executed twice. Install should only do
install and build should only do build.
Fixes: #6588
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This is less secure than running the PR on `pull_request`, and will
require using an additional `ok-to-test` label to make sure someone
deliverately ran the actions coming from a forked repo.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's switch to using the `ghcr.io` registry for the k8s CI, as this
will save us some troubles on running the CI with PRs coming from forked
repos.
Fixes: #6587
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In cases where the D-Bus connection fails, add a little additional context about
the origin of the error.
Fixes: 6561
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Suggested-by: Archana Shinde <archana.m.shinde@intel.com>
Spell-checked-by: Greg Kurz <gkurz@redhat.com>
This workflow becomes redundant as we're already testing kubernetes
using kata-deploy, and also testing it on AKS.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is the very first step to replacing the Jenkins CI, and I've
decided to start with an x86_64 approach only (although easily
expansible for other arches as soon as they're ready to switch), and to
start running our kubernetes tests (now running on AKS).
Fixes: #6541
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will be shortly used as part of a newly created GitHub action which
will replace our Jenkins CI.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Those will be shortly used as part of a newly added GitHub action for
testing k8s tests on Azure.
They've been created using the secrets we already have exposed as part
of our GitHub, and they follow a similar way to authenticate to Azure /
create an AKS cluster as done in the `/test-kata-deploy` action.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The first part of simplifying things to have all our tests using GitHub
actions is moving the k8s tests to this repo, as those will be the first
vict^W targets to be migrated to GitHub actions.
Those tests have been slightly adapted, mainly related to what they load
/ import, so they are more self-contained and do not require us bringing
a lot of scripts from the tests repo here.
A few scripts were also dropped along the way, as we no longer plan to
deploy kubernetes as part of every single run, but rather assume there
will always be k8s running whenever we land to run those tests.
It's important to mention that a few tests were not added here:
* k8s-block-volume:
* k8s-file-volume:
* k8s-volume:
* k8s-ro-volume:
These tests depend on some sort of volume being created on the
kubernetes node where the test will run, and this won't fly as the
tests will run from a GitHub runner, targetting a different machine
where kubernetes will be running.
* https://github.com/kata-containers/kata-containers/issues/6566
* k8s-hugepages: This test depends a whole lot on the host where it
lands and right now we cannot assume anything about that anymore, as
the tests will run from a GitHub runner, targetting a different
machine where kubernetes will be running.
* https://github.com/kata-containers/kata-containers/issues/6567
* k8s-expose-ip: This is simply hanging when running on AKS and has to
be debugged in order to figure out the root cause of that, and then
adapted to also work on AKS.
* https://github.com/kata-containers/kata-containers/issues/6578
Till those issues are solved, we'll keep running a jenkins job with
hose tests to avoid any possible regression.
Last but not least, I've decided to **not** keep the history when
bringing those tests here, otherwise we'd end up polluting a lot the
history of this repo, without any clear benefit on doing so.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We can easily re-use the newly added build-kata-static-tarball-*.yaml as
part of the release.yaml file.
By doing this we consolidate on how we build the components accross our
actions.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's split those actions into two different ones:
* Build the kata-static tarball
* Publish the kata-deploy payload
We're doing this as, later in this series we'll start taking advantage
of both pieces.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR updates the url for the Container Network Model
in the network document.
Fixes#6563
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
There can be an error while connecting to the cgroups managager, for
example a `ENOENT` if a file is not found. Make sure that this is
reported through the proper channels instead of causing a `panic()`
that does not provide much information.
Fixes: #6561
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Reported-by: Greg Kurz <gkurz@redhat.com>
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.
Note that configmap updates takes significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.
Fixes: #6341
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
`ttrpc=true` parameter tells the Makefile of attestation-agent
to build the attestation-agent with ttrpc support
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
This commit brings ttrpc of image-rs. It will use the
lightweight underlying ttrpc to interact between kata-agent
and attestation-agent.
Also, this PR brings a patch for `oci-distribution`,
because two dependencies of `image-rs` depends on different
versions of `oci-distribution`, which will cause that
`image-rs` can not be built. We need a specified version of
`oci-distribution` to unify.
Fixes#6219
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
This is to add a new element `qemu-se` to the shims for a new runtime
class `kata-qemu-se`.
Fixes: #6549
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
SNP needs two builds of ovmf: the AmdSev build and the normal x86_64 build.
Adds target for vanilla ovmf build for snp
Adding another make target / kata-deploy function, and fixing the ovmf builder so these builds dont overlap.
Fixes: #5849
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
This is a preliminary work to establish an e2e test for a new runtime
class kata-qemu-se (IBM secure execution).
Fixes: #6544
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
The kata monitor metrics API returns a huge size response,
if containers or sandboxs are a large number,
focus on what we need will be harder.
Fixes: #6500
Signed-off-by: Miao Xia <xia.miao1@zte.com.cn>
- nydus: upgrad to v2.2.0
- osbuilder: Add support for CBL-Mariner
- kata-deploy: Fix bash semantics error
- make only_kata work without -f
- runtime-rs: ch: Implement confidential guest handling
- qemu/arm64: disable image nvdimm once no firmware offered
- static checks workflow improvements
- A couple of kata-deploy fixes
- agent: Bring in VFIO-AP device handling again
- bugfix: set hostname in CreateSandboxRequest
- packaging / kata-deploy builds: Add the ability to cache and consume cached components
- versions: Update firecracker version
- dependency: update cgroups-rs
- Built-in Sandbox: add more unit tests for dragonball. Part 6
- runtime: add support for Hyper-V
- runtime-rs: update load_config comment
- Add support for ephemeral mounts to occupy entire sandbox's memory
- runtime-rs: fix default kernel location and add more default config paths
- Implement direct-volume commands handler for shim-mgmt
- bugfix: modify tty_win info in runtime when handling ResizePtyRequest
- bugfix: add get_ns_path API for Hypervisor
- runtime-rs: add the missing default trait
- packaging: Simplify get_last_modification()
- utils: Make kata-manager.sh runs checks
- dragonball: support pmu on aarch64
- docs: fix typo in key filename in AWS installation guide
- backport rustjail systemd cgroup fix#6331 to 3.1
- main | kata-deploy: Fix kata deploy arm64 image build error
- workflows: Yet more fixes for publishing the kata-deploy payload after every PR merged
- rustjail: fix cgroup handling in agent-init mode
- runtime/Makefile: Fix install-containerd-shim-v2 dependency
- fix wrong notes for func GetSandboxesStoragePathRust()
- fix(runtime-rs): add exited state to ensure cleanup
- runtime-rs: add oci hook support
- utils: Remove kata-manager.sh cgroups v2 check
- workflows: Fixes for the `payload-after-push` action
- Dragonball: update dependencies
- workflows: Do not install docker
- workflows: Publish kata-deploy payload after a merge
- src: Fixed typo mod.rs
- actions: Use `git-diff` to get changes in kernel dir
- agent: don't set permission of existing directory in copy_file
- runtime: use filepath.Clean() to clean the mount path
- Upgrade to Cloud Hypervisor v30.0
- feat(runtime): make static resource management consistent with 2.0
- osbuilder: Include minimal set of device nodes in ubuntu initrd
- kata-ctl/exec: add new command exec to enter guest VM.
- kernel: Add CONFIG_SEV_GUEST to SEV kernel config
- runtime-rs: Improve Cloud Hypervisor config handling
- virtiofsd: update to a valid path on ppc64le
- runtime-rs: cleanup kata host share path
- osbuilder: fix default build target in makefile
- devguide: Add link to the contribution guidelines
- kata-deploy: Ensure go binaries can run on Ubuntu 20.04
- dragonball: config_manager: preserve device when update
- Revert "workflows: Push the builder image to quay.io"
- Remove all remaining unsafe impl
- kata-deploy: Fix building the kata static firecracker arm64 package occurred an error
- shim-v2: Bump Ubuntu container image to 22.04
- packaging: Cache the container used to build the kata-deploy artefacts
- utils: always check some dependencies.
- versions: Use ubuntu as the default distro for the rootfs-image
- github-action: Replace deprecated command with environment file
- docs: Change the order of release step
- runtime-rs: remove unnecessary Send/Sync trait implement
- runtime-rs: Don't build on Power, don't break on Power.
- runtime-rs: handle sys_dir bind volume
- sandbox: set the dns for the sandbox
- packaging/shim-v2: Only change the config if the file exists
- runtime-rs: Add basic CH implementation
- release: Revert kata-deploy changes after 3.1.0-rc0 release
8b008fc743 kata-deploy: fix bash semantics error
74ec38cf02 osbuilder: Add support for CBL-Mariner
ac58588682 runtime-rs: ch: Generate Cloud Hypervisor config for confidential guests
96555186b3 runtime-rs: ch: Honour debug setting
e3c2d727ba runtime-rs: ch: clippy fix
ece5edc641 qemu/arm64: disable image nvdimm if no firmware offered
dd23f452ab utils: renamed only_kata to skip_containerd
59c81ed2bb utils: informed pre-check about only_kata
4f0887ce42 kata-deploy: fix install failing to chmod runtime-rs/bin/*
09c4828ac3 workflows: add missing artifacts on payload-after-push
fbf891fdff packaging: Adapt `get_last_modification()`
82a04dbce1 local-build: Use cached VirtioFS when possible
3b99004897 local-build: Use cached shim v2 when possible
1b8c5474da local-build: Use cached RootFS when possible
09ce4ab893 local-build: Use cached QEMU when possible
1e1c843b8b local-build: Use cached Nydus when possible
64832ab65b local-build: Use cached Kernel when possible
04fb52f6c9 local-build: Use cached Firecracker when possible
8a40f6f234 local-build: Use cached Cloud Hypervisor when possible
194d5dc8a6 tools: Add support for caching VirtioFS artefacts
a34272cf20 tools: Add support for caching shim v2 artefacts
7898db5f79 tools: Add support for caching RootFS artefacts
e90891059b tools: Add support for caching QEMU artefacts
7aed8f8c80 tools: Add support for caching Nydus artefacts
cb4cbe2958 tools: Add support for caching Kernel artefacts
762f9f4c3e tools: Add support for caching Firecracker artefacts
6b1b424fc7 tools: Add support for caching Cloud Hypervisor artefacts
08fe49f708 versions: Adjust kernel names to match kata-deploy build targets
99505c0f4f versions: Update firecracker version
f4938c0d90 bugfix: set hostname
96baa83895 agent: Bring in VFIO-AP device handling again
f666f8e2df agent: Add VFIO-AP device handling
b546eca26f runtime: Generalize VFIO devices
4c527d00c7 agent: Rename VFIO handling to VFIO PCI handling
db89c88f4f agent: Use cfg-if for s390x CCW
68a586e52c agent: Use a constant for CCW root bus path
a8b55bf874 dependency: update cgroups-rs
97cdba97ea runtime-rs: update load_config comment
974a5c22f0 runtime: add support for Hyper-V
40f4eef535 build: Use the correct kernel name
a6c67a161e runtime: add support for ephemeral mounts to occupy entire sandbox memory
844bf053b2 runtime-rs: add the missing default trait
e7bca62c32 bugfix: modify tty_win info in runtime when handling ResizePtyRequest
30e235f0a1 runtime-rs: impl volume-resize trait for sandbox
e029988bc2 bugfix: add get_ns_path API for Hypervisor
42b8867148 runtime-rs: impl volume-stats trait for sandbox
462d4a1af2 workflows: static-checks: Free disk space before running checks
e68186d9af workflows: static-checks: Set GOPATH only once
439ff9d4c4 tools/osbuilder/tests: Remove TRAVIS variable
43ce3f7588 packaging: Simplify get_last_modification()
33c5c49719 packaging: Move repo_root_dir to lib.sh
16e2c3cc55 agent: implement update_ephemeral_mounts api
3896c7a22b protocol: add updateEphemeralMounts proto
23488312f5 agent: always use cgroupfs when running as init
8546387348 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
736aae47a4 rustjail: print type of cgroup manager
dbae281924 workflows: Properly set the kata-tarball architecture
76b4591e2b tools: Adjust the build-and-upload-payload.sh script
cd2aaeda2a kata-deploy: Switch to using an ubuntu image
2d43e13102 docs: fix typo in AWS installation guide
760f78137d dragonball: support pmu on aarch64
9bc7bef3d6 kata-deploy: Fix path to the Dockerfile
78ba363f8e kata-deploy: Use different images for s390x and aarch64
6267909501 kata-deploy: Allow passing BASE_IMAGE_{NAME,TAG}
3443f558a6 nydus: upgrad nydus to v2.2.0
395645e1ce runtime: hybrid-mode cause error in the latest nydusd
f8e44172f6 utils: Make kata-manager.sh runs checks
f31c79d210 workflows: static-checks: Remove TRAVIS_XXX variables
8030e469b2 fix(runtime-rs): add exited state to ensure cleanup
7d292d7fc3 workflows: Fix the path of imported workflows
e07162e79d workflows: Fix action name
dd2713521e Dragonball: update dependencies
bd1ed26c8d workflows: Publish kata-deploy payload after a merge
fea7e8816f runtime-rs: Fixed typo mod.rs
a9e2fc8678 runtime/Makefile: Fix install-containerd-shim-v2 dependency
b6880c60d3 logging: Correct the code notes
12cfad4858 runtime-rs: modify the transfer to oci::Hooks
828d467222 workflows: Do not install docker
4b8a5a1a3d utils: Remove kata-manager.sh cgroups v2 check
2c4428ee02 runtime-rs: move pre-start hooks to sandbox_start
e80c9f7b74 runtime-rs: add StartContainer hook
977f281c5c runtime-rs: add CreateContainer hook support
875f2db528 runtime-rs: add oci hook support
ecac3a9e10 docs: add design doc for Hooks
3ac6f29e95 runtime: clh: Re-generate the client code
262daaa2ef versions: Upgrade to Cloud Hypervisor v30.0
192df84588 agent: always use cgroupfs when running as init
b0691806f1 agent: determine value of use_systemd_cgroup before LinuxContainer::new()
dc86d6dac3 runtime: use filepath.Clean() to clean the mount path
c4ef5fd325 agent: don't set permission of existing directory
3483272bbd runtime-rs: ch: Enable initrd usage
fbee6c820e runtime-rs: Improve Cloud Hypervisor config handling
1bff1ca30a kernel: Add CONFIG_SEV_GUEST to SEV kernel config Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel. Incrementing kernel config version to reflect changes
ad8968c8d9 rustjail: print type of cgroup manager
b4a1527aa6 kata-deploy: Fix static shim-v2 build on arm64
2c4f8077fd Revert "shim-v2: Bump Ubuntu container image to 22.04"
afaccf924d Revert "workflows: Push the builder image to quay.io"
4c39c4ef9f devguide: Add link to the contribution guidelines
76e926453a osbuilder: Include minimal set of device nodes in ubuntu initrd
697ec8e578 kata-deploy: Fix kata static firecracker arm64 package build error
ced3c99895 dragonball: config_manager: preserve device when update
da8a6417aa runtime-rs: remove all remaining unsafe impl
0301194851 dragonball: use crossbeam_channel in VmmService instead of mpsc::channel
9d78bf9086 shim-v2: Bump Ubuntu container image to 22.04
3cfce5a709 utils: improved unsupported distro message.
919d19f415 feat(runtime): make static resource management consistent with 2.0
b835c40bbd workflows: Push the builder image to quay.io
781ed2986a packaging: Allow passing a container builder to the scripts
45668fae15 packaging: Use existing image to build td-shim
e8c6bfbdeb packaging: Use existing image to build td-shim
3fa24f7acc packaging: Add infra to push the OVMF builder image
f076fa4c77 packaging: Use existing image to build OVMF
c7f515172d packaging: Add infra to push the QEMU builder image
fb7b86b8e0 packaging: Use existing image to build QEMU
d0181bb262 packaging: Add infra to push the virtiofsd builder image
7c93428a18 packaging: Use existing image to build virtiofsd
8c227e2471 virtiofsd: Pass the expected toolchain to the build container
7ee00d8e57 packaging: Add infra to push the shim-v2 builder image
24767d82aa packaging: Use existing image to build the shim-v2
e84af6a620 virtiofsd: update to a valid path on ppc64le
6c3c771a52 packaging: Add infra to push the kernel builder image
b9b23112bf packaging: Use existing image to build the kernel
869827d77f packaging: Add push_to_registry()
e69a6f5749 packaging: Add get_last_modification()
6c05e5c67a packaging: Add and export BUILDER_REGISTRY
1047840cf8 utils: always check some dependencies.
95e3364493 runtime-rs: remove unnecessary Send/Sync trait implement
a96ba99239 actions: Use `git-diff` to get changes in kernel dir
619ef54452 docs: Change the order of release step
a161d11920 versions: Use ubuntu as the default distro for the rootfs-image
be40683bc5 runtime-rs: Add a generic powerpc64le-options.mk
47c058599a packaging/shim-v2: Install the target depending on the arch/libc
b582c0db86 kata-ctl/exec: add new command exec to enter guest VM.
07802a19dc runtime-rs: handle sys_dir bind volume
04e930073c sandbox: set the dns for the sandbox
32ebe1895b agent: fix the issue of creating the dns file
44aaec9020 github-action: Replace deprecated command with environment file
a68c5004f8 packaging/shim-v2: Only change the config if the file exists
ee76b398b3 release: Revert kata-deploy changes after 3.1.0-rc0 release
bbc733d6c8 docs: runtime-rs: Add CH status details
37b594c0d2 runtime-rs: Add basic CH implementation
545151829d kata-types: Add Cloud Hypervisor (CH) definitions
2dd2421ad0 runtime-rs: cleanup kata host share path
0a21ad78b1 osbuilder: fix default build target in makefile
9a01d4e446 dragonball: add more unit test for virtio-blk device.
Signed-off-by: Greg Kurz <groug@kaod.org>
Our CI and release process are currently taking advantage of the
kata-deploy local build scripts to build the artefacts.
Having snap doing the same is the next logical step, and it will also
help to reduce, by a lot, the CI time as we only build the components
that a PR is touching (otherwise we just pull the cached component).
Fixes: #6514
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
DCAP has upgraded to 1.16, which is not compatible with the host OS used
as part of our CI (2022ww44). Let's ensure DCAP 1.15 is used instead.
Fixes: #6529
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add osbuilder support to build a rootfs and image
based on the CBL-Mariner Linux distro
Fixes: #6462
Signed-off-by: Dallas Delaney <dadelan@microsoft.com>
This change provides a preliminary implementation for the Cloud Hypervisor (CH) feature ([currently
disabled](https://github.com/kata-containers/kata-containers/pull/6201))
to allow it to generate the CH configuration for handling confidential guests.
This change also introduces concrete errors using the `thiserror` crate
(see `src/runtime-rs/crates/hypervisor/ch-config/src/errors.rs`) and a
lot of unit tests for the conversion code that generates the CH
configuration from the generic Hypervisor configuration.
Fixes: #6430.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Enable Cloud Hypervisor debug based on the specified configuration
rather than hard-coding debug to be disabled.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Update the build to use the attestation-agent makefile to build it, so
we can pick up the enhancements there
Fixes: #6253
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We've been seeing the 'sudo make test' job occasionally run out of space in
/tmp, which is part of the root filesystem. Removing dotnet and
`AGENT_TOOLSDIRECTORY` frees around 10GB of space and in my tests the job still
has 13GB of space left after running.
Fixes: #6401
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
{{ runner.workspace }}/kata-containers and {{ github.workspace }} resolve to
the same value, but they're being used multiple times in the workflow. Remove
multiple definitions and define the GOPATH var at job level once.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The last remaining user of the TRAVIS variable in this repo is
tools/osbuilder/tests and it is only used to skip spinning up VMs. Travis
didn't support virtualization and the same is true for github actions hosted
runners. Replace the variable with KVM_MISSING and determine availability of
/dev/kvm at runtime.
TRAVIS is also used by '.ci/setup.sh' in kata-containers/tests to reduce the
set of dependencies that gets installed, but this is also in the process of
being removed.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
These variables are unused since we don't use travis CI. This also allows to
remove two steps:
- 'Setup GOPATH' only printed variables
- 'Setup travis reference' modified some shell local variables that don't have
any influence on the rest of the steps
The TRAVIS var is still used by tools/osbuilder/tests to determine if
virtualization is available.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
For now, image nvdimm on qemu/arm64 depends on UEFI/ACPI, so if there
is no firmware offered, it should be disabled.
Fixes: #6468
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2b41dbe broke all the cached jobs as it changed the permission of the
cache_components.sh file from 755 to 644.
Fixes: #6497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
passed the only_kata variable through to pre_check, only_kata does not
abort the install when containerd is already installed.
fixes#6385
Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
The kata-deploy install method tried to `chmod +x /opt/kata/runtime-rs/bin/*` but it isn't
always true that /opt/kata/runtime-rs/bin/ exists. For example, the
s390x payload does not build the kernel-dragonball-experimental
artifacts. So let's ensure the dir exist before issuing the command.
Fixes#6494
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The kata-deploy-ci payloads for amd64 and arm64 were missing the shim-v2
and kernel-dragonball-experimental artifacts.
Fixes#6493
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
`make kata-tarball` needs a lot of disk space and github action runners don't
have that much of it. Remove unneeded tools from the runner, which frees
another ~10GB of space.
Fixes: #6490
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The various tarballs are unpacked into a temporary directory, and then that
directory is compressed into kata-static.tar.xz. After we have the tarball,
there is no reason to keep the temporary directory. Dispose of it as the last
step.
Fixes: #6490
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The function is returning "" when called from the script used to cache
the artefacts and one difference noted between this version and the
already working one from the CCv0 is that we make sure to `pushd
${repo_root_dir}` in the CCv0 version.
Let's give it a try here and see if it solves the issue.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add support for caching VirtioFS artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching shim v2 artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching RootFS artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching QEMU artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching Nydus artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching Kernel artefacts that are generated using
the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching Firecracker artefacts that are generated
using the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's add support for caching Cloud Hypervisor artefacts that are
generated using the kata-deploy local-build scripts.
Right now those are not used, but we'll switch to using them very soon
as part of upcoming changes of how we build the components we test in
our CI.
Fixes: #6480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's adjust the kernel names in versions.yaml so those can match the
names used as part of the kata-deploy local build scripts.
Right now this doesn't bring any benefit nor drawback, but it'll make
our life easier later on in this same series.
Depends-on: github.com/kata-containers/tests#5534
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR updates the firecracker version being used in kata containers
versions.yaml
The changes in version 1.3.1 are
Added
Introduced T2CL (Intel) and T2A (AMD) CPU templates to provide
instruction set feature parity between Intel and AMD CPUs when using
these templates.
Added Graviton3 support (c7g instance type).
Changed
Improved error message when invalid network backend provided.
Improved TCP throughput by between 5% and 15% (depending on CPU) by using
scatter-gather I/O in the net device's TX path.
Upgraded Rust toolchain from 1.64.0 to 1.66.0.
Made seccompiler output bit-reproducible.
Fixed
Fixed feature flags in T2 CPU template on Intel Ice Lake.
Fixes#6482
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We need to ensure `kata_config_version` is taken into account when:
* consuming a cached kernel, otherwise we may introduce changes to a
kernel that will never be validated as part of the PR
* caching the kernel, otherwise we won't update the artefacts if just a
config is changed
Fixes: #6485
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR is a continuing work for (kata-containers#3679).
This generalizes the previous VFIO device handling which only
focuses on PCI to include AP (IBM Z specific).
Fixes: kata-containers#3678
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Initial VFIO-AP support (#578) was simple, but somewhat hacky; a
different code path would be chosen for performing the hotplug, and
agent-side device handling was bound to knowing the assigned queue
numbers (APQNs) through some other means; plus the code for awaiting
them was written for the Go agent and never released. This code also
artificially increased the hotplug timeout to wait for the (relatively
expensive, thus limited to 5 seconds at the quickest) AP rescan, which
is impractical for e.g. common k8s timeouts.
Since then, the general handling logic was improved (#1190), but it
assumed PCI in several places.
In the runtime, introduce and parse AP devices. Annotate them as such
when passing to the agent, and include information about the associated
APQNs.
The agent awaits the passed APQNs through uevents and triggers a
rescan directly.
Fixes: #3678
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Generalize VFIO devices to allow for adding AP in the next patch.
The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed.
The rationale for the revomal is:
- VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI
- Logic of checking a subtype of mediated device is included in GetVFIODeviceType()
- VFIOPciDeviceMediatedType() can simply fulfill the device addition based
on a type categorized by GetVFIODeviceType()
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
e.g., split_vfio_option is PCI-specific and should instead be named
split_vfio_pci_option. This mutually affects the runtime, most notably
how the labels are named for the agent.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Since shimv2 create task option is already implemented, we need to update the
corresponding comments.
Also, the ordering is also updated to fit with the code.
fixes: #3961
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
If cid is empty, we will use image name as default cid, to
support multiple containers with same image, we need append
unique id to the image name.
Fixes: #6456
Signed-off-by: Wang, Arron <arron.wang@intel.com>
This adds /dev/mshv to the list of sandbox devices so that VMMs can
create Hyper-V VMs.
In our testing, this also doesn't error out in case /dev/mshv isn't
present.
Fixes#6454.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Reverting the container image names to pick up the lib.sh methods.
Signed-off-by: Georgina Kinge georgina.kinge@ibm.com
Co-authored-by: Steve Horsman <steven@uk.ibm.com>
When calling `MAKE_KERNEL_NAME` we're considering the default kernel
name will be `vmlinux.container` or `vmlinuz.container`, which is not
the case as the runtime-rs, when used with dragonball, relies on the
`vmlinu[zx]-dragonball-experimental.container` kernel.
Other hypervisors will have to introduce a similar
`MAKE_KERNEL_NAME_${HYPERVISOR}` to adapt this to the kernel they want
to use, similarly to what's already done for the go runtime.
By doing this we also ensure that no changes in the configuration file
will be required to run runtime-rs, with dragonball, as part of our CI
or as part of kata-deploy.
Fixes: #6290
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On hotplug of memory as containers are started, remount all ephemeral mounts with size option set to the total sandbox memory
Fixes: #6417
Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
This will help us in several ways:
* The first one is not using an image that's close to be EOLed, and
which doesn't officially provide multi-arch images.
* The second is getting closer to what's been already done on main.
* The third is simplifying the logic to build the payload image.
Fixes: #6446
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Some structs in the runtime-rs don't implement Default trait.
This commit adds the missing Default.
Fixes: #5463
Signed-off-by: Li Hongyu <lihongyu1999@bupt.edu.cn>
Currently, we only create the new exec process in runtime, this will cause error
when the following requests needing to be handled:
- Task: exec process
- Task: resize process pty
- ...
The agent do not do_exec_process when we handle ExecProcess, thus we can not find
any process information in the guest when we handle ResizeProcessPty. This will
report an error.
In this commit, the handling process is modified to the:
* Modify process tty_win information in runtime
* If the exec process is not running, we just return. And the truly pty_resize will
happen when start_process
Fixes: #6248
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Implements resize-volume handlers in shim-mgmt,
trait for sandbox and add RPC calls to agent.
Note the actual rpc handler for the resize request is currently not
implemented, refer to issue #3694.
Fixes#5369
Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
For external hypervisors(qemu, cloud-hypervisor, ...), the ns they launch vm in
is different from internal hypervisor(dragonball). And when we doing CreateContainer
hook, we will rely on the netns path. So we add a get_ns_path API.
Fixes: #6442
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Implements get-volume-stats trait for sandbox,
handler for shim-mgmt and add RPC calls to
agent. Also added type conversions in trans.rs
Fixes#5369
Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
We've been seeing the 'sudo make test' job occasionally run out of space in
/tmp, which is part of the root filesystem. Removing dotnet and
`AGENT_TOOLSDIRECTORY` frees around 10GB of space and in my tests the job still
has 13GB of space left after running.
Fixes: #6401
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
{{ runner.workspace }}/kata-containers and {{ github.workspace }} resolve to
the same value, but they're being used multiple times in the workflow. Remove
multiple definitions and define the GOPATH var at job level once.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The last remaining user of the TRAVIS variable in this repo is
tools/osbuilder/tests and it is only used to skip spinning up VMs. Travis
didn't support virtualization and the same is true for github actions hosted
runners. Replace the variable with KVM_MISSING and determine availability of
/dev/kvm at runtime.
TRAVIS is also used by '.ci/setup.sh' in kata-containers/tests to reduce the
set of dependencies that gets installed, but this is also in the process of
being removed.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This is to prepare a secure image tarball to run a confidential
container for IBM Z SE(TEE).
Fixes: #6206
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
There's no need to pass repo_root_dir to get_last_modification() as the
variable used everywhere is exported from that very same file.
Fixes: #6431
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is used in several parts of the code, and can have a single
declaration as part of the `lib.sh` file, which is already imported by
all the places where it's used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- implement update_ephemeral_mounts rpc
- for each mountpoint passed in, remount it with new options
Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
- adds a new rpc call to the agent service named `updateEphemeralMounts`
- this call takes a list of grpc.Storage objects
Signed-off-by: Sidhartha Mani <sidhartha_mani@apple.com>
The logic to decide which cgroup driver is used is currently based on the
cgroup path that the host provides. This requires host and guest to use the
same cgroup driver. If the guest uses kata-agent as init, then systemd can't be
used as the cgroup driver. If the host requests a systemd cgroup, this
currently results in a rustjail panic:
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2)
Caused by:
No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51
stack backtrace:
0: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
3: 0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
4: 0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
5: 0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
6: 0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
7: 0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
8: 0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
9: 0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
10: 0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
11: 0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
12: 0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
13: 0x7ff0fe77d3c1 - rust_begin_unwind
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
14: 0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
15: 0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
16: 0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca
17: 0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f
18: 0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f
19: 0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64
20: 0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h846a8c9e4fb67707
21: 0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972
22: 0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4
23: 0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f
24: 0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a
25: 0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de
26: 0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd
27: 0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833
28: 0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1
29: 0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd
30: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
31: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
32: 0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17
Forward the value of "init_mode" to AgentService, so that we can force cgroupfs
when systemd is unavailable.
Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Right now LinuxContainer::new() gets passed a CreateOpts struct, but then
modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path
parsing logic into do_create_container, so that CreateOpts can be immutable in
LinuxContainer::new. This is just moving things around, there should be no
functional changes.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Since the cgroup manager is wrapped in a dyn now, the print in
LinuxContainer::new has been useless and just says "CgroupManager". Extend the
Debug trait for 'dyn Manager' to print the type of the cgroup manager so that
it's easier to debug issues.
Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Let's make sure the kata-tarball architecture upload / downloaded / used
is exactly the same one that we need as part of the architecture we're
using to generate the image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Now that we've switched the base container image to using Ubuntu instead
of CentOS, we don't need any kind of extra logic to correctly build the
image for different architectures, as Ubuntu is a multi-arch image that
supports all the architectures we're targetting.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we use a multi-arch image for building kata-deploy.
A few changes were also added in order to get systemd working inside the
kata-deploy image, due to the switch from CentOS to Ubuntu.
Fixes: #6358
Signed-off-by: SinghWang <wangxin_0611@126.com>
This commit adds support for pmu virtualization on aarch64. The
initialization of pmu is in the following order:
1. Receive pmu parameter(vpmu_feature) from runtime-rs to determine the
VpmuFeatureLevel.
2. Judge whether to initialize pmu devices and add pmu device node into
fdt on aarch64, according to VpmuFeatureLevel.
Fixes: #6168
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
As part of bd1ed26c8d, we've pointed to
the Dockerfile that's used in the CC branch, which is wrong.
For what we're doing on main, we should be pointing to the one under the
`kata-deploy` folder, and not the one under the non-existent
`kata-deploy-cc` one.
Fixes: #6343
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the image provided as part of registry.centos.org is not a multi-arch
one, at least not for CentOS 7, we need to expand the script used to
build the image to pass images that are known to work for s390x (ClefOS)
and aarch64 (CentOS, but coming from dockerhub).
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's break the IMAGE build parameter into BASE_IMAGE_NAME and
BASE_IMAGE_TAG, as it makes it easier to replace the default CentOS
image by something else.
Spoiler alert, the default CentOS image is **not** multi-arch, and we do
want to support at least aarch64 and s390x in the near term future.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When update the nydusd to 2.2, the argument "--hybrid-mode" cause
the following error:
thread 'main' panicked at 'ArgAction::SetTrue / ArgAction::SetFalse is defaulted'
Maybe we should remove it to upgrad nydusd
Fixes: #6407
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Updated the `kata-manager.sh` script to make it run all the checks on
the host system before attempting to create a container. If any checks
fail, they will indicate to the user what the problem is in a clearer
manner than those reported by the container manager.
Fixes: #6281.
Signed-off-by: tg5788re <jfokugas@gmail.com>
These variables are unused since we don't use travis CI. This also allows to
remove two steps:
- 'Setup GOPATH' only printed variables
- 'Setup travis reference' modified some shell local variables that don't have
any influence on the rest of the steps
The TRAVIS var is still used by tools/osbuilder/tests to determine if
virtualization is available.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Set process status to exited at end of io wait, which indicate process
exited only, but stop process has not been finished. Otherwise, the
cleanup_container will be skipped.
Fixes: #6393
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
In `payload-after-push.yaml` we ended up mentioning cc-*.yaml workflows,
which are non existent in the main branch.
Let's adapt the name to the correct ones.
Fixes: #6343
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We have a few actions in the `payload-after-push.*.yaml` that are
referring to Confidential Containers, but they should be referring to
Kata Containers instead.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Since rust-vmm and dragonball-sandbox has introduced several updates
such as vPMU support for aarch64, we also need to update Dragonball
dependencies to include those changes.
Update:
virtio-queue to v0.6.0
kvm-ioctls to v0.12.0
dbs-upcall to v0.2.0
dbs-virtio-devices to v0.2.0
kvm-bindings to v0.6.0
Also, several aarch64 features are updated because of dependencies
changes:
1. update vcpu hotplug API.
2. update vpmu related API.
3. adjust unit test cases for aarch64 Dragonball.
fixes: #6268
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
For the architectures we know that `make kata-tarball` works as
expected, let's start publishing the kata-deploy payload after each
merge.
This will help to:
* Easily test the content of current `main` or `stable-*` branch
* Easily bisect issues
* Start providing some sort of CI/CD content pipeline for those who
need that
This is a forward-port work from the `CCv0` and groups together patches
that I've worked on, with the work that Choi did in order to support
different architectures.
Fixes: #6343
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
$ make install
make: *** No rule to make target 'containerd-shim-kata-v2', needed by 'install-containerd-shim-v2'. Stop.
Spotted when building kata-runtime with a different name for
SHIMV2_OUTPUT. For instance, trying to keep different runtime binaries
installed at the same time, one from master and another from lets say,
the CCv0 branch, with the following small change applied.
diff --git a/src/runtime/Makefile b/src/runtime/Makefile
index 95efaff78..2bab9eb75 100644
--- a/src/runtime/Makefile
+++ b/src/runtime/Makefile
@@ -231,7 +231,7 @@ SED = sed
CLI_DIR = cmd
SHIMV2 = containerd-shim-kata-v2
-SHIMV2_OUTPUT = $(bCURDIR)/$(SHIMV2)
+SHIMV2_OUTPUT = $(CURDIR)/$(SHIMV2)-ccv0
SHIMV2_DIR = $(CLI_DIR)/$(SHIMV2)
MONITOR = kata-monitor
Fixes: #6398
Signed-off-by: Eduardo Lima (Etrunko) <etrunko@redhat.com>
/opt/confidential-containers/runtime-rs needs to be cleaned up, otherwise
containerd post-uninstall script fails due to weird logic in `rmdir
--ignore-fail-on-non-empty`.
Fixes: #6396
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
In this commit, we have done:
* modify the tranfer process from grpc::Hooks to oci::Hooks, so the code
can be more clean
* add more tests for create_runtime, create_container, start_container hooks
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
The latest ubuntu runners already have docker installed and trying to
install it manually will cause the following issue:
```
Run curl -fsSL https://test.docker.com/ -o test-docker.sh
Warning: the "docker" command appears to already exist on this system.
If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.
If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.
You may press Ctrl+C now to abort this script.
+ sleep 20
+ sudo -E sh -c apt-get update -qq >/dev/null
E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy Release' is no longer signed.
```
Fixes: #6390
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The latest ubuntu runners already have docker installed and trying to
install it manually will cause the following issue:
```
Run curl -fsSL https://test.docker.com/ -o test-docker.sh
Warning: the "docker" command appears to already exist on this system.
If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.
If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.
You may press Ctrl+C now to abort this script.
+ sleep 20
+ sudo -E sh -c apt-get update -qq >/dev/null
E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy Release' is no longer signed.
```
Fixes: #6390
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Removed the part in the `kata-manager.sh` script that checks if the host system only runs cgroups v2.
Fixes: #6259.
Signed-off-by: Alec Pemberton <pembek1901@gmail.com>
In online-kbs attestation the guest is given the location of the
keybroker server to connect after launch. This patch appends the
IP:Port of the online-kbs to the kernel params of the guest.
Patch also simplifies the kbs config into "mode" = offline/online,
and updates SEV config variable names and default values
Fixes: #5661#5715
Signed-off-by: Jim Cadden <jcadden@ibm.com>
In some cases, network endpoints will be configured through Prestart
Hook. So network endpoints may need to be added(hotpluged) after vm
is started and also Prestart Hook is executed.
We move pre-start hook functions' execution to sandbox_start to allow
hooks running between vm_start and netns_scan easily, so that the
lifecycle API can be cleaner.
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
StartContainer will be execute in guest container namespace in Kata.
The Hook Path of this kind of hook is also in guest container namespace.
StartContainer is executed after start operation is called, and it
should be executed before user-specific command is executed.
Fixes: #5787
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
CreateContainer hook is one kind of OCI hook. In kata, it will be
executed after VM is started, before container is created, and after
CreateRuntime is executed.
The hook path of CreateContainer hook is in host runtime namespace, but
it will be executed in host vmm namespace.
Fixes: #5787
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
According to the runtime OCI Spec, there can be some hook
operations in the lifecycle of the container. In these hook
operations, the runtime can execute some commands. There are different
points in time in the container lifecycle and different hook types
can be executed.
In this commit, we are now supporting 4 types of hooks(same in
runtime-go): Prestart hook, CreateRuntime hook, Poststart hook and
Poststop hook.
Fixes: #5787
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
This patch re-generates the client code for Cloud Hypervisor v30.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.
Fixes: #6375
Signed-off-by: Bo Chen <chen.bo@intel.com>
The logic to decide which cgroup driver is used is currently based on the
cgroup path that the host provides. This requires host and guest to use the
same cgroup driver. If the guest uses kata-agent as init, then systemd can't be
used as the cgroup driver. If the host requests a systemd cgroup, this
currently results in a rustjail panic:
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: I/O error: No such file or directory (os error 2)
Caused by:
No such file or directory (os error 2)', rustjail/src/cgroups/systemd/manager.rs:44:51
stack backtrace:
0: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::libunwind::trace::h8c197fa9a679d134
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
1: 0x7ff0fe77a793 - std::backtrace_rs::backtrace::trace_unsynchronized::h9ee19d58b6d5934a
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x7ff0fe77a793 - std::sys_common::backtrace::_print_fmt::h4badc450600fc417
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
3: 0x7ff0fe77a793 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::had334ddb529a2169
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
4: 0x7ff0fdce815e - core::fmt::write::h1aa7694f03e44db2
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
5: 0x7ff0fe74e0c4 - std::io::Write::write_fmt::h61b2bdc565be41b5
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
6: 0x7ff0fe77cd3f - std::sys_common::backtrace::_print::h4ec69798b72ff254
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
7: 0x7ff0fe77cd3f - std::sys_common::backtrace::print::h0e6c02048dec3c77
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
8: 0x7ff0fe77c93f - std::panicking::default_hook::{{closure}}::hcdb7e705dc37ea6e
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
9: 0x7ff0fe77d9b8 - std::panicking::default_hook::he03a933a0f01790f
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
10: 0x7ff0fe77d9b8 - std::panicking::rust_panic_with_hook::he26b680bfd953008
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
11: 0x7ff0fe77d482 - std::panicking::begin_panic_handler::{{closure}}::h559120d2dd1c6180
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
12: 0x7ff0fe77d3ec - std::sys_common::backtrace::__rust_end_short_backtrace::h36db621fc93b005a
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
13: 0x7ff0fe77d3c1 - rust_begin_unwind
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
14: 0x7ff0fda52ee2 - core::panicking::panic_fmt::he7679b415d25c5f4
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
15: 0x7ff0fda53182 - core::result::unwrap_failed::hb71caff146724b6b
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
16: 0x7ff0fe5bd738 - <rustjail::cgroups::systemd::manager::Manager as rustjail::cgroups::Manager>::apply::hd46958d9d807d2ca
17: 0x7ff0fe606d80 - <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}::h1de806d91fcb878f
18: 0x7ff0fe604a76 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1749c148adcc235f
19: 0x7ff0fdc0c992 - kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}::hc1b87a15dfdf2f64
20: 0x7ff0fdb80ae4 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h846a8c9e4fb67707
21: 0x7ff0fe3bb816 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h53de16ff66ed3972
22: 0x7ff0fdb519cb - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h1cbece980286c0f4
23: 0x7ff0fdf4019c - <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll::hc8e72d155feb8d1f
24: 0x7ff0fdfa5fd8 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h0a407ffe2559449a
25: 0x7ff0fdf033a1 - tokio::runtime::task::raw::poll::h1045d9f1db9742de
26: 0x7ff0fe7a8ce2 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h4924ae3464af7fbd
27: 0x7ff0fe7afb85 - tokio::runtime::task::raw::poll::h5c843be39646b833
28: 0x7ff0fe7a05ee - std::sys_common::backtrace::__rust_begin_short_backtrace::ha7777c55b98a9bd1
29: 0x7ff0fe7a9bdb - core::ops::function::FnOnce::call_once{{vtable.shim}}::h27ec83c953360cdd
30: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hed812350c5aef7a8
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
31: 0x7ff0fe7801d5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc7df8e435a658960
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
32: 0x7ff0fe7801d5 - std::sys::unix::thread::Thread::new::thread_start::h575491a8a17dbb33
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17
Forward the value of "init_mode" to AgentService, so that we can force cgroupfs
when systemd is unavailable.
Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Right now LinuxContainer::new() gets passed a CreateOpts struct, but then
modifies the use_systemd_cgroup field inside that struct. Pull the cgroups path
parsing logic into do_create_container, so that CreateOpts can be immutable in
LinuxContainer::new. This is just moving things around, there should be no
functional changes.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Fix path check bypassed issuse introduced by #6082,
use filepath.Clean() to clean path before check
Fixes: #6082
Signed-off-by: XDTG <click1799@163.com>
This patch fixes the issue that do_copy_file changes
the directory permission of the parent directory of
a target file, even when the parent directory already
exists.
Fixes#6367
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
This is the latest td-shim commit and the latest known working
attestation-agent commit.
Fixes: #6366
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Features were renamed, so switch both arches to the katacc* feature.
Testing showed that "signature-simple" feature in image-rs is needed on
x86_64, so add that too. This image-rs commit does not include the
latest ocicrypt-rs and attestation-agent code itself.
Fixes: #6366
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This change enables to run cloud-hypervisor VMM using a non-root user
when rootless flag is set true in the configuration
Fixes: #2567
Signed-off-by: Feng Wang <fwang@confluent.io>
Renaming the output binary from the AmdSevPkg from OVMF.fd to AMDSEV.fd so it does not conflict with the base x86_64 build.
Changing install name in ovmf static builder and the location in the sev config file.
Fixes: #6337
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Allow an initrd/initramfs image to be used with Cloud Hypervisor, which
is handled differently to the default rootfs image type.
Fixes: #6335.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Replace `cloud_hypervisor_vm_create_cfg()` with a set of `TryFrom` trait
implementations in the new CH specific `convert.rs` to allow the generic
`Hypervisor` configuration to be converted into the CH specific
`VmConfig` type.
Note that device configuration is not currently handled in `convert.rs`
(it's handled in `inner_device.rs`).
This change removes the old hard-coded CH specific configuration.
Fixes: #6203.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Adding kernel config to sev case since it is needed for SNP and SNP will use the SEV kernel.
Incrementing kernel config version to reflect changes
Fixes: #6123
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Since the cgroup manager is wrapped in a dyn now, the print in
LinuxContainer::new has been useless and just says "CgroupManager". Extend the
Debug trait for 'dyn Manager' to print the type of the cgroup manager so that
it's easier to debug issues.
Fixes: #5779
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Following Jong Wu suggestion, let's link /usr/bin/musl-gcc to
/usr/bin/aarch64-linux-musl-gcc.
Fixes: #6320
Signed-off-by: SinghWang <wangxin_0611@126.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 9d78bf9086.
Golang binaries are built statically by default, unless linking against
CGO, which we do. In this case we dynamically link against glibc,
causing us troubles when running a binary built with Ubuntu 22.04 on
Ubuntu 20.04 (which will still be supported for the next few years ...)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
New developers are often confused by some of our requirements, notably porting
labels. While our CONTRIBUTING.md file points to the solution, the developer's
guide does not. Add a link there.
Fixes: #6329
Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
When starting an initrd the kernel expects to find /dev/console in the initrd,
so that it can connect it as stdin/stdout/stderr to the /init process. If the
device node is missing the kernel will complain that it was unable to open an
initial console. If kata-agent is the initrd init process, it will also result
in log messages not being logged to console and thus not forwarded to host
syslog.
Add a set of standard device nodes for completeness, so that console logging
works. To do that we install the makedev packge which provides a MAKEDEV helper
that knows the major/minor numbers. Unfortunately the debian package tries to
create devnodes from postinst, which can be suppressed if systemd-detect-virt
is present. That's why we create a small dummy script that matches what
systemd-detect-virt would output (anything is enough to suppress mknod).
Fixes: #6261
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
DeviceConfigInfo contains config and device, so when we want to do
update we could simply update config part of the info, and device would
not be changed during update.
Fixes: #6324
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Because crossbeam_channel has more features and better performance than
mpsc::channel and finally rust replace its channel implementation with
crossbeam_channel on version 1.67
Signed-off-by: Tim Zhang <tim@hyper.sh>
Let's bump the base container image to use the 22.04 version of Ubuntu,
as it does bring up-to-date package dependencies that we need to
statically build the runtime-rs on aarch64.
Fixes: #6320
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
previously, if installing on unkown distro, script would tell user that
their distro was unsupported. Changed error message prompting user to
install dependecies manually, then retry.
Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
Let's push the builder images to a registry, so we can take advantage of
those on each step of our building process.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This, combined with the effort of caching builder images *and* only
performing the build itself inside the builder images, is the very first
step for reproducible builds for the project.
Reproducible builds are quite important when we talk about Confidential
Containers, as users may want to verify the content used / provided by
the CSPs, and this is the first step towards that direction.
Fixes: #5517
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for building and pushing the OVMF builder
image to the Kata Containers' quay.io registry.
Fixes: #5477
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of buildinf our
own, to be used as a builder image for OVMF.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the QEMU
builder image to the Kata Containers' quay.io registry.
Fixes: #5481
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existsing image, instead of building our
own, to be used as a builder image for QEMU.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the virtiofsd
builder image to the Kata Containers' quay.io registry.
Fixes: #5480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the virtiofsd.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're building virtiofsd with a specific toolchain that's
known to not cause any issues, instead of always using the latest one.
On each bump of the virtiofsd, we'll make sure to adjust this according
to what's been used by the virtiofsd community.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the shim-v2
builder image to the Kata Containers' quay.io registry.
Fixes: #5478
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's try to pull a pre-existing image, instead of building our own, to
be used as a builder for the shim-v2.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently the symbolic link for virtiofsd which is used as
a valid path is not updated on every CI run. Fix it by
using the actual path of installation.
Fixes: #6311
Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
Let's add the needed infra for only building and pushing the kernel
builder image to the Kata Containers' quay.io registry.
Fixes: #5476
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kernel.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will push a specific tag to a registry, whenever the
PUSH_TO_REGISTRY environment variable is set, otherwise it's a no-op.
This will be used in the future to avoid replicating that logic in every
builder used by the kata-deploy scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a function to get the hash of the last commit modifying a
specific file.
This will help to avoid writing `git rev-list ...` into every single
build script used by the kata-deploy.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
BUILD_REGISTRY, which points to quay.io/kata-containers/builder, will be
used for storing the builder images used to build the artefacts via the
kata-deploy scripts.
The plan is to tag, whenever it's possible and makes sense, images like:
* ${BUILDER_REGISTRY}:${component}-${unique_identifier}
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Every dependency in check_deps is used inside the script (apart from
git, which may be a historical artifact), and therefore should be
checked even when the -f option is passed to the script. Simply changed
at what point check_deps is called in order to always run it.
Fixes#6302.
Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>
Send and Sync are automatically derived traits,
if a type is composed entirely of Send or Sync types, then it is Send or Sync.
Almost all primitives are Send and Sync,
so we don't need to implement them manually most of the time.
Fixes: #6307
Signed-off-by: Tim Zhang <tim@hyper.sh>
Use `git-diff` instead of legacy `git-whatchanged` to get
differences in the packaging/kernel directory. This also fixes
a bug by grepping for the kernel directory in the output of the
git command.
Fixes: #6210
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
When a new stable branch is created, it is necessary to change the
references in the tests repo from main to the new stable branch.
However this step needs to be performed after the repos have been tagged
as the `tags_repos.sh` script is the one that creates the new branch.
Clarify this in the documentation and move the step to change branch
references in test repo after repos have been tagged.
Fixes: #1824
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Currently ubuntu is already the default distro for all the architectures
but x86_64, which uses clearlinux. However, our CI does *not* test the
clearlinux image we ship.
Taking a look at our CI code [0], we've been using ubuntu as base for
the tests for a few years already, if not forever.
The minimum we can do is to switch to distributing ubuntu, as the tested
rootfs-image, and then decide later on whether we should switch back to
clearlinux (once we switch our CI to using that, and make sure all tests
will be green), or if we move to slimmer distro, such as alpine.
[0]: 0a39dd1a01/.ci/install_kata_image.sh (L44)Fixes: #6303
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There's a check in the runtime-rs Makefile that basically checks whether
the `arch/$arch-options.mk` exists or not and, if it doesn't, the build
is just aborted.
With this in mind, let's create a generic powerpc64le-options.mk file
and not bail when building for this architecture.
Fixes: #6142
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In the `install_go_rust.sh` file we're adding a
x86_64-unknown-linux-musl target unconditionally. That should be,
instead, based in the ARCH of the host and the appropriate LIBC to be
used with that host.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The patchset will help users to easily enter guest VM by debug
console sock.
In order to enter guest VM smoothly, users needs to do some
configuration, options as below:
(1) Set debug_console_enabled = true with default vport 1026.
(2) Or add agent.debug_console agent.debug_console_vport=<PORT>
into kernel_params, and the vport is <PORT> you set.
The detail of usage:
$ kata-ctl exec -h
kata-ctl-exec
Enter into guest VM by debug console
USAGE:
kata-ctl exec [OPTIONS] <SANDBOX_ID>
ARGS:
<SANDBOX_ID> pod sandbox ID
Fixes: #5340
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
For some cases, users will mount system directories as bind volume.
We should not bind mount these kind of directories in the host as it does
not make sense.
Fixes: #6299
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
The rust agent had supported to set the guest dns
server in start sandbox request, thus add the dns
in the runtime side.
Fixes:#6286
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
We should make sure the dns's source file's parent
directory exist, otherwise, it would failed to create
the file directly.
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
In workflow, `set-output` command is deprecated and will be disabled soon.
This commit replaces the deprecated `set-output` command with putting a
value in the environment file `$GITHUB_OUTPUT`.
Fixes#6266
Signed-off-by: jongwooo <jongwooo.han@gmail.com>
Let's not try to sed a file that doesn't exist, which may be the case
depending on the architecture we're building the shim-v2 for.
This is a partial-forward port of
f24c47ea47.
Fixes: #6293
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As 3.1.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Add a few details about the current state of the Cloud Hypervisor (CH)
runtime-rs external hypervisor implementation with pointers to the
appropriate issues.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add a basic runtime-rs `Hypervisor` trait implementation for Cloud
Hypervisor (CH).
> **Notes:**
>
> - This only supports a default Kata configuration for CH currently.
>
> - Since this feature is still under development, `cargo` features have
> been added to enable the feature optionally. The default is to not enable
> currently since the code is not ready for general use.
>
> To enable the feature for testing and development, enable the
> `cloud-hypervisor` feature in the `virt_container` crate and enable the
> `cloud-hypervisor` feature for its `hypervisor` dependency.
Fixes: #5242.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
In order to disable or enable some features when running tdx vms, we
need to add is_tdx_enabled() function to identify whether the VM
confidiential type is TDX.
fixes: #6276
Signed-off-by: fengshifang <fengshifang@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
It is just to place a missing stage for permission adjustment in the
cc-payload-after-push-s390x workflow.
Fixes#6278
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
A Makefile target `install` should be included in the conditional branch
as default and test.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
(cherry picked from commit 4139d68d51)
Nothing much to add, let's just make the message more clear.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c071355359)
As done for s390x, let's just skip the runtime-rs build for Power.
Fixes: #6142
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4e2db96ef7)
The .dracut_rootfs.done file is accidentally being picked up as the default
target, regardless of BUILD_METHOD. Move the 'all' target definition up, so
that it's the default (=first) target in the makefile. Additionally make the
.dracut_rootfs.done target conditional on the right BUILD_METHOD being
selected, as building it doesn't make sense with BUILD_METHOD=distro.
Fixes: #6235
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This is to make a docker version to v20.10 in docker upstream image ubuntu:20.04 for s390x and ppc64le.
Fixes: #6211
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Cherry-picked: f49b89b
A known bug in qemu 7.2.0 causes a problem handling the kernel hashes argument and causes SEV container launching to fail.
Fixes: #6189
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
Now that TDX work will start coming for runtime-rs, let's also take it
into consideration when caching the shim-v2 tarball.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Removed the qemu paramter 'policy' (and also dh-cert-file, session-file, kernel-hashes=on)
for SNP container.
Fixes: #5795
Signed-off-by: Niteesh Dubey <niteesh@linux.ibm.com>
- Remove umoci entry from versions
- Update the usage of skopeo to control the tooling we use to build
the pause image
Fixes: #
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Now we don't need to have skopeo and umoci in the rootfs
remove the code that optionally builds and installs them
Fixes: #3970
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Remove the container_policy_file config parameter as it was only used
by the skopeo code path
Fixes: #3970
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
These changed will be consumed by SEV firmware caching job in the CI. This will help in reducing the CI runtime.
Fixes: #6119
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
The attestation-agent had its v0.3.0 release earlier Today, following
the Confidential Containers v0.3.0 release process.
Let's bump it on our side, as we've already tested the version that
became this release.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
image-rs has released its v0.3.0 release earlier Today, following the
v0.3.0 Confidential Containers release process.
The v0.3.0 is based on exactly the same commit we've been using already,
so no changes are expected for us.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TD-Shim has released its v0.3.0 release earlier Today, following the
Confidential Containers v0.3.0 release.
Let's update it here. We need to also bump the toolchain to using the
nightly-2022-11-15
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With `disable_netns=true`, we should never scan the sandbox netns which
is the host netns in such case.
Fixes: #6021
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Cherry-picked: 12fd6ff
As the toolchain is installed in the image itself, we *must* take the
toolchain into consideration when deciding whether to use a cached image
or building a new one.
Fixes: #6033
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Bug fix for #5651. Faulty bash syntax let a initrd build complete, but not copy the kernel module.
This change fixes the if logic to work as an 'or' as intended.
Fixes: #6024
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
When `HOST_ARCH` != `ARCH` unset `CC`
Specifying a foreign CC is incompatible with building libgit2. Thus after the RUSTFLAGS linker
has been set we can safely unset CC to avoid passing this value through the build.
Fixes: #5890
Signed-off-by: James Tumber <james.tumber@ibm.com>
Cherry-picked: 087515a
Adds AA_KBC option in rootfs builder to specify online_sev_kbc into the initrd.
Guid and secret type for sev updated in shim makefile to generate default config
KBC URI will be specified via kernel_params
Also changing the default option for sev in the local build scipts
Making sure sev guest kernel module is copied into the initrd. Will also eventually be needed for SNP
Fixes: #5650
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
The `sev_guest_policy` configuration field distinguishes between SEV and
SEV-ES guests (according to standard AMD SEV policy values).
Modify the kata runtime to detect SEV-ES guests and calculate calculate
the expected launch digest taking into account the number of VCPUs and
their CPU signature (model/family/stepping).
Fixes: #5471
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Switching sev build of ovmf to the cc fork until patches are upstreamed.
Adding build for dependencies
Fixes: #5892
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
the switch to cases lets AA_KBC to be parsed correctly.
There will be an addition to the offline_sev_kbc case to do the same for online_sev_kbc
There will also be an addition for SNP
Fixes: #5909
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
This is to enable quay.io/confidential-containers/runtime-payload for
s390x on top of amd64.
Fixes: #5894
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As done for different components, let's also use a cached version of the
shim-v2 whenever it's possible.
Fixes: #5838
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In order to cache the shim-v2 we're considering the the cached component
can be used if:
* There were no changes in the runtime directory
* There were no changes in the golang version used
* There were no changes in the rust version used
* We don't build the rust agent, but better be prepared for the future
* There were no changes in the following files that are provided by the
rootfs builds:
* root_hash_vanilla.txt
* root_hash_tdx.txt
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As done for different components, let's also use a cached version of
the rootfs whenever it's possible.
Fixes: #5433
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
e1f075dc60 reworked the action so the
shim-v2 was split out of the matrix build. With that done I ended up
not realising I'd need to log into the quay.io as one step of the
build-asset-cc-shim-v2 job.
Fixes: #5885
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is the most complex part to cache, as the cached component can be
only used if:
* There were no changes in the agent
* There were no changes in the libs (used by the agent)
* There were no changes in the rootfs build scripts
* There is no change in the version of the following components:
* attestation-agent (part of the rootfs)
* gperf (used to build libseccomp)
* libseccomp (used to build the agent)
* pause image (part of the rootfs)
* skopeo (part of the rootfs)
* umoci (part of the rootfs)
* rust (used to build the kata-containers and attestation agents)
We're relying on the last commit merged on places related to the rootfs
generation and using that as the rootfs version and that should be good
enough for what we need.
Apart from everything already mentioned, we've also added the ability to
cache the `root_hash_vanilla.txt` and `root_hash_tdx.txt` files, as
those are needed for when building the shim-v2, in order to have
measured boot working there.
It's important to note that we've added the ability to cache *both*
files, and I've taken that path as the shim-v2 cache work (which will
come soon) relies on both files.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help us, in the future, to debug any possible issue related to
the measured rootfs arguments passed to the shim during the build time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The ability to do a measured boot has been overlooked when releasing the
payload consumed by the Confidential Containers project, and this
happened as we depend, at the shim-v2 build time, of a `root_hash_*.txt`
generated in the `tools/osbuilder/` directory, which is then used to add
a specific parameter to the `kernel_params` in the Kata Containers
configuration files.
With everything said above, the best way we can ensure this is done is
by saving those files during the rootfs build, download them during the
shim-v2 build (which *must* happen only after the rootfs builds happen),
and correctly use them there.
Fixes: #5847
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As Cloud Hypervisor and QEMU are using different rootfs images (the
former with `offline_fs_kbc` as aa_kbc, and the latter with `eaa_kbc`),
we need to differentiate the kernel parameters passed to each one of
those, as the `root_hash.txt` file used for measured boot will differ
according to the rootfs used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
By doing this we can ensure that when building different rootfs-images
we won't end up overring the `root_hash.txt` file.
Plus, this will help us later in this series to pass the correct
argument to be used with the respective image.
Nothing's been done for SEV as it uses a initrd instead of an image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Adding the `root_hash.txt` to the final tarball doesn't bring any
benefit to the project, as the file dependency is for building the
shim-v2 and passing the correct measurement for the kernel command line.
It's important to mention that when building shim-v2, it doesn't look
for the file in `/opt/confidential-containers/share/kata-containers`,
bur rather in the `${repo_root_dir}/tools/osbuilder/`, as shown here:
ac3683e26e/tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh (L228-L232)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It turns out that there's more work needed to be done on the Cloud
Hypervisor side so we can fully support EAA_KBC with it.
For now, let's remove the configuration as the tests are not currently
passing when using it, and stick to the `offline_fs_kbc` and its
specific image for the Cloud Hypervisor + TDX case.
Fixes: #5862
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `qemu-tdx` configuration is tied to using `offline_fs_kbc` as the
aa_kbc, which is something we're moving away from.
With this in mind, let's rename the `qemu-tdx-eaa-kbc` to `qemu-tdx` and
decrease the amount of the way too many configurations that we ship.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is to differentiate an artifact name between amd64 and s390x and add a
virtiofsd target for s390x.
Fixes: #5851
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As already done for install_cc_kernel(), let's ensure we export
KATA_BUILD_CC=yes as part of the install_cc_tee_kernel.
This is used to generate the hash of the devices in the initramfs.
Fixes: #5845
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's move the info about building initramfs to *after* trying to
install the cached components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR allow us to use the virtiofsd cache tarball instead of
building it from source.
Fixes#5356
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Exclude the image-rs cosign feature when the build target
is the s390x architecture.
Change Cargo to use workspace resolver 2 so that conditional
include for the image-rs crate is resolved correctly for different
targets.
Update cargo lock.
Fixes: #5582
Signed-off-by: Matthew Arnold <mattarno@uk.ibm.com>
It seems that the Kata Containers jenkins may be very slow to reach from
behind the firewall, causing TDX machine to fail downloading some of the
cached artefacts.
With this in mind, let's switch to using wget for this specific case.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's avoid getting into a dir and risking not being able to leave that
dir in case something fails.
Instead, let's just stay in the current dir and move the final tarball
to the exoected directory in case all the checks go as expected.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With the cached version we're concatenating the td-shim version with the
toolchain version used to build the project.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a documentation about the environment variables that can be
used with the `-k` and `-q` options.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `qemu_script_dir` is a leftover from before the rework on how we
cache the components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're already doing for some components, let's also add support for
caching firmwares. TD-Shim and TDVF are the ones supported for now.
Fixes: #5360, #5361
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should use {} instead of () when passing the component version to the
install_cached_component() function.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
asserts -> assets
stastic -> static
Those were not caught during the first merge of the series as we didn't
have CI jobs testing for the TEE artefacts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently building Cloud Hypervusor with thE TDX feature
regardless of using with TDX or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The name of the asset was wrong, "cloud-hypervisor" instead of
"hypervisor.cloud_hypervsior", generating an empty "latest" file.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The final cloud hypervisor tarball name is either
kata-static-cc-cloud-hypervisor.tar.xz or
kata-static-cc-tdx-cloud-hypervisor.tar.xz, meaning it uses
"cloud-hypervisor" instead of "clh" in the name.
Fixes: #5816
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's check for the cached version of the components as part of the
kata-deploy-binaries.sh as here we already have the needed info for
checking whether a component is cached or not, and to use it without
depending on changes made on each one of the builder scripts.
Fixes: #5816
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of caching files generated during the component build, let's
cache the final tarball generated for each component.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's do this as the component name will be re-used later on, when we
start checking whether a cached component needs to be rebuilt or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is used in several parts of the code, and can have a single
declaration as part of the `lib.sh` file, which is already imported by
all the places where it's used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're going to use this function from different places, so we better
move it to lib.sh and avoid rewriting it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If you're directly using the output of this function, the info message
will show up as part of the string, and that's not what we want.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Change the if statement to check if the CWD is set to /
Add unit tests for the correct merging of working directory
in the container and image process
Note: there is an outstanding question about one test case
Format code
Fixes: #5721
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
This includes contructing VMSA pages, parsing OVMF footer table to fetch
the AP reset EIP address, and allowing different vcpu types.
Fixes: #5471
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Now that we're caching the kernel, we're relying on the kernel version
being exported. This is already done for the CC kernel, but not for the
TEE specific ones.
Fixes: #5770
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Loop through the images enviroment variables, checking if it exists
inside the target. If it does then do not append it.
Add unit tests for correctly merging the env variables of the pod yaml
and image itself in the container and image process
Format code
Fixes: #5730
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
As we bumped containerd dependency to v1.6.8, let's also do the
re-vendor of its code on the runtime side.
Fixes: #5745
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If the attestation-agent is used then enable image_client_auth
to enable the attempt to get registry credentials for the pull
Fixes: #5652
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The incorrect name causes `make cc-payload` to fail, as
`cc-tdx-rootfs-tarball` is a non existent target.
Fixes: #5628
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is just to keep the support for s390x without the cosign
verification while looking for a solution for #5582.
Fixes: #5599
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
image-rs tagged its v0.2.0 release, let's bump it here as we're about to
release the payload for the v0.2.0 Confidential Containers release.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's bump the td-shim to its `v0.2.0` release.
Together with the bump, let's also adapt its build scripts so we're able
to build the `v0.2.0` as part of our infra.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The attestation-agent v0.2.0 has been released, let's bump it here and
ensure we use the new release as part of what will become the payload
for the Confidential Containers v0.2.0 release.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It appears that _either_ the GitHub workflow runners have changed their
environment, or the Ubuntu archive has changed package dependencies,
resulting in the following error when building the snap:
```
Installing build dependencies: bc bison build-essential cpio curl docker.io ...
:
The following packages have unmet dependencies:
docker.io : Depends: containerd (>= 1.2.6-0ubuntu1~)
E: Unable to correct problems, you have held broken packages.
```
This PR uses the simplest solution: install the `containerd` and `runc`
packages. However, we might want to investigate alternative solutions in
the future given that the docker and containerd packages seem to have
gone wild in the Ubuntu GitHub workflow runner environment. If you
include the official docker repo (which the snap uses), a _subset_ of
the related packages is now:
- `containerd`
- `containerd.io`
- `docker-ce`
- `docker.io`
- `moby-containerd`
- `moby-engine`
- `moby-runc`
- `runc`
Fixes: #5545.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
(cherry picked from commit 990e6359b7)
Rather than hard-coding the package manager into the docker part,
use the `build-packages` section to specify the parts package
dependencies in a distro agnostic manner.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
(cherry picked from commit ca69a9ad6d)
Although introducing an awful amount of code duplication, let's
parallelise the static checks in order to reduce its time and the space
used in the VMs running those.
While I understand there may be ways to make the whole setup less
repetitive and error prone, I'm taking the approach of:
* Make it work
* Make it right
* Make it fast
So, it's clear that I'm only attempting to make it work, and I'd
appreciate community help in order to improve the situation here. But,
for now, this is a stopgap solution.
JFYI, the time needed for run the tests on the `main` branch went down
from ~110 minutes to ~60 minutes. Plus, we're not running those on a
single VM anymore, which decreases the change to hit the space limit.
Reference: https://github.com/kata-containers/kata-containers/actions/runs/3393468605/jobs/5640842041
Ideally, each one of the following tests should be also split into
smaller tests, each test for one component, for instance.
* static-checks
* compiler-checks
* unit-tests
* unit-tests-as-root
Fixes: #5585
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 40d514aa2c)
This commit adds the `kernel-hashes=on` flag to the QEMU command line
for all SEV guests (previously, this was only enabled for SEV guests
with `guest_pre_attestation=on`. This change allows the AmdSev firmware
to be used for both encrypted and non-encrypted container images.
**Note:** This change makes the AmdSev OVMF build a requirement for all
SEV guests. The standard host OVMF package will no longer work.
Fixes#5307.
Signed-off-by: Jim Cadden <jcadden@ibm.com>
Let's ensure we add the option for the user, at build time, to set the
AGENT_AA_KBC_PARAMS passed to the agent, via the kernel command line.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're switching TDX to using EAA KBC instead of OfflineFS KBC, let's
add the configuration files needed for testing this before we fully
switch TDX to using such an image.
Fixes: #5563
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The specific TDX image relies on having EAA KBC, instead of using the
default `offline_fs_kbc`.
This image is, with this commit, built and distributed, but not yet used
by TDX specific configurations, which will be done in a follow-up
commit.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of removing the non-needed packages under `/usr/share` and then
installing new components, let's make sure we do the removal at the end
of our script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's do that instead of updating and installing the
`software-properties-common` package, as it reduces the final size of
the image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
First of all, EAA KBC is only used with TDX, thus we can safely assume
that eaa_kbc means TDX, at least for now.
A `/etc/tdx-attest.conf` file, with the data "port=4050" is needed as
that's the default configuration for the Quote Generation Service (QGS)
which is present on the guest side.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will become very handy by the moment we start building different
images targetting different TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Replace hard-coded aa_kbc_param check to set the image_client's
security_validate, with reading the setting from the agent config
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Launching a pod with measured boot enabled seems to be taking longer
than expected with Cloud Hypervisor, which leads to hitting a timeout
limit.
Let's double those timeout limits for now.
Fixes: #5576
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add agent.enable_signature_verification=false to the kernel_params
default config to get backwards compatibility in config.
Note the the agent config will default this setting to true for security
reasons if it's unset
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add a new agent config parameter enable_signature_verification which
defaults to true for security reasons
- Add unit tests to check parsing and defaults
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Allows parameters in the agent config file to be overwritten
by the kernel commandline. Does not change trust model since
the commandline is measured.
Makes sure to set endpoints_allowed correctly.
Fixes: #5173
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This PR implements the use of a cached cc qemu tarball to speed up
the CI and avoid building the cc qemu tarball when it is not
necessary.
Fixes#5363
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
As we discussed in #5178, user need set aa_kbc_params config without
modify kata guest image, since kernel params is also measured in TEE
boot flow, we make aa_kbc_params can be parsed through kernel cmdline.
Fixes: #5178
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Inclavare released a rats-tls-tdx package, which we depend on for using
verdictd.
Let's install it when using EAA_KBC, as already done for the rats-tls
package.
One thin to note here is that rats-tls-tdx depends on libtdx-attest,
which depends on libprotobuf-c1, thus we had to add the intel-sgx repo
together with enabling the universe channel.
Fixes: #5543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently using Ubuntu 20.04 as the base for the Ubuntu rootfs,
meaning that right now there's no issue with the approach currently
taken. However, if we do a bump of an Ubuntu version, we could face
issues as the rats-tls package is only provided for Ubuntu 20.04.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
To support cosign signature verification.
Fix build warning in signal.rs:
error: unused `tokio::sync::MutexGuard` that must be used
--> src/signal.rs:27:9
|
27 | rustjail::container::WAIT_PID_LOCKER.lock().await;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D unused-must-use` implied by `-D warnings`
= note: if unused the Mutex will immediately unlock
Fixes: #5541
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Add tag entry to the attestation agent entry of the versions file.
Checkout tag commit after cloning AA in rootfs builder.
Fixes: #5373
Fixes: kata-containers#5373
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
1. Add unit tests for pkg/sev
2. Allow CalculateLaunchDigest to calculate launch digest without direct
booted kernel (and, therefore, without initrd and kernel cmdline).
This mode is currently not used in kata.
Fixes: #5456
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
This, combined with the effort of caching builder images *and* only
performing the build itself inside the builder images, is the very first
step for reproducible builds for the project.
Reproducible builds are quite important when we talk about Confidential
Containers, as users may want to verify the content used / provided by
the CSPs, and this is the first step towards that direction.
Fixes: #5517
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to do that in order to avoid trying to use the image in an
architecture which is not yet supported (such as trying to use the x6_64
image on a s390x machine)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help to not have to build those on every CI run, and rather
take advantage of the cached image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit c1aac0cdea.
The reason this has to be reverted is because we cannot cache an image
that has a specific user, uid, gid, docker_host_id, and expect that to
work equally on different machines. Unfortunately, this is one of the
images that cannot be cached at all.
This reverts commit fe8b246ae4.
The reason this has to be reverted is because we cannot cache an image
that has a specific user, uid, gid, docker_host_id, and expect that to
work equally on different machines. Unfortunately, this is one of the
images that cannot be cached at all.
The ${file} path is an absolute path, as /home/fidencio/..., while the
result of the `git status --porcelain` is a path relative to the
${repo_root_dir}. Because of this, the logic to adding `-dirty` to the
image name would never work.
Let's fix this by removing the ${repo_root_dir} from the ${file} when
grepping for it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #5135
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Let's take advantge of an existing action that publishes the payload
after each pull request, to also publish the "builder images" used to
build each one of the artefacts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the initramfs
builder image to the Kata Containers' quay.io registry.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder for the initramds.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Now that the QEMU builder image provides only the environment used for
building QEMU, let's ensure it doesn't get removed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the QEMU
builder image to the Kata Containers' quay.io registry.
Fixes: #5481
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existsing image, instead of building our
own, to be used as a builder image for QEMU.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Differently than every single other bit that's part of our repo, QEMU
has been using a single Dockerfile that prepares an environment where
the project can be built, but *also* building the project as part of
that very same Dockerfile.
This is a problem, for several different reasons, including:
* It's very hard to have a reproducible build if you don't have an
archived image of the builder
* One cannot cache / ipload the image of the builder, as that contains
already a specific version of QEMU
* Every single CI run we end up building the builder image, which
includes building dependencies (such as liburing)
Let's split the logic into a new build script, and pass the build script
to be executed inside the builder image, which will be only responsible
for providing an environment where QEMU can be built.
Fixes: #5464
Backports: #5465
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the virtiofsd
builder image to the Kata Containers' quay.io registry.
Fixes: #5480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the virtiofsd.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're building virtiofsd with a specific toolchain that's
known to not cause any issues, instead of always using the latest one.
On each bump of the virtiofsd, we'll make sure to adjust this according
to what's been used by the virtiofsd community.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the td-shim
builder image to the Kata Containers' quay.io registry.
Fixes: #5479
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the shim-v2
builder image to the Kata Containers' quay.io registry.
Fixes: #5478
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's try to pull a pre-existing image, instead of building our own, to
be used as a builder for the shim-v2.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for building and pushing the OVMF builder
image to the Kata Containers' quay.io registry.
Fixes: #5477
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of buildinf our
own, to be used as a builder image for OVMF.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the kernel
builder image to the Kata Containers' quay.io registry.
Fixes: #5476
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kernel.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the image used
to build the kata-deploy artefacts to the Kata Containers' quay.io
registry.
Fixes: #5475
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kata-deploy artefacts.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will push a specific tag to a registry, whenever the
PUSH_TO_REGISTRY environment variable is set, otherwise it's a no-op.
This will be used in the future to avoid replicating that logic in every
builder used by the kata-deploy scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a function to get the hash of the last commit modifying a
specific file.
This will help to avoid writing `git rev-list ...` into every single
build script used by the kata-deploy.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
CC_BUILD_REGISTRY, which points to quay.io/kata-containers/cc-builder,
will be used for storing the builder images used to build the artefacts
via the kata-deploy scripts.
The plan is to tag, whenever it's possible and makes sense, images like:
* ${CC_BUILDER_REGISTRY}:kernel-${sha}
* ${CC_BUILDER_REGISTRY}:qemu-${sha}
* ${CC_BUILDER_REGISTRY}:ovmf-${sha}
* ${CC_BUILDER_REGISTRY}:shim-v2-${go-toolchain}-{rust-toolchain}-${sha}
* ${CC_BUILDER_REGISTRY}:td-shim-${toolchain}-${sha}
* ${CC_BUILDER_REGISTRY}:virtiofsd-${toolchain}-${sha}
Where ${sha} is the sha of the last commit modifying the Dockerfile used
by the builder.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #5135
Signed-off-by: Wang, Arron <arron.wang@intel.com>
As the CCv0 effort is not using the runtime-rs, let's add a mechanism to
avoid building it.
The easiest way to do so, is to simply *not* build the runtime-rs if the
RUST_VERSION is not provided, and then not providing the RUST_VERSION as
part of the cc-shim-v2-tarball target.
Fixes: #5462
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There was a typo in the registry name, which should be
quay.io/confidential-containers/runtime-payload-ci instead of
quay.io/repository/confidential-containers/runtime-payload-ci
Fixes: #5469
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create a GitHub action to automate the Kata Containers payload
generation for the Confidential Containers project.
This GitHub action builds the artefacts (in parallel), merges them into
a single tarball, generates the payload with the resulting tarball, and
uploads the payload to the Confidential Containers quay.io.
It expects the tags to be used to be in the `CC-x.y.z` format, with x,
y, and z being numbers.
Fixes: #5330
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's have a GitHub action to publish the Kata Containers payload, after
every push to the CCv0 branch, to the Confidential Containers
`runtime-payload-ci` registry.
The intention of this action is to allow developers to test new
features, and easily bisect breakages that could've happened during the
development process. Ideally we'd have a CI/CD pipeline where every
single change would be tested with the operator, but we're not yet
there. In any case, this work would still be needed. :-)
It's very important to mention that this should be carefully considered
on whether it should or should not be merged back to `main`, as the flow
of PRs there is way higher than what we currently have as part of the
CCv0 branch.
Fixes: #5460
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's modify the script so we allow passing an extra tag, which will be
used as part of the Kata Containers pyload for Confidential Containers
CI GitHub action.
With this we can pass a `latest` tag, which will make things easier for
the integration on the operator side.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make the registry an optional argument to be passed to the
`kata-deploy-build-and-upload-payload.sh` script, defaulting to the
official Confidential Containers payload registry.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instad of bailing out whenever the docker group doesn't exist, just
consider podman is being used, and set the docker_gid to the user's gid.
Also, let's ensure to pass `--privileged` to the container, so
`/run/podman/podman.socket` (which is what `/var/run/docker.sock` points
to) can be passed to the container.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to pass the RUST_VERSION, in the same way done for GO_VERSION,
as nowadays both the go and the rust runtime are built.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's build virtiofsd using the kata-deploy build scripts, which
simplifies and unifies the way we build our components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0bc5baafb9)
Let's have the docker installation / configuration as part of its own
task, which can be set as a dependency of other tasks whcih may or may
not depend on docker.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit cb4ef4734f)
When moving to building the CI artefacts using the kata-deploy scripts,
we've noticed that the build would fail on any machine where the tarball
wasn't officially provided.
This happens as rust is missing from the 1st layer container. However,
it's a very common practice to leave the 1st layer container with the
minimum possible dependencies and install whatever is needed for
building a specific component in a 2nd layer container, which virtiofsd
never had.
In this commit we introduce the second layer containers (yes,
comtainers), one for building virtiofsd using musl, and one for building
virtiofsd using glibc. The reason for taking this approach was to
actually simplify the scripts and avoid building the dependencies
(libseccomp, libcap-ng) using musl libc.
Fixes: #5425
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 7e5941c578)
The previously used repo will be removed by Intel, as done with the one
used for TDX kernel. The TDX team has already worked on providing the
patches that were hosted atop of the QEMU commit with the following hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0 as a tarball in the
https://github.com/intel/tdx-tools repo, see
https://github.com/intel/tdx-tools/pull/162.
On the Kata Containers side, in order to simplify the process and to
avoid adding hundreds of patches to our repo, we've revived the
https://github.com/kata-containers/qemu repo, and created a branch and a
tag with those hundreds of patches atop of the QEMU commit hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0. The branch is called
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0-plus-TDX-v3.1 and the tag is
called TDX-v3.1.
Knowing the whole background, let's switch the repo we're getting the
TDX QEMU from.
Fixes: #5419
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 35d52d30fd)
The previously used repo has been removed by Intel. As this happened,
the TDX team worked on providing the patches that were hosted atop of
the v5.15 kernel as a tarball present in the
https://github.com/intel/tdx-tools repos, see
https://github.com/intel/tdx-tools/pull/161.
On the Kata Containers side, in order to simplify the process and to
avoid adding ~1400 kernel patches to our repo, we've revived the
https://github.com/kata-containers/linux repo, and created a branch and
a tag with those ~1400 patches atop of the v5.15. The branch is called
v5.15-plus-TDX, and the tag is called 5.15-plus-TDX (in order to avoid
having to change how the kernel builder script deals with versioning).
Knowing the whole background, let's switch the repo we're getting the
TDX kernel from.
Fixes: #5326
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9eb73d543a)
Prior to this patch, we were missing a call to `verify_cid` when the cid
was derived from the image path, which meant that the host could specify
something like "prefix/..", and we would use ".." as the cid. Paths
derived from this (e.g., `bundle_path`) would not be at the intended
tree.
This patch factors the code out of `pull_image` so that it can be more
easily tested. Tests are added for a number of cases.
Fixes#5421
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This also slightly improves readability by decluttering the function
declaration and call site.
Fixes#5405
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Based on https://gitlab.com/cryptsetup/cryptsetup/-/issues/525
1. When --no-wipe is used, the device will have invalid checksums
2. mkfs.ext4 would fail on an un-wiped device due to reads of pages with
invalid checksums
3. To make mkfs.ext4 work
- Perform a dry run to figure out which sectors (pages) mkfs.ext4 will
write to.
- Perform directe writes to these pages to ensure that they will have
valid checksums
- Invoke mkfs.ext4 again to perform initialization
4 Use lazy_journal_init option with mkfs.ext4 to lazily initialize the journal.
According to the man pages,
"This speeds up file system initialization noticeably, but carries some small
risk if the system crashes before the journal has been overwritten entirely
one time."
Since the storage is ephemeral, not expected to survive a system crash/power cycle,
it is safe to use lazy_journal_init.
Fixes#5329
Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
In order to ensure that the proxy configuration is passed to the 2nd
layer container, let's ensure the $HOME/.docker/config.json file is
exposed inside the 1st layer container.
For some reason which I still don't fully understand exporting
https_proxy / http_proxy / no_proxy was not enough to get those
variables exported to the 2nd layer container.
In this commit we're creating a "$HOME/.docker" directory, and removing
it after the build, in case it doesn't exist yet. The reason we do this
is to avoid docker not running in case "$HOME/.docker" doesn't exist.
This was not tested with podman, but if there's an issue with podman,
the issue was already there beforehand and should be treated as a
different problem than the one addressed in this commit.
Fixes: #5077
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4da743f90b)
before setting a limit, otherwise paths may not be found.
guest supporting different hugepage size is more likely with peer-pods where
podvm may use different flavor.
Fixes: #5191
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Umoci is not longer required if we have the attestation-agent, so don't
override the user input
Fixes: #5237
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
To use depmod in the rootfs builder, the docker environment will require kmod.
Fixes: kata-containers#5125
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Using depmod when adding kernel modules to get dependencies.
Needed for the efi secret module for sev.
Fixes: #5125
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
If we are using the offline_fs_kbc and have created a resource json
then switch security_validate on the image_client to enable
the signature verification feature for image-rs
Fixes: #4581
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Update the agent build to get around the nix & glibc linker problems
by running the libseccomp installation first
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Update the doc and scripts to reflect that skopeo isn't mandatory
for signature verification any longer
- Update the script to default the aa_kbc to offline_fs_kbc
Fixes: #4581
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Adds default config file.
Adds case in rootfs.sh to copy config.
Fixes kata-containers#5023
Fixes: #5023
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
After we have a guest kernel with builtin initramfs which
provide the rootfs measurement capability and Kata rootfs
image with hash device, we need set related root hash value
and measure config to the kernel params in kata configuration file.
Fixes: #5168
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The sev initrd target had been changed to "cc-sev-rootfs-initrd".
This was good discussion as part of #5120.
I failed to rename it from "cc-sev-initrd-image" in kata-deploy-binaries.
The script will fail for a bad build target.
Fixes: #5166
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds the efi_secret kernel module to the sev initrd.
Adds a rootfs flag for kernel module based on the AA_KBC.
Finding the kernel module in the local build based on kernel version and kernel config version.
Moved kernel config version checking function from kernel builder to lib script.
Fixes: #5118
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds a make target, and a function in the kata-deploy-binaries script.
In the spirit of avoiding code duplication, making the cc-initrd function more generic.
Fixes: #5118
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds a new make target for an sev kernel which can be built and put into payload bundles for the operator.
Currently not including this sev kernel target in the cc payload bundle.
Unfortunately having to breakflow from using the generic cc_tee_kernel functions in either the kata-deploy-binaries or build-kernel.
Largely based on using an upstreamed kernel release, meaning the url is the defaul cdn, and e.g. we use version rather than tag.
The upside of this is that we can use the sha sum checking functionality from the generic get_kernel function.
CC label in title removed for commit message check.
Fixes: #5037
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Currently leaving the cc-sev-ovmf-tarball target out of the cc payload.
I was not sure where discussion had landed on the number of payload bundles.
e.g. could be included in a cc bundle along with tdx support or create an SEV bundle.
Fixes: kata-containers#5025
Fixes: #5025
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.
Fixes: #5148
Signed-off-by: Wang, Arron <arron.wang@intel.com>
This patch allows copying of directories and symlinks when
static file copying is used between host and guest. This change is
necessary to support recursive file copying between shim and agent.
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
This patch adds the support of the remote hypervisor type.
Shim opens a Unix domain socket specified in the config file,
and sends TTPRC requests to a external process to control
sandbox VMs.
Fixes#4482
Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
After adding an SEV QEMU config file (#4850), need to configure containerd to select this when appropriate based on a new runtimeclass.
Adds to the configuration of containerd so the correct config is selected.
Fixes: #4851
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Let's remove the whole content from:
* /opt/confidential-containers/libexec
* /opt/confidential-containers/share
And then manually remove the binaries under bin directory` as the
pre-install hook will drop binaries there.
Finally, let's call a `rmdir -p /opt/confidential-containers/bin` which
should take care of the cleanup in case no pre-install hook is used, and
let's make sure we pass `--ignore-fail-on-non-empty` so we don't fail
when using a pre-install hook.
Fixes: #5128
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For Confidential Containers the file is present at
`/opt/confidential-containers` instead of `/opt/kata`.
Fixes: #5119
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Every now and then we've been hitting issues with parallel builds. in
order to not rely on lucky for the first release, let's do a serial
build of the payload image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the documentation on how to generate the Kata Containers
payload, based in the CCv0 branch, that's consumed by the Confidential
Containers Operator.
Fixes: #5041
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `image` target is only used by and only present in the `CCv0`
branch, and it's name is misleading. :-)
Let's rename it (and the scripts used by it) to mention payload rather
than image, and to actually build the cc related tarballs instead of the
"vanilla" Kata Containers tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's adjust the `kata-deploy-build-and-upload-image.sh` to build the
image following the `kata-containers-${commit}` tag pattern, and to push
it to the quay.io/confidential-containers/runtime-payload repo.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's try to remove the /opt/confidential-containers directory. If it's
not empty, let's not bother force removing it, as the pre-install script
also drops files to the very same directory.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently backing up and restoring all the possible shim files,
but the default one ("containerd-shim-kata-v2").
Let's ensure this is also backed up and restored.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of passing a `KATA_CONF_FILE` environament variable, let's rely
on the configured (in the container engine) config path, as both
containerd and CRI-O support it, and we're using this for both of them.
This is a "backport" of f7ccf92dc8, from
the original `kata-deploy.sh` to the one used for Confidential
Containers.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As containerd is the only supported container engine, let's simplify the
script and, at the same time, make it clear that other container engines
are not supported yet.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the QEMU build image based on the version of QEMU used, so
if we happen to have a parallel build we ensure different images are
being used.
Also, let's ensure the image gets remove after the build.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In the commit 54d6d01754 we ended up
removing the BUILD_SUFFIX argument passed to QEMU as it only seemed to
be used to generate the HYPERVISOR_NAME and PKGVERSION, which were added
as arguments to the dockerfile.
However, it turns out BUILD_SUFFIX is used by the `qemu-build-post.sh`
script, so it can rename the QEMU binary accordingly.
Let's just bring it back.
Fixes: #5078
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 373dac2dbb)
Patches failing without the no_patches.txt file for SPR-BKC-QEMU-v2.5.
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
(cherry picked from commit 59e3850bfd)
Dockerfile cannot decipher multiple conditional statements in the main RUN call.
Cannot segregate statements in Dockerfile with '{}' braces without wrapping entire statement in 'bash -c' statement.
Dockerfile does not support setting variables by bash command.
Must set HYPERVISOR_NAME and PKGVERSION from parent script: build-base-qemu.sh
Fixes: #5078
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
(cherry picked from commit 54d6d01754)
4cf502fb20 added the ability to build
TD-Shim, but forgot to have it added as part of the cc-tarball target.
Fixes: #5042
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Added default sev kata config template.
Added required default variables in Makefile.
Fixes#5012Fixes#5008
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
The agent configuration file, which is part of the docs, is used by the
confidential containers CIs and, right now, cannot be run behind a
firewall, which is exactly how the TDX CIs are reunning, as https_proxy
is not set there.
Fixes: #5020
Depends-on: github.com/kata-containers/tests#5080
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the previous commit added a new runtime class to be used with TDX,
let's make sure this gets shipped and configured as part of the
kata-deploy-cc script, which is used by the Confidential Containers
Operator.
This commit also cleans up all the extra artefacts that will be
installed in order to run the CLH TDX workloads.
Fixes: #4833
Depends-on: github.com/kata-containers/tests#5070
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new configuration file for using a cloud hypervisor (and all
the needed artefacts) that are TDX capable.
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* Kernel
* TD-Shim
The reason we don't hack into the current Cloud Hypervisor configuration
file is because we want to ship both configurations, with for the
non-TEE use case and one for the TDX use case.
It's important to note that the Cloud Hypervisor used upstream is
already built with TDX support.
Fixes: #4831
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDX kernel is based on a kernel version which doesn't have the
CONFIG_SPECULATION_MITIGATIONS option.
Having this in the allow list for missing configs avoids a breakage in
the TDX CI.
Fixes: #4998
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With the current TDX kernel used with Kata Containers, `tdx_guest` is
not needed, as TDX_GUEST is now a kernel configuration.
With this in mind, let's just drop the kernel parameter.
Fixes: #4981
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As right now the TDX guest kernel doesn't support "serial" console,
let's switch to using HVC in this case.
Fixes: #4980
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The runtime will crash when trying to resize memory when memory hotplug
is not allowed.
This happens because we cannot simply set the hotplug amount to zero,
leading is to not set memory hotplug at all, and later then trying to
access the value of a nil pointer.
Fixes: #4979
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
While doing tests using `ctr`, I've noticed that I've been hitting those
timeouts more frequently than expected.
Till we find the root cause of the issue (which is *not* in the Kata
Containers), let's increase the timeouts when dealing with a
Confidential Guest.
Fixes: #4978
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When booting the TDX kernel with `tdx_disable_filter`, as it's been done
for QEMU, VirtioFS can work without any issues.
Whether this will be part of the upstream kernel or not is a different
story, but it easily could make it there as Cloud Hypervisor relies on
the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the
DMA API, making these devices compatible with TDX.
See Sebastien Boeuf's explanation of this in the
3c973fa7ce208e7113f69424b7574b83f584885d commit:
"""
By using DMA API, the guest triggers the TDX codepath to share some of
the guest memory, in particular the virtqueues and associated buffers so
that the VMM and vhost-user backends/processes can access this memory.
"""
Fixes: #4977
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `force_tdx_guest` kernel parameter was only needed in the early
development stages of the TDX kernel driver. We can safely drop it with
the kernel version we've been currently using.
Fixes: #4985
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.
Fixes: #4966
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The local-build script should honor the value of SKOPEO exported in the
environment so that it will be able to build the image without skopeo
inside. This remove the hard-coded "SKOPEO=yes".
Fixes#4959
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Initialize the trusted stroage when the device is defined
as "/dev/trusted_store" with shell script as first step.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
After enable data integrity for trusted storage, the initialize
time will take three times more and IO performance will drop more than
30%, the default value will be NOT enabled but add this config to
allow the user to enable if they care more strict security.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
By default the pause image and runtime config will provided
by host side, this may have potential security risks when the
host config a malicious pause image, then we will use the pause
image packaged in the rootfs.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
With the runtime-rs changes the agent services need to be asynchronous,
so attempt to update the image_service to match this
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Instead of setting:
```
firmware = "/path/to/OVMF.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
We should either be setting:
```
firmware = "/path/to/OVMF.fd"
```
Or:
```
firmware = "/path/to/OVMF_CODE.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
I'm taking the approach to setting up the latter, as that's what's been
tested as part of our TDX CI.
Fixes: #4926
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.
Add userspace command line tools for disk encryption support
and ext4 file system utilities.
Fixes: #4761
Signed-off-by: Arron Wang <arron.wang@intel.com>
For CoCo stack, the pause image is managed by host side,
then it may configure a malicious pause image, we need package
a pause image inside the rootfs and don't the pause image from host.
Fixes: #4768
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Pick up a new verison of image-rs as the pinned version depended on a
version of ocicrypt-rs that doesn't build anymore
Fixes: #4867
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
AMD SEV pre-attestation is handled by the runtime before the guest is
launched. Guest VM is started paused and the runtime communicates with a
remote keybroker service (e.g., simple-kbs) to validate the attestation
measurement and to receive launch secret. Upon validation, the launch
secret is injected into guest memory and the VM is started.
Fixes: #4280
Signed-off-by: Jim Cadden <jcadden@ibm.com>
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Let's add the QEMU TDX targets to be generated together with the cc
targets, when calling `make cc-tarball`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the previous commit added a new runtime class to be used with TDX,
let's make sure this gets shipped and configured as part of the
kata-deploy-cc script, which is used by the Confidential Containers
Operator.
This commit also cleans up all the extra artefacts that will be
installed in order to run the QEMU TDX workloads.
Fixes: #4832
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new configuration file for using a QEMU (and all the needed
artefacts) that are TDX capable.
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* QEMU
* Kernel
* TDVF
The reason we don't hack into the current QEMU configuration file is
because we want to ship both configurations, with for the non-TEE use
case and one for the TDX use case.
Fixes: #4830
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building TD-shim, a firmware used with
Cloud Hypervisor to start TDX capable VMs for CC.
Fixes: #4780
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDVF, a firmware used with QEMU
to start TDX capable VMs for CC.
Fixes: #4625
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the td-shim tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.
This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.
Fixes: #4809
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the OVMF tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.
This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.
Fixes: #4808
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Simple-kbs keybroker protocol is used by runtime for SEV(-ES)
pre-attestation. Includes protobuf module.
Fixes: #4280
Signed-off-by: Jim Cadden <jcadden@ibm.com>
We don't need skopeo to get the encrypted container image
scenario working, so remove that instruction from the doc
Fixes: #4587
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We're adding a new target for building a TDX capable Cloud Hypervisor
for CC.
As the current version of Cloud Hypervisor is already built with TDX
support, we just rely on calling the same `install_cc_clh()` function,
as done for the non-tee `cc` target.
Fixes: #4659
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDX capable QEMU for CC.
This commit, differently than b307531c29,
introduces support for building the artefacts that are TEE specific.
Fixes: #4623
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently $BUILD_DIR will be used to create a directory as:
/opt/kata/share/kata-qemu${BUILD_DIR}
It means that when passing a BUILD_DIR, like "foo", a name would be
built like /opt/kata/share/kata-qemufoo
We should, instead, be building it as /opt/kata/share/kata-qemu-foo.
Fixes: #4638
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of always naming the binary as "-experimental", let's take
advantage of the $BUILD_SUFFIX that's already passed and correctly name
the binary according to it.
Fixes: #4638
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDX capable kernel for CC.
This commit, differently than c4cc16efcd,
introduces support for building the artefacts that are TEE specific.
Fixes: #4622
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Passing the URL to be used to download the kernel tarball is useful in
various scenarios, mainly when doing a downstream build, thus let's add
this new option.
This new option also works around a known issue of the Dockerfile used
to build the kernel not having `yq` installed.
Fixes: #4629
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There's no need to have the entire function for building SEV / TDX
duplicated.
Let's remove those functions and create a `get_tee_kernel()` which takes
the TEE as the argument.
Fixes: #4627
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although I don't like the duplication introduced here, it's (at least
for now) way cleaner to have a specific daemonset for the Confidential
Containers effort.
As soon as we have all the bits and pieces upstreamed (kernel, QEMU, and
specific dependencies for each one of the TEEs), we'll be easily able to
get rid of this one. However, for now, focusing on this different set
of files will make our lives easier.
This new daemonset includes the configurations needed for containerd in
order to use the `cc` specific `cri_handler`, which is not and will not
be upstream on the containerd side.
Note, CRI-O is **not** supported for now.
Fixes: #4620
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of hacking the original `kata-deploy.sh` script, let's add a
totally new folder where we'll be adding content that's CC related.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've added a bunch of new options related to Confidential Containers
builds as part of the kata-deploy-binaries.sh. Let's make sure those
are displayed to the users of the script when it's called with --help.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similar to what we have with the `all` option, let's also add a `cc`
one, allowing others to easily call the script and build all the `cc`
related components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `kata-tarball` target, let's add a `cc-tarball`
target so we can build all the CC related tarballs in a single command,
with all the tarballs being merged together in the end.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `all-parallel` target, let's add a `cc-parallel`
target so we can build all the CC related tarballs in parallel.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `all` target, let's add a `cc` target so we can
build all the CC related tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building virtiofsd for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
Moreover, virtiofsd will **NOT** be used by the CC effort, but as the
very first release target doesn't include TEE support, let's not force
those who want to give it a try to setup devicemapper.
Fixes: #4569
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building QEMU for CC, but it's important
to note that the only difference between this one and the "vanilla"
build is the installation path.
The reason we're taking this approach is because the first release
target for CC doesn't include TEE support.
We had to also include a new builder for QEMU, a specific one for CC, as
for now that's the easiest way to override the prefix in a way that
we'll be easily able to expand the script to support TEE capable builds
in the very near future.
Fixes: #4568
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building the Kernel for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
The reason we're taking this approach is because the first release
target for CC doesn't include TEE support.
Fixes: #4567
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building Cloud Hypervisor for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
The reasons we're taking this approach are:
* Cloud Hypervisor, for the `main` and `stable` branches, is already
built with TDX support.
* The first target for the CC release doesn't include TEE support.
Fixes: #4566
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new build target for our local-build scripts, cc-shim-v2,
and use it to build Kata Containers properly configured for the CC
use-case.
Fixes: #4564
Depends-on: github.com/kata-containers/tests#4895
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new "REMOVE_VMM_CONFIGS" environment variable that can be
passsed to the script responsible for building Kata Containers.
Right now this is not useful for the `main` or `stable` branch, but for
the CC release we only have been working and testing with QEMU and Cloud
Hypervisor.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For the CC build we need to enable such a flag, and the cleaner way to
do so is exposing it in the Makefile and, later on, making sure its
correct value to the build script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
While this has never been needed for the `main` and `stable` releases,
for the coming CC release we need to pass a few extra options when
building the shim.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new build target for our local-build scripts,
cc-rootfs-image-tarball, and use it to build an image that has skopeo
and umoci embedded in, and that using the offline_fs_kbc as the
attenstation agent KBC.
Fixes: #4557
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
SKOPEO, UMOCI, and AA_KBC have been unset so far as we have not been
generating rootfs images that would be used for CC as part of our
workflow.
Now, as we're targetting the first release of the operator with the CCv0
branch, let's stop unsetting those and start taking advantage of our
tools to help us building a CC capable image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're still depending on components that are only being tested on
Ubuntu, let's make sure the VM image distributed is exactly the same
we've been testing.
Fixes: #4551
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Refactored ccv0.sh to utilise new automated tests for pulling encrypted images and creating a pod.
Fixes: #4512
Depends-on: github.com/kata-containers/tests#4866
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Signed-off-by: Georgina Kinge <georgina.kinge@ibm.com>
Let's pin a specific version of image-rs, one that pins a specific
version of ocicrypt-rs on their side, and ensure we don't fall into
issues by consuming the content from main on those repos, and also
helping to ensure reproducible builds from our side.
Fixes: #4517
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's update the Cargo.lock file to bring in all the new dependencies
and to decrease the diff after pinning a specific version of image-rs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: this boolean expression can be simplified
--> src/random.rs:85:21
|
85 | assert!(!ret.is_ok());
| ^^^^^^^^^^^^ help: try: `ret.is_err()`
|
= note: `-D clippy::nonminimal-bool` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
error: this boolean expression can be simplified
--> src/random.rs:93:17
|
93 | assert!(!ret.is_ok());
| ^^^^^^^^^^^^ help: try: `ret.is_err()`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
```
Fixes: #4523
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: use of `ok_or` followed by a function call
--> src/netlink.rs:526:14
|
526 | .ok_or(anyhow!(nix::Error::EINVAL))?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
|
= note: `-D clippy::or-fun-call` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
error: use of `ok_or` followed by a function call
--> src/netlink.rs:615:49
|
615 | let v = u8::from_str_radix(split.next().ok_or(anyhow!(nix::Error::EINVAL))?, 16)?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
```
Fixes: #4523
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Update `ccv0.sh` to use the new lib method which updates the CC pod config yaml
to add a a unique id
for compatibility with crictl 1.24.0+
Fixes: #4867
Depends-on: github.com/kata-containers/tests#4867
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Ensure that our documented crictl pod config file contents have
uid and namespace fields for compatibility with crictl 1.24+
Fixes: #4513
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Encrypted image support with offline_fs_kbc mode
of the attesation-agent, currently required skopeo
so update the doc to clarify this
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Solve `fatal: unsafe repository` ownership error by using `lib.sh`
code to check out the kata-containers repo
- Update `~/rustup` and repo directory ownership to `${USER}`
in order to allow subsequent build steps to work as a non-root
user
Fixes: #4241
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Based on @fidencio's opoinon,
On Arm: static build virtiofsd using musl lib;
on ppc64 & s390: static build virtiofsd using gnu lib;
Fixes: #4258
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The current implementation of walking the
disks to match with the requested volume path
in agent doesn't work because the volume path
provided by the shim to the agent is the mount
path within the guest and not the device name.
The current logic is trying to match the
device name to the volume path which will never
match.
This change will simplify the
get_volume_capacity_stats and
get_volume_inode_stats to just call statfs and
get the bytes and inodes usage of the volume
path directly.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
Today the shim does a translation when doing
direct-volume stats where it takes the source and
returns the mount path within the guest.
The source for a direct-assigned volume is actually
the device path on the host and not the publish
volume path.
This change will perform a lookup of the mount info
during direct-volume stats to ensure that the
device path is provided to the shim for querying
the volume stats.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The go default http mux AFAIK doesn’t support pattern
routing so right now client is padding the url
for direct-volume stats with a subpath of the volume
path and this will always result in 404 not found returned
by the shim.
This change will update the shim to take the volume
path as a GET query parameter instead of a subpath.
If the parameter is missing or empty, then return
400 BadRequest to the client.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The action function expects a function that returns error
but the current direct-volume stats Action returns
(string, error) which is invalid.
This change fixes the format and print out the stats from
the command instead.
Fixes: #4293
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
This PR removes the clear containers reference as this is not longer
being used and is deprecated at the rootfs builder README.
Fixes#4278
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's reword the sentence so it's easier for someone who's not a native
nor familiar with the project to understand.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Right now the script only support QEMU, but there's not a reason to do
that, mainly considering we already have the tests parity in the CIs
between QEMU and Clouud Hypervisor.
With this in mind, let's expand this script to also using Cloud
Hypervisor.
Whether this script should use QEMU or Cloud Hypervisor is defined
according to the KATA_HYPERVISOR environment variable.
Fixes: #4038
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead, rely on the conntainerd-shim-kata-v2 process, as this makes
this script VMM agnostic.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This configuration option is valid for all the hypervisor that are going
to be used with the confidential containers effort, thus exposing the
configuration option for Cloud Hypervisor as well.
Fixes: #4022
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 98750d792b)
Refactor image verification documentation to be more user
focussed, using crictl rather than agent-ctl and re-using the
integration test config files
Fixes: #3958
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Use `multistrap` for building Ubuntu rootfs. Adds support for building
for foreign architectures using the `ARCH` environment variable
(including umoci).
In the process, the Ubuntu rootfs workflow is vastly simplified.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Containerd can support set a proxy when downloading images with a environment variable.
For CC stack, image download is offload to the kata agent, we need support similar feature.
Current we add https_proxy and no_proxy, http_proxy is added since it is insecure.
Fixes#3956
Signed-off-by: Arron Wang <arron.wang@intel.com>
Requires setting ARCH and CC.
- Add CC linker option for building agent.
- Set host for building libseccomp.
Fixes: #3681
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- Add a doc comment
- Pass to build container, e.g. to build x86_64 with glibc (would
always use musl)
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Remove a lot of cruft of musl installations -- we needed those for the
Go agent, but Rustup just takes care of everything. aarch64 on
Debian-based & Alpine is an exception -- create a symlink
`aarch64-linux-musl-gcc` to `musl-tools`'s `musl-gcc` or `gcc` on
Alpine. This is unified -- arch-specific Dockerfiles are removed.
Furthermore, we should keep it in Ubuntu for supporting the offline SEV
KBC. We also keep it in Clear Linux, as that runs our internal checks,
but it is e.g. not shipped in CentOS Stream 9.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Hadolint DL3019. If you're wondering why this is in this PR, that's
because I touch the file later, and we're only triggering the lints for
changed files.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- We've updated the CC logging scripts to log to the journal
rather than a socket, so remove socat scripts and instructions
to reflect this
Fixes: #3928
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
for config, as per suggestion from @jodh-intel in #3243.
- Uses the pre-established `kata-containers` folder which we can also
use for more
- Makes it clear the agent is used
Also, use curl instead of wget for uniformity.
Fixes: #3920
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
All scripts should use `EOF` as the shell here document delimiter as
this is checked by the static checker.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Added new `kata-manager` options to control the self-test behaviour. By
default, after installation the manager will run a test to ensure a Kata
Containers container can be created. New options allow:
- The self test to be disabled.
- Only the self test to be run (no installation).
These features allow changes to be made to the installed system before
the self test is run.
Fixes: #3851.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Make the `kata-manager` create a `containerd` link to ensure the
downloaded containerd systemd service file can find the daemon when
using the GitHub packaged version of containerd.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add test coverage for get_memory_info function in src/rpc.rs. Includes
some minor refactoring of the function.
Fixes#3837
Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
Change the secret used by the GitHub Action that adds the PR size
label to one with the correct set of privileges.
Fixes: #3856.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
As 2.4.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
- Enhancement: fix comments/logs and delete not used function
- storage: make k8s emptyDir volume creation location configurable
- Implement direct-assigned volume
- Bump containerd to 1.6.1
- experimentally enable vcpu hotplug and virtio-mem on arm64 in kernel part
- versions: Upgrade to Cloud Hypervisor v22.0
- katatestutils: remove distro constraints
- Minor fixes for the `disable_block_device_use` comments
- clh: stop virtofsd if clh fails to boot up the vm
- clh: tdx: Don't use sharedFS with Confidential Guests
- runtime: Build golang components with extra security options
- snap: Use git clone depth 1 for QEMU and dependencies
- snap: Don't build cloud-hypevisor on ppc64le
- build: always reset ARCH after getting it
- virtcontainers: remove temp dir created for vsock in test code
- docs: Add unit testing presentation
- virtcontainers: Use available s390x hugepages
- Update QEMU >= 6.1.0 in configure-hypervisor.sh
- Fix monitor listen address
- snap: clh: Re-use kata-deploy script here
- osbuilder: Add CentOS Stream rootfs
- runtime: Gofmt fixes
- Update `confidential_guest` comments
- cleanup runtime pkgs for Darwin build, add basic Darwin build/unit test
- docs: Update Readme document
- runtime: use Cmd.StdoutPipe instead of self-created pipe
- docs: Developer-Guide build a custom Kata agent with musl
- kata-agent: Fix mismatching error of cgroup and mountinfo.
- runtime, config: make selinux configurable
- Fix unbound variable / typo on error mesage
- clh: Add TDX support
- virtcontainers: Do not add a virtio-rng-ccw device
- kata-monitor: fix collecting metrics for sandboxes not started through CRI
- runtime: fix package declaration for ppc64le
- Make the hypervisor framework not Linux specific
- kata-deploy: Simplify Dockerfile and support s390x
- Support nerdctl OCI hooks
- shim: log events for CRI-O
- docs: Update contributing link
- kata-deploy: Use (kata with) qemu as the default shim-v2 binary
- kata-monitor: simplify sandbox cache management and attach kubernetes POD metadata to metrics
- nydus: add lazyload support for kata with clh
- kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments
- packaging: Use `patch` for applying patches
- virtcontainers: Remove duplicated assert messages in utils test code
- versions: add nydus-snapshotter
- docs: Update limitations document
- packaging: support qemu-tdx
- Kata manager fix install
- versions: Linux 5.15.x
- trace-forwarder/agent-ctl: run cargo fmt/clippy in make check
- docs: Improve top-level README
- runtime: use github.com/mdlayher/vsock@v1.1.0
- tools: Build cloud-hypervisor with "--features tdx"
- virtiofsd: Use "-o announce_submounts"
- feature: hugepages support
- tools: clh: Allow to set when to build from sources and the build flags passed down to cargo
- docs: Remove docker run and shared memory from limitations
- versions: Udpate Cloud Hypervisor to 55479a64d237
- kernel: add missing config fragment for TDx
- runtime: The index variable is initialized multiple times in for
- scripts: fix a typo while to check build_type
- versions: bump CRI-O to its 1.23 release
- feature(nydusd): add nydusd support to introduce lazyload ability
- docs: Fix relative links in Markdown
- kernel: support TDx
- device: Actually update PCIDEVICE_ environment variables for the guest
- docs: Update link to EFK stack docs
- runtime: support QEMU SGX
- snap: update qemu version to 6.1.0 for arm
- Release process related fixes
- openshift-ci: switch to CentOS Stream
- virtcontainers: Split the rootless package into OS specific parts
- runtime: suppport split firmware
- kata-deploy: for testing, make sure we use the PR branch
- docs: Remove Zun documentation with kata containers
- agent: Fix execute_hook() args error
- workflows: stop checking revert commit
84dff440 release: Adapt kata-deploy for 2.4.0-rc0
b257e0e5 rustjail: delete function signal in BaseContainer
d647b28b agent: delete meaningless FIXME comment
1b34494b runtime: fix invalid comments for pkg/resourcecontrol
afc567a9 storage: make k8s emptyDir creation configurable
e76519af runtime: small refactor to improve readability
7e5f11a5 vendor: Update containerd to 1.6.1
42771fa7 runtime: don't set socket and thread for arm/virt
8828ef41 kernel: add arm experimental kernel build support
8a9007fe config: remove 2 config as they are removed in 5.15
1b6f7401 kernel: add arm experimental patches to support vcpu hotplug and virtio-mem
f905161b runtime: mount direct-assigned block device fs only once
27fb4902 agent: add get volume stats handler in agent
ea51ef1c runtime: forward the stat and resize requests from shimv2 to kata agent
c39281ad runtime: update container creation to work with direct assigned volumes
4e00c237 agent: add grpc interface for stat and resize operations
e9b5a255 runtime: add stat and resize APIs to containerd-shim-v2
6e0090ab runtime: persist direct volume mount info
fa326b4e runtime: augment kata-runtime CLI to support direct-assigned volume
b8844fb8 versions: Upgrade to Cloud Hypervisor v22.0
af804734 clh: stop virtofsd if clh fails to boot up the vm
97951a2d clh: Don't use SharedFS with Confidential Guests
c30b3a9f clh: Adding a volume is not supported without SharedFS
f889f1f9 clh: introduce supportsSharedFS()
54d27ed7 clh: introduce loadVirtiofsDaemon()
ae2221ea clh: introduce stopVirtiofsDaemon()
e8bc26f9 clh: introduce setupVirtiofsDaemon()
413b3b47 clh: introduce createVirtiofsDaemon()
55cd0c89 runtime: Build golang components with extra security options
76e4f6a2 Revert "hypervisors: Confidential Guests do not support Device hotplug"
fa8b9392 config: qemu: Fix disable_block_device_use comments
9615c8bc config: fc: Don't expose disable_block_device_use
c1fb4bb7 snap: Don't build cloud-hypevisor on ppc64le
58913694 snap: Use git clone depth 1 for QEMU and dependencies
b27c7f40 docs: Add unit testing presentation
e64c54a2 monitor: Listen to localhost only by default
e6350d3d monitor: Fix build options
a67b93bb snap: clh: Re-use kata-deploy script here
f31125fe version: Bump cloud-hypervisor to b0324f85571c441f
54d0a672 subsystem: build
edf20766 docs: Update Readme document
eda8ea15 runtime: Gofmt fixes
4afb278f ci: add github action to exercise darwin build, unit tests
e355a718 container: file is not linux specific
b31876ee device-manager: move linux-only test to a linux-only file
6a5c6344 resourcecontrol: SystemdCgroup check is not necessarily linux specific
cc58cf69 resourcecontrol: convert stats dev_t to unit64types
5be188cc utils: Add darwin stub
ad044919 virtcontainers: Convert stats dev_t to uint64
56751089 katautils: Use a syscall wrapper for the hook JSON state
7d64ae7a runtime: Add a syscall wrapper package
abc681ca katautils: Add Darwin stub for the netNS API
de574662 config: Expand confidential_guest comments
641d475f config: clh: Use "Intel TDX" instead of just "TDX"
0bafa2de config: clh: Mention supported TEEs
81ed269e runtime: use Cmd.StdoutPipe instead of self-created pipe
8edca8bb kata-agent: Fix mismatching error of cgroup and mountinfo.
a9ba7c13 clh: Fix typo on HotplugRemoveDevice
827ab82a tools: clh: Fix unbound variable
082d538c runtime: make selinux configurable
1103f5a4 virtcontainers: Use FilesystemSharer for sharing the containers files
533c1c0e virtcontainers: Keep all filesystem sharing prep code to sandbox.go
61590bbd virtcontainers: Add a Linux implementation for the FilesystemSharer
03fc1cbd virtcontainers: Add a filesystem sharing interface
72434333 clh: Add TDX support
a13b4d5a clh: Add firmware to the config file
a8827e0c hypervisors: Confidential Guests do not support NVDIMM
f50ff9f7 hypervisors: Confidential Guests do not support Memory hotplug
df8ffecd hypervisors: Confidential Guests do not support Device hotplug
28c4c044 hypervisors: Confidential Guests do not support VCPUs hotplug
29ee870d clh: Add confidential_guest to the config file
9621c596 clh: refactor image / initrd configuration set
dcdc412e clh: use common kernel params from the hypervisor code
4c164afb versions: Update Cloud Hypervisor to 5343e09e7b8db
b2a65f90 virtcontainers: Use available s390x hugepages
cb4230e6 runtime: fix package declaration for ppc64le
fec26f8e kata-monitor: trivial: rename symbols & labels
9fd4e551 runtime: Move the resourcecontrol package one layer up
823faee8 virtcontainers: Rename the cgroups package
0d1a7da6 virtcontainers: Rename and clean the cgroup interface
ad10e201 virtcontainers: cgroups: Move non Linux routine to utils.go
d49d0b6f virtcontainers: cgroups: Define a cgroup interface
3ac52e81 kata-monitor: fix updating sandbox cache at startup
160bb621 kata-monitor: bump version to 0.3.0
1a3381b0 docs: Developer-Guide build a custom Kata agent with musl
f6fc1621 shim: log events for CRI-O
1d68a08f docs: Update contributing link
9123fc09 kata-deploy: Simplify Dockerfile and support s390x
11220f05 kata-deploy: Use (kata with) qemu as the default shim-v2 binary
3175aad5 virtiofs-nydus: add lazyload support for kata with clh
94b831eb virtcontainers: remove temp dir created for vsock in test code
8cc1b186 kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments
5c9d2b41 packaging: Use `patch` for applying patches
5b3fb6f8 kernel: Build SGX as part of the vanilla kernel
2c35d8cb workflows: Stop building the experimental kernel
32e7845d snap: Build vanilla kernel for all arches
27de212f runtime: Always add network endpoints from the pod netns
1cee0a94 virtcontainers: Remove duplicated assert messages in utils test code
6c1d149a docs: Update limitations document
7c4ee6ec packaging/qemu: create no_patches file for qemu-tdx
d47c488b versions: add qemu tdx section
77c29bfd container: Remove VFIO lazy attach handling
7241d618 versions: add nydus-snapshotter
26b3f001 virtcontainers: Split hypervisor into Linux and OS agnostic bits
fa0e9dc6 virtcontainers: Make all Linux VMMs only build on Linux
c91035d0 virtcontainers: Move non QEMU specific constants to hypervisor.go
10ae0591 virtcontainers: Move guest protection definitions to hypervisor.go
b28d0274 virtcontainers: Make max vCPU config less QEMU specific
a5f6df6a govmm: Define the number of supported vCPUs per architecture
a6b40151 tools: clh: Remove unused variables
5816c132 tools: Build cloud-hypervisor with "--features tdx"
e6060cb7 versions: Linux 5.15.x
9818cf71 docs: Improve top-level and runtime README
36c3fc12 agent: support hugepages for containers
81a8baa5 runtime: add hugepages support
7df677c0 runtime: Update calculateSandboxMemory to include Hugepages Limit
948a2b09 tools: clh: Ensure the download binary is executable
72bf5496 agent: handle hook process result
80e8dbf1 agent: valid envs for hooks
4f96e3ea katautils: Pass the nerdctl netns annotation to the OCI hooks
a871a33b katautils: Run the createRuntime hooks
d9dfce14 katautils: Run the preStart hook in the host namespace
6be6d0a3 katautils: Pass the OCI annotations back to the called OCI hooks
493ebc8c utils: Update kata manager docs
34b2e67d utils: Added more kata manager cli options
714c9f56 utils: Improve containerd configuration
c464f326 utils: kata-manager: Force containerd sym link creation
4755d004 utils: Fix unused parameter
601be4e6 utils: Fix containerd installation
ae21fcc7 utils: Fix Kata tar archive check
f4d1e45c utils: Add kata-manager CLI options for kata and containerd
395cff48 docs: Remove docker run and shared memory from limitations
e07545a2 tools: clh: Allow passing down a build flag
55cdef22 tools: clh: Add the possibility to always build from sources
3f87835a utils: Switch kata manager to use getopts
4bd945b6 virtiofsd: Use "-o announce_submounts"
37df1678 build: always reset ARCH after getting it
3a641b56 katatestutils: remove distro constraints
90fd625d versions: Udpate Cloud Hypervisor to 55479a64d237
573a37b3 osbuilder: Add CentOS Stream rootfs
f10642c8 osbuilder: Source .cargo/env before checking Rust
955d359f kernel: add missing config fragment for TDx
734b618c agent-ctl: run cargo fmt/clippy in make check
12c37faf trace-forwarder: add make check for Rust
c1ce67d9 runtime: use github.com/mdlayher/vsock@v1.1.0
42a878e6 runtime: The index variable is initialized multiple times in for
1797b3eb packaging/kernel: build TDX guest kernel
98752529 versions: add url and tag for tdx kernel
bc8464e0 packaging/kernel: add option -s option
2d9f89ae feature(nydusd): add nydusd support to introduse lazyload ability
b19b6938 docs: Fix relative links in Markdown
9590874d device: Update PCIDEVICE_ environment variables for the guest
7b7f426a device: Keep host to VM PCI mapping persistently
0b2bd641 device: Rework update_spec_pci() to update_env_pci()
982f14fa runtime: support QEMU SGX
40aa43f4 docs: Update link to EFK stack docs
54e1faec scripts: fix a typo while to check build_type
07b9d93f virtcontainer: Simplify the sandbox network creation flow
2c7087ff virtcontainers: Make all endpoints Linux only
49d2cde1 virtcontainers: Split network tests into generic and OS specific parts
0269077e virtcontainers: Remove the netlink package dependency from network.go
7fca5792 virtcontainers: Unify Network endpoints management interface
c67109a2 virtcontainers: Remove the Network PostAdd method
e0b26443 virtcontainers: Define a Network interface
5e119e90 virtcontainers: Rename the Network structure fields and methods
b858d0de virtcontainers: Make all Network fields private
49eee79f virtcontainers: Remove the NetworkNamespace structure
844eb619 virtcontainers: Have CreateVM use a Network reference
d7b67a7d virtcontainers: Network API cleanups and simplifications
2edea883 virtcontainers: Make the Network structure manage endpoints
8f48e283 virtcontainers: Expand the Network structure
5ef522f7 runtime: check kvm module `sev` correctly
419d8134 snap: update qemu version to 6.1.0 for arm
00722187 docs: update Release-Process.md
496bc10d tools: check for yq before using it
88a70d32 Revert "workflows: Ensure a label change re-triggers the actions"
a9bebb31 openshift-ci: switch to CentOS Stream
89047901 kata-deploy-push: only run if PR modifying tools path
7ffe9e51 virtcontainers: Do not add a virtio-rng-ccw device
1f29478b runtime: suppport split firmware
24796d2f kata-deploy: for testing, make sure we use the PR branch
1cc1c8d0 docs: Remove images from Zun documentation
5861e52f docs: Remove Zun documentation with kata containers
903a6a45 versions: Bump critools to its 1.23 release
63eb1158 versions: bump CRI-O to its 1.23 release
5083ae65 workflows: stop checking revert commit
14e7f52a virtcontainers: Split the rootless package into OS specific parts
ab447285 kata-monitor: add kubernetes pod metadata labels to metrics
834e199e kata-monitor: drop unused functions
7516a8c5 kata-monitor: rework the sandbox cache sync with the container manager
e78d80ea kata-monitor: silently ignore CHMOD events on the sandboxes fs
e9eb34ce kata-monitor: improve debug logging
4fc4c76b agent: Fix execute_hook() args error
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
During the release of 2.4.0-rc0 @egernst noticed an incositency in the
way we handle release tags, as release candidates are being taken as
"stable" releases, while both the kata-deploy tests and the release
action consider this as "latest".
Ideally we should have our own tag for "release candidate", but that's
something that could and should be discussed more extensively outside of
the scope of this quick fix.
For now, let's align the code generating the PR for bumping the release
with what we already do as part of the release action and kata-deploy
test, and tag "-rc" as latest, regardless of which branch it's coming
from.
Fixes: #3847
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
allocate_hugepages() writes to the kernel sysfs file to allocate hugepages
in the Kata VM. However, even if the write succeeds, it's not certain that
the kernel will actually be able to allocate as many hugepages as we
requested.
This patch reads back the file after writing it to check if we were able to
allocate all the required hugepages.
fixes#3816
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
allocate_hugepages() constructs the path for the sysfs directory containing
hugepage configuration, then attempts to create this directory if it does
not exist.
This doesn't make sense: sysfs is a view into kernel configuration, if the
kernel has support for the hugepage size, the directory will already be
there, if it doesn't, trying to create it won't help.
For the same reason, attempting to create the "nr_hugepages" file
itself is pointless, so there's no reason to call
OpenOptions::create(true).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Image-rs crate image pull/decrypt/decompression/unpack/mount
features are ready now.
With image-rs pull_image API, the downloaded container image layers
will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs
and config.json will be saved under CONTAINER_BASE/cid directory.
Fixes: #3860
Signed-off-by: Arron Wang <arron.wang@intel.com>
1.Update AA's launch command according to latest implementation
2.Enable get_resource port which will be used by signature verification
Fixes: #3827
Signed-off-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
- Clippy checks were introduced that cause a warning
for a function with more than 7 arguments.
The image service addition means handle_cmd
has 8 and re-factoring it would take us further
away from main, so ignore for now
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
the ssh-demo encrypted image sample in Kubernetes
Fixes: #3637
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
created a Kata CC unencrypted container using Kubernetes
- Switch test images to quay.io as image_rpc.rs has some
problems with docker.io?
- Update documentation to better fit the kata documentation
requirements and fix typos
- Fixes: #3511
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
General doc enchancements including:
- Change `cd`s for `pushd` and `popd`s
- Remove hard coded architectures
- Tighten up the security where we `chmod 777`
- Add support for not running as source
- Updates so it doesn't do `ctr pull` if the image is on the
local system already
- Doc and Test running as non-root user (covered by #2879)
- Update doc to match image_rpc changes
Fixes: #3549Fixes: #2879
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
created a Kata CC unencrypted container using crictl
- Update documentation to better fit the kata documentation requirements
- Fixes: #3510
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Push the demo image to `quay.io/confidential-containers/runtime-payload`
(which, as opposed to `.../kata-demo`, existed all along).
Fixes: #3345
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Basic config, no debug endpoints, no exec/reseed. Uses the
`$AA_KBC_PARAMS` variable to be used with `envsubst`.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Run `realpath` on `INCLUDE_ROOTFS` so it is not required to provide a
full path. This simplifies the required GitHub Actions workflow, as
GitHub's `env` cannot use shell expansions, as well as the usability
overall.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
With the new rust image pull service skopeo we can parameterise whether to build
and install skopeo and turn it off by default if we don't need
signature verification support
Fixes: #3170
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
With the new rust image pull service skopeo we can parameterise whether to build
and install skopeo and turn it off by default if we don't need
signature verification support
Fixes: #3170
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The initrd build process now supports facultatively installing Skopeo
while still installing Umoci. Mirror this change in the respective
kata-deploy process.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Integrate EAA KBC into ubuntu rootfs image.
Fix build failure if build with AA_KBC=eaa_kbc option.
Fixes: #3167
Signed-off-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
The CI runner fails to clone the git crates as it probably is confused
about its CARGO_HOME value. That prevents vendoring to succeed as the
runner has nothing to copy over to the vendoring code.
We fix that by temporarily setting CARGO_HOME to tmpfs, only for the
vendoring step. It's hackish.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
We use tonic to build GRPC client to talk with attestation agent,
and tonic require newer version of rust.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
':' will have special meaning for umoci during upack, then we
do not use it as part of the image store path
Signed-off-by: Arron Wang <arron.wang@intel.com>
For the initrd build, add makeopts for $SKOPEO_UMOCI and $AA_KBC. Use
the $INCLUDE_ROOTFS variable to specify a directory of files that should
be recursively merged into the guest.
Fixes: #3126
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
As discussed in #2908, Ubuntu is used as a base for CCv0 for building
umoci in the guest. Currently, CCv0 only works with initrd, so this only
applies to initrd.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Some generated merge commit messages are >75 chars
Allow these to not trigger the subject line length failure
Fixes: #3132
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Weekly merge of main branch into CCv0 26th November
Fixes: #3132
Depends-on: github.com/kata-containers/tests#4226
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Pull an encrypted image using the Attestation Agent as
a keyprovider.
Fixes: #3022
Signed-off-by: Tobin Feldman-Fitzthum <tobin@linux.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Add the "Merge pull request (kata-containers)?#<x> from" message to the
subsystem check to allow commit check on merges between branches to work
Fixes: #3085
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add new agent configuration policy path parameter
- Update agent pull image to use the policy path if specified and
otherwise fall back to the accept all policy
- Remove the double copy of the image during pulling
- Ensure that temporary directories are always removed
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Document how to test the signature validation with
a number of different scenarios and test images
- Update ccv0.sh to add policy_path to kernel_params
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add hardcoded gpg, signature and polict files
- Modify rootfs.sh to put these in the correct place in the kata image
if skopeo and umoci are being used
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We do not get a root filesystem path from the agent when creating a
new container for which the container image was not pulled by
containerd. That prevents the agent from creating the container.
To fix that, we populate the container root path with the internal
rootfs path by fetching the containerd added image name annotation and
mapping it back to a path through our image hash map.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
In the confidential computing scenario, there is no Image
information on the host, so skip handling Rootfs at
CreateContainer.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
- Replace containerd to `confidential-containers/containerd` in go.mod
- Use separate ImageService to support PullImage
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
The version `v1.6.0-beta.2` released support for shim service,
which is needed for our implementation of ImageService.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Since the `utils::get_option` interface is modified,
PullImage needs to adapt to this modification in CCv0 branch.
Fixes#3044
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
When the environment variable $SKOPEO_UMOCI is set to "yes", Skopeo and
umoci are built inside the guest build container and installed to the
guest rootfs. The respective build- and runtime dependencies are added.
This respects the (existing) $LIBC variable (gnu/musl) and avoids issues
with glibc mismatches.
This is currently only supported for Ubuntu guests, as the system Golang
packages included in the versions of other distros that we use are too
old to build these packages, and re-enabling installing Golang from
golang.org is cumbersome, given especially that it is unclear how long
we will keep using Skopeo and umoci.
Additionally, when the environment variable $AA_KBC is set,
attestation-agent (with that KBC) is included.
This replaces some logic in ccv0.sh that is removed.
Fixes: #2907
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- Update CCv0 to use the new confidential containers fork of containerd
- Start using the current-CCv0 branch
Fixes#2947
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add a new PullImage endpoint to the shim API.
Add new PullImage functions to the virtcontainers files, which allows
the PullImage endpoint on the agent to be called.
Update the containerd vendor files to support new PullImage API changes.
Fixes#2651
Signed-off-by: Dave Hay <david_hay@uk.ibm.com>
Co-authored-by: ashleyrobertson <ashleyro@uk.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Just in case someone thinks about using kata-deploy directly from this
branch, let's point to the `-cc:v0`image.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This is a dirty hack, that we should expand later so we can pass one or
n number of repos where we'll upload our images, and use it as part of
the release scripts.
For now, however, let's just do this quick & dirty hack so we can
present the CCv0 demo using the operator, even knowing that the
kubernetes part of the work is not done yet and that the demo itself
will be done connecting to a node and doing all the shenanigans
manually.
Fixes: #2854
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
- Update the CCv0 demo script to use ubuntu instead of fedora
- Update the extra packages to reflect the apt vs dnf namings
- Build and add the skopeo binary to the rootfs image
- Minor kubernetes init fix
Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Revert fedora OS changes made in #2556 as we aren't using it anymore.
- They should be done in main under #2116Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add skopeo --insecure-policy tag to reflect that ubuntu doesn't
create a default container policy
Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add source credentials field to pull_image endpoint
If field is not blank, send to skopeo in image pull command
Add source_creds to agentl-ctl pull command
Fixes: #2653
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add support for a new source credentials environment variable in the
test script
Add documentation of it into the how-to guide
Fixes#2653
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add a check in setup_bundle to see if the bundle already exists
and if it does then skip the setup.
Fixes: #2617
Co-authored-by: Dave Hay <david_hay@uk.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add a pr_id field to the cri-containerd config in versions.yaml
so the CI scripts can use this in the CCv0 builds
Fixes#2576
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This commit add documentation and a script to help people to build, run,
test and demo the CCv0 changes around PullImage on guest.
It is currently limited to the Agent pullimage, but can be expanded
as more code is shared.
Fixes#2574
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This commit adds the PullImge endpoint to the agent
and the agent-ctl command to test it.
Fixes: #2509
Signed-off-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
2021-11-05 14:49:20 +00:00
1802 changed files with 402989 additions and 14234 deletions
[](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml) [](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml)
# Kata Containers
@@ -132,8 +134,10 @@ The table below lists the remaining parts of the project:
| [packaging](tools/packaging) | infrastructure | Scripts and metadata for producing packaged binaries<br/>(components, hypervisors, kernel and rootfs). |
| [kernel](https://www.kernel.org) | kernel | Linux kernel used by the hypervisor to boot the guest image. Patches are stored [here](tools/packaging/kernel). |
| [osbuilder](tools/osbuilder) | infrastructure | Tool to create "mini O/S" rootfs and initrd images and kernel for the hypervisor. |
| [kata-debug](tools/packaging/kata-debug/README.md) | infrastructure | Utility tool to gather Kata Containers debug information from Kubernetes clusters. |
| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |
| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |
| [`log-parser-rs`](src/tools/log-parser-rs) | utility | Tool that aid in analyzing logs from the kata runtime. |
This document is written **specifically for developers**: it is not intended for end users.
If you want to contribute changes that you have made, please read the [community guidelines](https://github.com/kata-containers/community/blob/main/CONTRIBUTING.md) for information about our processes.
# Assumptions
- You are working on a non-critical test or development system.
@@ -585,10 +587,15 @@ $ sudo kata-monitor
#### Connect to debug console
Command `kata-runtime exec` is used to connect to the debug console.
You need to start a container for example:
```bash
$ sudo ctr run --runtime io.containerd.kata.v2 -d docker.io/library/ubuntu:latest testdebug
```
Then, you can use the command `kata-runtime exec <sandbox id>` to connect to the debug console.
If you create a new stable branch, i.e. if your release changes a major or minor version number (not a patch release), then
you should modify the `tests` repository to point to that newly created stable branch and not the `main` branch.
The objective is that changes in the CI on the main branch will not impact the stable branch.
In the test directory, change references the main branch in:
* `README.md`
* `versions.yaml`
* `cmd/github-labels/labels.yaml.in`
* `cmd/pmemctl/pmemctl.sh`
* `.ci/lib.sh`
* `.ci/static-checks.sh`
See the commits in [the corresponding PR for stable-2.1](https://github.com/kata-containers/tests/pull/3504) for an example of the changes.
### Merge all bump version Pull requests
- The above step will create a GitHub pull request in the Kata projects. Trigger the CI using `/test` command on each bump Pull request.
@@ -63,6 +46,24 @@
$ ./tag_repos.sh -p -b "$BRANCH" tag
```
### Point tests repository to stable branch
If your release changes a major or minor version number(not a patch release), then the above
`./tag_repos.sh` script will create a new stable branch in all the repositories in addition to tagging them.
This happens when you are making the first `rc` release for a new major or minor version in Kata.
In this case, you should modify the `tests` repository to point to the newly created stable branch and not the `main` branch.
The objective is that changes in the CI on the main branch will not impact the stable branch.
In the test directory, change references of the `main` branch to the new stable branch in:
* `README.md`
* `versions.yaml`
* `cmd/github-labels/labels.yaml.in`
* `cmd/pmemctl/pmemctl.sh`
* `.ci/lib.sh`
* `.ci/static-checks.sh`
See the commits in [the corresponding PR for stable-2.1](https://github.com/kata-containers/tests/pull/3504) for an example of the changes.
### Check Git-hub Actions
We make use of [GitHub actions](https://github.com/features/actions) in this [file](../.github/workflows/release.yaml) in the `kata-containers/kata-containers` repository to build and upload release artifacts. This action is auto triggered with the above step when a new tag is pushed to the `kata-containers/kata-containers` repository.
During container's lifecycle, different Hooks can be executed to do custom actions. In Kata Containers, we support two types of Hooks, `OCI Hooks` and `Kata Hooks`.
### OCI Hooks
The OCI Spec stipulates six hooks that can be executed at different time points and namespaces, including `Prestart Hooks`, `CreateRuntime Hooks`, `CreateContainer Hooks`, `StartContainer Hooks`, `Poststart Hooks` and `Poststop Hooks`. We support these types of Hooks as compatible as possible in Kata Containers.
The path and arguments of these hooks will be passed to Kata for execution via `bundle/config.json`. For example:
```
...
"hooks": {
"prestart": [
{
"path": "/usr/bin/prestart-hook",
"args": ["prestart-hook", "arg1", "arg2"],
"env": [ "key1=value1"]
}
],
"createRuntime": [
{
"path": "/usr/bin/createRuntime-hook",
"args": ["createRuntime-hook", "arg1", "arg2"],
"env": [ "key1=value1"]
}
]
}
...
```
### Kata Hooks
In Kata, we support another three kinds of hooks executed in guest VM, including `Guest Prestart Hook`, `Guest Poststart Hook`, `Guest Poststop Hook`.
The executable files for Kata Hooks must be packaged in the *guest rootfs*. The file path to those guest hooks should be specified in the configuration file, and guest hooks must be stored in a subdirectory of `guest_hook_path` according to their hook type. For example:
+ In configuration file:
```
guest_hook_path="/usr/share/hooks"
```
+ In guest rootfs, prestart-hook is stored in `/usr/share/hooks/prestart/prestart-hook`.
## Execution
The table below summarized when and where those different hooks will be executed in Kata Containers:
| Hook Name | Hook Type | Hook Path | Exec Place | Exec Time |
|---|---|---|---|---|
| `Prestart(deprecated)` | OCI hook | host runtime namespace | host runtime namespace | After VM is started, before container is created. |
| `CreateRuntime` | OCI hook | host runtime namespace | host runtime namespace | After VM is started, before container is created, after `Prestart` hooks. |
| `CreateContainer` | OCI hook | host runtime namespace | host vmm namespace* | After VM is started, before container is created, after `CreateRuntime` hooks. |
| `StartContainer` | OCI hook | guest container namespace | guest container namespace | After container is created, before container is started. |
| `Poststart` | OCI hook | host runtime namespace | host runtime namespace | After container is started, before start operation returns. |
| `Poststop` | OCI hook | host runtime namespace | host runtime namespace | After container is deleted, before delete operation returns. |
| `Guest Prestart` | Kata hook | guest agent namespace | guest agent namespace | During start operation, before container command is executed. |
| `Guest Poststart` | Kata hook | guest agent namespace | guest agent namespace | During start operation, after container command is executed, before start operation returns. |
| `Guest Poststop` | Kata hook | guest agent namespace | guest agent namespace | During delete operation, after container is deleted, before delete operation returns. |
+ `Hook Path` specifies where hook's path be resolved.
+ `Exec Place` specifies in which namespace those hooks can be executed.
+ For `CreateContainer` Hooks, OCI requires to run them inside the container namespace while the hook executable path is in the host runtime, which is a non-starter for VM-based containers. So we design to keep them running in the *host vmm namespace.*
+ `Exec Time` specifies at which time point those hooks can be executed.
| 🚧 | `kata_shim_pod_overhead_cpu`: <br> Kata Pod overhead for CPU resources(percent). | `GAUGE` | percent | <ul><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_pod_overhead_memory_in_bytes`: <br> Kata Pod overhead for memory resources(bytes). | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
| ✅ | `kata_shim_proc_stat`: <br> Kata containerd shim v2 process statistics. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/stat`)<ul><li>`cstime`</li><li>`cutime`</li><li>`stime`</li><li>`utime`</li></ul></li><li>`sandbox_id`</li></ul> |
| ✅ | `kata_shim_proc_status`: <br> Kata containerd shim v2 process status. | `GAUGE` | | <ul><li>`item` (see `/proc/<pid>/status`)<ul><li>`hugetlbpages`</li><li>`nonvoluntary_ctxt_switches`</li><li>`rssanon`</li><li>`rssfile`</li><li>`rssshmem`</li><li>`vmdata`</li><li>`vmexe`</li><li>`vmhwm`</li><li>`vmlck`</li><li>`vmlib`</li><li>`vmpeak`</li><li>`vmpin`</li><li>`vmpmd`</li><li>`vmpte`</li><li>`vmrss`</li><li>`vmsize`</li><li>`vmstk`</li><li>`vmswap`</li><li>`voluntary_ctxt_switches`</li></ul></li><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_process_cpu_seconds_total`: <br> Total user and system CPU time spent in seconds. | `COUNTER` | `seconds` | <ul><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_process_max_fds`: <br> Maximum number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_process_open_fds`: <br> Number of open file descriptors. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_process_start_time_seconds`: <br> Start time of the process since `unix` epoch in seconds. | `GAUGE` | `seconds` | <ul><li>`sandbox_id`</li></ul> |
| 🚧 | `kata_shim_process_virtual_memory_max_bytes`: <br> Maximum amount of virtual memory available in bytes. | `GAUGE` | `bytes` | <ul><li>`sandbox_id`</li></ul> |
| ✅ | `kata_shim_threads`: <br> Kata containerd shim v2 process threads. | `GAUGE` | | <ul><li>`sandbox_id`</li></ul> |
### Kata Hypervisor
Different from golang runtime, hypervisor and shim in runtime-rs belong to the **same process**, so all previous metrics for hypervisor and shim only need to be gathered once. Thus, we currently only collect previous metrics in kata shim.
At the same time, we added the interface(`VmmAction::GetHypervisorMetrics`) to gather hypervisor metrics, in case we design tailor-made metrics for hypervisor in the future. Here're metrics exposed from [src/dragonball/src/metric.rs](https://github.com/kata-containers/kata-containers/blob/main/src/dragonball/src/metric.rs).
# Virtual machine vCPU sizing in Kata Containers 3.0
> Preview:
> [Kubernetes(since 1.23)][1] and [Containerd(since 1.6.0-beta4)][2] will help calculate `Sandbox Size` info and pass it to Kata Containers through annotations.
> In order to adapt to this beneficial change and be compatible with the past, we have implemented the new vCPUs handling way in `runtime-rs`, which is slightly different from the original `runtime-go`'s design.
## When do we need to handle vCPUs size?
vCPUs sizing should be determined by the container workloads. So throughout the life cycle of Kata Containers, there are several points in time when we need to think about how many vCPUs should be at the time. Mainly including the time points of `CreateVM`, `CreateContainer`, `UpdateContainer`, and `DeleteContainer`.
*`CreateVM`: When creating a sandbox, we need to know how many vCPUs to start the VM with.
*`CreateContainer`: When creating a new container in the VM, we may need to hot-plug the vCPUs according to the requirements in container's spec.
*`UpdateContainer`: When receiving the `UpdateContainer` request, we may need to update the vCPU resources according to the new requirements of the container.
*`DeleteContainer`: When a container is removed from the VM, we may need to hot-unplug the vCPUs to reclaim the vCPU resources introduced by the container.
## On what basis do we calculate the number of vCPUs?
When Kata calculate the number of vCPUs, We have three data sources, the `default_vcpus` and `default_maxvcpus` specified in the configuration file (named `TomlConfig` later in the doc), the `io.kubernetes.cri.sandbox-cpu-quota` and `io.kubernetes.cri.sandbox-cpu-period` annotations passed by the upper layer runtime, and the corresponding CPU resource part in the container's spec for the container when `CreateContainer`/`UpdateContainer`/`DeleteContainer` is requested.
Our understanding and priority of these resources are as follows, which will affect how we calculate the number of vCPUs later.
* From `TomlConfig`:
*`default_vcpus`: default number of vCPUs when starting a VM.
*`default_maxvcpus`: maximum number of vCPUs.
* From `Annotation`:
*`InitialSize`: we call the size of the resource passed from the annotations as `InitialSize`. Kubernetes will calculate the sandbox size according to the Pod's statement, which is the `InitialSize` here. This size should be the size we want to prioritize.
* From `Container Spec`:
* The amount of CPU resources that the Container wants to use will be declared through the spec. Including the aforementioned annotations, we mainly consider `cpu quota` and `cpuset` when calculating the number of vCPUs.
*`cpu quota`: `cpu quota` is the most common way to declare the amount of CPU resources. The number of vCPUs introduced by `cpu quota` declared in a container's spec is: `vCPUs = ceiling( quota / period )`.
*`cpuset`: `cpuset` is often used to bind the CPUs that tasks can run on. The number of vCPUs may introduced by `cpuset` declared in a container's spec is the number of CPUs specified in the set that do not overlap with other containers.
## How to calculate and adjust the vCPUs size:
There are two types of vCPUs that we need to consider, one is the number of vCPUs when starting the VM (named `Boot Size` in the doc). The second is the number of vCPUs when `CreateContainer`/`UpdateContainer`/`DeleteContainer` request is received (`Real-time Size` in the doc).
### `Boot Size`
The main considerations are `InitialSize` and `default_vcpus`. There are the following principles:
`InitialSize` has priority over `default_vcpus` declared in `TomlConfig`.
1. When there is such an annotation statement, the originally `default_vcpus` will be modified to the number of vCPUs in the `InitialSize` as the `Boot Size`. (Because not all runtimes support this annotation for the time being, we still keep the `default_cpus` in `TomlConfig`.)
2. When the specs of all containers are aggregated for sandbox size calculation, the method is consistent with the calculation method of `InitialSize` here.
### `Real-time Size`
When we receive an OCI request, it may be for a single container. But what we have to consider is the number of vCPUs for the entire VM. So we will maintain a list. Every time there is a demand for adjustment, the entire list will be traversed to calculate a value for the number of vCPUs. In addition, there are the following principles:
1. Do not cut computing power and try to keep the number of vCPUs specified by `InitialSize`.
* So the number of vCPUs after will not be less than the `Boot Size`.
2.`cpu quota` takes precedence over `cpuset` and the setting history are took into account.
* We think quota describes the CPU time slice that a cgroup can use, and `cpuset` describes the actual CPU number that a cgroup can use. Quota can better describe the size of the CPU time slice that a cgroup actually wants to use. The `cpuset` only describes which CPUs the cgroup can use, but the cgroup can use the specified CPU but consumes a smaller time slice, so the quota takes precedence over the `cpuset`.
* On the one hand, when both `cpu quota` and `cpuset` are specified, we will calculate the number of vCPUs based on `cpu quota` and ignore `cpuset`. On the other hand, if `cpu quota` was used to control the number of vCPUs in the past, and only `cpuset` was updated during `UpdateContainer`, we will not adjust the number of vCPUs at this time.
3.`StaticSandboxResourceMgmt` controls hotplug.
* Some VMMs and kernels of some architectures do not support hotplugging. We can accommodate this situation through `StaticSandboxResourceMgmt`. When `StaticSandboxResourceMgmt = true` is set, we don't make any further attempts to update the number of vCPUs after booting.
- [How to run Kata Containers with `nydus`](how-to-use-virtio-fs-nydus-with-kata.md)
- [How to run Kata Containers with AMD SEV-SNP](how-to-run-kata-containers-with-SNP-VMs.md)
- [How to use EROFS to build rootfs in Kata Containers](how-to-use-erofs-build-rootfs.md)
- [How to run Kata Containers with kinds of Block Volumes](how-to-run-kata-containers-with-kinds-of-Block-Volumes.md)
## Confidential Containers
- [How to use build and test the Confidential Containers `CCv0` proof of concept](how-to-build-and-test-ccv0.md)
- [How to generate a Kata Containers payload for the Confidential Containers Operator](how-to-generate-a-kata-containers-payload-for-the-confidential-containers-operator.md)
which should only show a single `rootfs` directory if the container image was pulled on the guest, not the host
- Looking that `rootfs` directory with
```bash
$ sudo ls -ltr $(sudo find /run/kata-containers/shared/sandboxes/${pod_id}/shared -name rootfs)
```
shows something similar to
```
total 668
-rwxr-xr-x 1 root root 682696 Aug 25 13:58 pause
drwxr-xr-x 2 root root 6 Jan 20 02:01 proc
drwxr-xr-x 2 root root 6 Jan 20 02:01 dev
drwxr-xr-x 2 root root 6 Jan 20 02:01 sys
drwxr-xr-x 2 root root 25 Jan 20 02:01 etc
```
which is clearly the pause container indicating that the `busybox` based container image is not exposed to the host.
### Using a Kata pod sandbox for testing with `agent-ctl` or `ctr shim`
Once you have a kata pod sandbox created as described above, either using
[`crictl`](#using-crictl-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image), or [Kubernetes](#using-kubernetes-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image)
, you can use this to test specific components of the Kata confidential
containers architecture. This can be useful for development and debugging to isolate and test features
that aren't broadly supported end-to-end. Here are some examples:
- In the first terminal run the pull image on guest command against the Kata agent, via the shim (`containerd-shim-kata-v2`).
This can be achieved using the [containerd](https://github.com/containerd/containerd) CLI tool, `ctr`, which can be used to
interact with the shim directly. The command takes the form
`ctr --namespace k8s.io shim --id <sandbox-id> pull-image <image> <new-container-id>` and can been run directly, or through
the `ccv0.sh` script to automatically fill in the variables:
- Optionally, set up some environment variables to set the image and credentials used:
- By default the shim pull image test in `ccv0.sh` will use the `busybox:1.33.1` based test image
`quay.io/kata-containers/confidential-containers:signed` which requires no authentication. To use a different
image, set the `PULL_IMAGE` environment variable e.g.
{"msg":"Command PullImage (1 of 1) returned (Ok(()), false)","level":"INFO","ts":"2021-09-15T08:40:43.828261708-07:00","subsystem":"rpc","pid":"830920","source":"kata-agent-ctl","version":"0.1.0","name":"kata-agent-ctl"}
```
> **Note**: The first time that `~/ccv0.sh agent_pull_image` is run, the `agent-ctl` tool will be built
which may take a few minutes.
- To validate that the image pull was successful, you can open a shell into the Kata guest with:
```bash
$ ~/ccv0.sh open_kata_shell
```
- Check the `/run/kata-containers/` directory to verify that the container image bundle has been created in a directory
named either `01234556789` (for the container id), or the container image name, e.g.
```bash
$ ls -ltr /run/kata-containers/confidential-containers_signed/
```
which should show something like
```
total 72
drwxr-xr-x 10 root root 200 Jan 1 1970 rootfs
-rw-r--r-- 1 root root 2977 Jan 20 16:45 config.json
```
- Leave the Kata shell by running:
```bash
$ exit
```
## Verifying signed images
For this sample demo, we use local attestation to pass through the required
configuration to do container image signature verification. Due to this, the ability to verify images is limited
to a pre-created selection of test images in our test
For pulling images not in this test repository (called an *unprotected* registry below), we fall back to the behaviour
of not enforcing signatures. More documentation on how to customise this to match your own containers through local,
or remote attestation will be available in future.
In our test repository there are three tagged images:
| Test Image | Base Image used | Signature status | GPG key status |
| --- | --- | --- | --- |
| `quay.io/kata-containers/confidential-containers:signed` | `busybox:1.33.1` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | [public key](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/public.gpg) embedded in kata rootfs |
| `quay.io/kata-containers/confidential-containers:unsigned` | `busybox:1.33.1` | not signed | not signed |
| `quay.io/kata-containers/confidential-containers:other_signed` | `nginx:1.21.3` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | GPG key not kept |
Using a standard unsigned `busybox` image that can be pulled from another, *unprotected*, `quay.io` repository we can
test a few scenarios.
In this sample, with local attestation, we pass in the the public GPG key and signature files, and the [`offline_fs_kbc`
- Again this results in an error message from `crictl`:
`"PullImage from image service failed" err="rpc error: code = Internal desc = Security validate failed: Validate image failed: The signatures do not satisfied! Reject reason: [signature verify failed! There is no pubkey can verify the signature!]" image="quay.io/kata-containers/confidential-containers:other_signed"`
### Using Kubernetes to create a Kata confidential containers pod from the encrypted ssh demo sample image
The [ssh-demo](https://github.com/confidential-containers/documentation/tree/main/demos/ssh-demo) explains how to
demonstrate creating a Kata confidential containers pod from an encrypted image with the runtime created by the
# A new way for Kata Containers to use Kinds of Block Volumes
> **Note:** This guide is only available for runtime-rs with default Hypervisor Dragonball.
> Now, other hypervisors are still ongoing, and it'll be updated when they're ready.
## Background
Currently, there is no widely applicable and convenient method available for users to use some kinds of backend storages, such as File on host based block volume, SPDK based volume or VFIO device based volume for Kata Containers, so we adopt [Proposal: Direct Block Device Assignment](https://github.com/kata-containers/kata-containers/blob/main/docs/design/direct-blk-device-assignment.md) to address it.
## Solution
According to the proposal, it requires to use the `kata-ctl direct-volume` command to add a direct assigned block volume device to the Kata Containers runtime.
And then with the help of method [get_volume_mount_info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L95), get information from JSON file: `(mountinfo.json)` and parse them into structure [Direct Volume Info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L70) which is used to save device-related information.
We only fill the `mountinfo.json`, such as `device` ,`volume_type`, `fs_type`, `metadata` and `options`, which correspond to the fields in [Direct Volume Info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L70), to describe a device.
The JSON file `mountinfo.json` placed in a sub-path `/kubelet/kata-test-vol-001/volume001` which under fixed path `/run/kata-containers/shared/direct-volumes/`.
And the full path looks like: `/run/kata-containers/shared/direct-volumes/kubelet/kata-test-vol-001/volume001`, But for some security reasons. it is
encoded as `/run/kata-containers/shared/direct-volumes/L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx`.
Finally, when running a Kata Containers with `ctr run --mount type=X, src=Y, dst=Z,,options=rbind:rw`, the `type=X` should be specified a proprietary type specifically designed for some kind of volume.
Now, supported types:
-`directvol` for direct volume
-`vfiovol` for VFIO device based volume
-`spdkvol` for SPDK/vhost-user based volume
## Setup Device and Run a Kata-Containers
### Direct Block Device Based Volume
#### create raw block based backend storage
> **Tips:** raw block based backend storage MUST be formatted with `mkfs`.
As `/run/kata-containers/shared/direct-volumes/` is a fixed path , we will be able to run a kata pod with `--mount` and set
`src` sub-path. And the `--mount` argument looks like: `--mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001`.
#### Run a Kata container with SPDK vhost-user block device
In the case, `ctr run --mount type=X, src=source, dst=dest`, the X will be set `spdkvol` which is a proprietary type specifically designed for SPDK volumes.
```bash
$ # ctr run with --mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001
To improve security, Kata Container supports running the VMM process (currently only QEMU) as a non-`root` user.
To improve security, Kata Container supports running the VMM process (QEMU and cloud-hypervisor) as a non-`root` user.
This document describes how to enable the rootless VMM mode and its limitations.
## Pre-requisites
@@ -27,7 +27,7 @@ Another necessary change is to move the hypervisor runtime files (e.g. `vhost-fs
## Limitations
1. Only the VMM process is running as a non-root user. Other processes such as Kata Container shimv2 and `virtiofsd` still run as the root user.
2. Currently, this feature is only supported in QEMU. Still need to bring it to Firecracker and Cloud Hypervisor (see https://github.com/kata-containers/kata-containers/issues/2567).
2. Currently, this feature is only supported in QEMU and cloud-hypervisor. For firecracker, you can use jailer to run the VMM process with a non-root user.
3. Certain features will not work when rootless VMM is enabled, including:
1. Passing devices to the guest (`virtio-blk`, `virtio-scsi`) will not work if the non-privileged user does not have permission to access it (leading to a permission denied error). A more permissive permission (e.g. 666) may overcome this issue. However, you need to be aware of the potential security implications of reducing the security on such devices.
2. `vfio` device will also not work because of permission denied error.
@@ -27,6 +27,8 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.runtime.internetworking_model` | string| determines how the VM should be connected to the container network interface. Valid values are `macvtap`, `tcfilter` and `none` |
| `io.katacontainers.config.runtime.sandbox_cgroup_only`| `boolean` | determines if Kata processes are managed only in sandbox cgroup |
| `io.katacontainers.config.runtime.enable_pprof` | `boolean` | enables Golang `pprof` for `containerd-shim-kata-v2` process |
| `io.katacontainers.config.runtime.image_request_timeout` | `uint64` | the timeout for pulling an image within the guest in `seconds`, default is `60` |
| `io.katacontainers.config.runtime.sealed_secret_enabled` | `boolean` | enables the sealed secret feature, default is `false` |
## Agent Options
| Key | Value Type | Comments |
@@ -94,6 +96,16 @@ There are several kinds of Kata configurations and they are listed below.
| `io.katacontainers.config.hypervisor.enable_guest_swap` | `boolean` | enable swap in the guest |
| `io.katacontainers.config.hypervisor.use_legacy_serial` | `boolean` | uses legacy serial device for guest's console (QEMU) |
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. |
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. |
| [Using snap](#snap-installation) | Easy to install | yes | Good alternative to official distro packages. |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. |
| [Manual](#manual-installation) | Follow a guide step-by-step to install a working system | **No!** | For those who want the latest release with more control. |
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. |
@@ -42,12 +41,6 @@ Kata packages are provided by official distribution repositories for:
| [CentOS](centos-installation-guide.md) | 8 |
| [Fedora](fedora-installation-guide.md) | 34 |
### Snap Installation
The snap installation is available for all distributions which support `snapd`.
[Use snap](snap-installation-guide.md) to install Kata Containers from https://snapcraft.io.
### Automatic Installation
[Use `kata-manager`](/utils/README.md) to automatically install a working Kata Containers system.
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. | Yes |
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. | No |
| [Using snap](#snap-installation) | Easy to install | yes | Good alternative to official distro packages. | No |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. | No |
| [Manual](#manual-installation) | Follow a guide step-by-step to install a working system | **No!** | For those who want the latest release with more control. | No |
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. | Yes |
@@ -36,8 +35,6 @@ architectures:
Follow the [`kata-deploy`](../../tools/packaging/kata-deploy/README.md).
### Official packages
`ToDo`
### Snap Installation
`ToDo`
### Automatic Installation
`ToDo`
### Manual Installation
@@ -49,14 +46,14 @@ Follow the [`kata-deploy`](../../tools/packaging/kata-deploy/README.md).
* Download `Rustup` and install `Rust`
> **Notes:**
> Rust version 1.62.0 is needed
> For Rust version, please set `RUST_VERSION` to the value of `languages.rust.meta.newest-version key` in [`versions.yaml`](../../versions.yaml) or, if `yq` is available on your system, run `export RUST_VERSION=$(yq read versions.yaml languages.rust.meta.newest-version)`.
Example for `x86_64`
```
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.