kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2026-03-01 18:22:12 +00:00

Author	SHA1	Message	Date
Markus Rudy	884b217acf	Revert "runtime: generate proto files" This reverts commit `f0c61dc217`.	2025-07-31 14:52:40 +02:00
Markus Rudy	aec350f40a	ci: add codegen to static-checks Signed-off-by: Markus Rudy <mr@edgeless.systems> Fixes: #11631 Co-authored-by: Steve Horsman <steven@uk.ibm.com>	2025-07-31 14:49:26 +02:00
Markus Rudy	418e4d4926	tools: add image for Go proto bindings In order to have a reproducible code generation process, we need to pin the versions of the tools used. This is accomplished easiest by generating inside a container. This commit adds a container image definition with fixed dependencies for Golang proto/ttrpc code generation, and changes the agent Makefile to invoke the update-generated-proto.sh script from within that container. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-31 14:35:07 +02:00
Markus Rudy	f0c61dc217	runtime: generate proto files The generated Go bindings for the agent are out of date. This commit was produced by running src/agent/src/libs/protocols/hack/update-generated-proto.sh with protobuf compiler versions matching those of the last run, according to the generated code comments. Since there are new RPC methods, those needed to be added to the HybridVSockTTRPCMockImp. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-07-30 10:06:38 +02:00
Saul Paredes	1aaaef2134	Merge pull request #11553 from microsoft/danmihai1/genpolicy-cleanup genpolicy: reduce complexity	2025-07-28 14:32:59 -07:00
Dan Mihai	c11c972465	genpolicy: config layer logging clean-up Use a simple debug!() for logging the config_layer string, instead of transcoding, etc. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	30bfa2dfcc	genpolicy: use CoCo settings by default - "confidential_emptyDir" becomes "emptyDir" in the settings file. - "confidential_configMap" becomes "configMap" in settings. - "mount_source_cpath" becomes "cpath". - The new "root_path" gets used instead of the old "cpath" to point to the container root path. - "confidential_guest" is no longer used. By default it gets replaced by "enable_configmap_secret_storages"=false, because CoCo is using CopyFileRequest instead of the Storage data structures for ConfigMap and/or Secret volume mounts during CreateContainerRequest. - The value of "guest_pull" becomes true by default. - "image_layer_verification" is no longer used - just CoCo's guest pull is supported. - The Request input files from unit tests are changing to reflect the new default settings values described above. - tests/integration/kubernetes/tests_common.sh adjusts the settings for platforms that are not set-up for CoCo during CI (i.e., platforms other than SNP, TDX, and CoCo Dev). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	94995d7102	genpolicy: skip pulling layers for guest-pull Skip pulling container image layers when guest-pull=true. The contents of these layers were ignored due to: - #11162, and - tarfs snapshotter support having been removed from genpolicy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:13 +00:00
Dan Mihai	f6016f4f36	genpolicy: remove tarfs snapshotter support AKS Confidential Containers are using the tarfs snapshotter. CoCo upstream doesn't use this snapshotter, so remove this Policy complexity from upstream. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-07-28 18:30:10 +00:00
Steve Horsman	077c59dd1f	Merge pull request #11385 from wainersm/ci_make_coco_nontee_required ci/gatekeeper: make run-k8s-tests-coco-nontee job required	2025-07-28 14:16:23 +01:00
Steve Horsman	74fba9c736	Merge pull request #11619 from kata-containers/install-dependencies-gh-cli ci: Try passing api token into github api call	2025-07-28 13:35:12 +01:00
Xuewei Niu	2a3c8b04df	Merge pull request #11613 from RuoqingHe/clippy-fix-for-libs-20250721 mem-agent: Ignore Cargo.lock	2025-07-28 17:45:29 +08:00
RuoqingHe	3f46347dc5	Merge pull request #11618 from RuoqingHe/fix-dragonball-default-build dragonball: Fix warnings in default build	2025-07-28 11:24:46 +08:00
Xuewei Niu	e5d5768c75	Merge pull request #11626 from RuoqingHe/bump-cloud-hypervisor-v47 versions: Upgrade to Cloud Hypervisor v47.0	2025-07-28 10:34:45 +08:00
Ruoqing He	4ca6c2d917	mem-agent: Ignore Cargo.lock `mem-agent` here is now a library and does not contain examples; ignore Cargo.lock to get rid of the untracked-file noise produced by `cargo run` or `cargo test`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-28 10:32:46 +08:00
Ruoqing He	3ec10b3721	runtime: clh: Re-generate client code against v47.0 Re-generates the client code against Cloud Hypervisor v47.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 20:44:14 +02:00
Ruoqing He	14e9d2c815	versions: Upgrade to Cloud Hypervisor v47.0 Details of v47.0 release can be found in our roadmap project as iteration v47.0: https://github.com/orgs/cloud-hypervisor/projects/6. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 20:42:24 +02:00
Xuewei Niu	6f6d64604f	Merge pull request #11598 from justxuewei/cgroups	2025-07-25 17:53:03 +08:00
Hyounggyu Choi	860779c4d9	Merge pull request #11621 from Apokleos/enhance-copyfile runtime-rs: Some extra work to enhance copyfile with sharedfs disabled	2025-07-25 11:27:03 +02:00
Ruoqing He	639273366a	dragonball: Gate `MmapRegion` behind `virtio-fs` `MmapRegion` is only used while `virtio-fs` is enabled during testing dragonball, gate the import behind `virtio-fs` feature. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:09:35 +00:00
Ruoqing He	2e81ac463a	dragonball: Allow unused to suppress warnings Some variables go unused if certain features are not enabled; use `#[allow(unused)]` to suppress those warnings for the time being. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	5f7da1ccaa	dragonball: Silence never read fields Some fields in structures used for testing purpose are never read, rename to send out the message. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	225e6fffbc	dragonball: Gate `VcpuManagerError` behind `host-device` `VcpuManagerError` is only needed when `host-device` feature is enabled, gate the import behind that feature. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:07:19 +00:00
Ruoqing He	0502b05718	dragonball: Remove `with-serde` feature assertion Code inside `test_mac_addr_serialization_and_deserialization` test does not actually require this `with-serde` feature to test, removing the assertion here to enable this test. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-25 09:05:55 +00:00
Xuewei Niu	60e3679eb7	runtime-rs: Add full cgroups support on host Add full cgroups support on host. Cgroups are managed by `FsManager` and `SystemdManager`. As the names imply, the `FsManager` manages cgroups through cgroupfs, while the `SystemdManager` manages cgroups through systemd. Both managers support cgroup v1 and cgroup v2. Two types of cgroups path are supported: 1. For colon paths, for example "foo.slice:bar:baz", the runtime manages cgroups by `SystemdManager`; 2. For relative/absolute paths, the runtime manages cgroups by `FsManager`. vCPU threads are added into the sandbox cgroups with cgroup v1 + cgroupfs. With the other combinations (cgroup v1 + systemd, cgroup v2 + cgroupfs, cgroup v2 + systemd), the VMM process is added into the cgroups. systemd doesn't provide a way to add a thread to a unit, so `add_thread()` in `SystemdManager` is equivalent to `add_process()`. Cgroup v2 supports threaded mode. However, threaded mode has to be enabled iteratively from the leaf node to the root node (`/`) [1]. This means the runtime would need to modify the cgroups created by the container runtime (e.g. containerd). Considering cgroupfs + cgroup v2 is not a common combination, its behavior is aligned with systemd + cgroup v2, which does not allow managing processes at the thread level. 1: https://www.kernel.org/doc/html/v4.18/admin-guide/cgroup-v2.html#threads Fixes: #11356 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-25 14:52:55 +08:00
alex.lyn	613dba6f1f	runtime-rs: Some extra work to enhance copyfile with sharedfs disabled For several reasons, the copyfile handling should first be aligned with runtime-go; this commit does that work. Fixes #11543 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-25 11:39:20 +08:00
Xuewei Niu	6aa3517393	tests: Prevent the shim from being killed in k8s-oom test The actual memory usage on the host is equal to the hypervisor memory usage plus the user memory usage. The OOM killer might kill the shim when the memory limit on the host is the same as that of the container and the container consumes all available memory. In this case, containerd never receives an OOM event, but gets a "task exit" event instead. That makes the `k8s-oom.bats` test fail. The fix is to add a new container to increase the sandbox memory limit. When the container "oom-test" is killed by the OOM killer, there is still available memory for the shim, so it will not be killed. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 23:44:21 +08:00
Steve Horsman	c762a3dd4f	Merge pull request #11372 from kata-containers/dependabot/cargo/src/dragonball/openssl-af8515b6e0 build(deps): bump the openssl group across 4 directories with 1 update	2025-07-24 13:27:24 +01:00
Fupan Li	fdbe549368	Merge pull request #11547 from Apokleos/virtio-scsi runtime-rs: support block device driver virtio-scsi within qemu-rs	2025-07-24 18:02:11 +08:00
Xuewei Niu	635272f3e8	runtime-rs: Ignore SIGTERM signal in shim When enabling systemd cgroup driver and sandbox cgroup only, the shim is under a systemd unit. When the unit is stopping, systemd sends SIGTERM to the shim. The shim can't exit immediately, as there are some cleanups to do. Therefore, ignoring SIGTERM is required here. The shim should complete the work within a period (Kata sets it to 300s by default). Once a timeout occurs, systemd will send SIGKILL. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 17:15:15 +08:00
Xuewei Niu	79f29bc523	runtime-rs: QEMU get_thread_ids() returns real vCPU's tids The information is obtained through QMP query_cpus_fast. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-24 17:15:15 +08:00
stevenhorsman	475baf95ad	ci: Try passing api token into github api call Our CI keeps on getting ``` jq: error (at <stdin>:1): Cannot index string with string "tag_name" ``` during the install dependencies phase, which I suspect might be due to github rate limits being reduced, so try to pass through the `GH_TOKEN` env and use it in the auth header. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-24 08:49:32 +01:00
alex.lyn	b40d65bc1b	runtime-rs: support block device driver virtio-scsi within qemu-rs It is important that we continue to support virtio-scsi. While virtio-blk is a common choice, virtio-scsi offers significant performance advantages in specific scenarios, particularly when utilizing iothreads and with NVMe Fabrics. By supporting both virtio-blk and virtio-scsi, we give users the flexibility to choose the optimal storage interface (virtio-blk or virtio-scsi) based on their specific workload requirements and hardware configurations. The virtio-scsi controller is already created when the QEMU VM starts with the block device driver set to `virtio-scsi`. This commit uses blockdev_add to add the backend block device and device_add to add the frontend virtio-scsi device via QMP. Fixes #11516 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 14:00:02 +08:00
alex.lyn	e683a7fd37	runtime-rs: Change the device_id with block device index The block device index is a very important unique id of a block device and identifies a block device just as device_id does. Because the index is required for calculating the SCSI LUN, and to reduce useless arguments when reusing `hotplug_block_device`, replace the device_id with the block device index. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	4521cae0c0	runtime-rs: Support AIO for hotplugging block device within qemu In this commit, block device AIO is introduced in hotplug_block_device within qemu via QMP, and "iouring" is set as the default. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	b4d276bc2b	runtime-rs: Handle virtio-scsi within device manager It should be handled correctly within the device manager when create_block_device is called and the driver_option is virtio-scsi. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	fbd84fd3f4	runtime-rs: Support virtio-scsi device within handle_block_volume It supports handling scsi device when block device driver is `scsi`. And it will ensure a correct storage source with LUN. Fixes #11516 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	57645c0786	runtime-rs: Add support for block device AIO In this commit, three block device AIO modes are introduced and "iouring" is set as the default. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	40e6aacc34	runtime-rs: Introduce scsi_addr within BlockConfig for SCSI devices It's used to help discover scsi devices inside guest and also add a new const value `KATA_SCSI_DEV_TYPE` to help pass information. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:57:00 +08:00
alex.lyn	125383e53c	runtime-rs: Add support for configurable block device aio AIO is the I/O mechanism used by qemu with options: - threads Pthread based disk I/O. - native Native Linux I/O. - io_uring (default mode) Linux io_uring API. This provides the fastest I/O operations on Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-24 11:56:52 +08:00
dependabot[bot]	ef9d960763	build(deps): bump the openssl group across 4 directories with 1 update Bumps the openssl group with 1 update in the /src/dragonball directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/runtime-rs directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/tools/genpolicy directory: [openssl](https://github.com/sfackler/rust-openssl). Bumps the openssl group with 1 update in the /src/tools/kata-ctl directory: [openssl](https://github.com/sfackler/rust-openssl). Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) Updates `openssl` from 0.10.72 to 0.10.73 - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.72...openssl-v0.10.73) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: openssl - dependency-name: openssl dependency-version: 0.10.73 dependency-type: indirect update-type: version-update:semver-patch dependency-group: openssl ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-23 15:17:12 +00:00
Fabiano Fidêncio	58925714d2	Merge pull request #11579 from Apokleos/fix-hotplug-blk runtime-rs: Support hotplugging host block devices within qemu-rs	2025-07-23 11:10:04 +02:00
alex.lyn	a12ae58431	runtime-rs: Support hotplugging host block devices within qemu-rs Although the previous implementation of hotplugging block devices via QMP can successfully hot-plug a regular-file-based block device, it fails when the backend is /dev/xxx (e.g. /dev/loop0). Analysis shows that it lacks the ability to hotplug host block devices. This commit fills the gap and makes hotplugging work for host block devices. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-22 15:40:03 +08:00
Fabiano Fidêncio	acae4480ac	Merge pull request #11604 from fidencio/release/3.19.1 release: Bump version to 3.19.1	2025-07-22 09:00:15 +02:00
Fabiano Fidêncio	0220b4d661	release: Bump version to 3.19.1 As there were a few moderate security vulnerability fixes missed as part of the 3.19.0 release. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-21 20:09:21 +02:00
Steve Horsman	09efcfbd86	Merge pull request #11606 from kata-containers/dependabot/cargo/src/tools/genpolicy/zerocopy-0.6.6 build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy	2025-07-21 18:58:56 +01:00
Steve Horsman	9f04d8e121	Merge pull request #11605 from kata-containers/dependabot/cargo/src/tools/kata-ctl/unsafe-libyaml-0.2.11 build(deps): bump unsafe-libyaml from 0.2.9 to 0.2.11 in /src/tools/kata-ctl	2025-07-21 18:50:01 +01:00
dependabot[bot]	a9c8377073	build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy --- updated-dependencies: - dependency-name: zerocopy dependency-version: 0.6.6 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-21 12:50:38 +00:00
dependabot[bot]	0b4c434ece	build(deps): bump unsafe-libyaml in /src/tools/kata-ctl Bumps [unsafe-libyaml](https://github.com/dtolnay/unsafe-libyaml) from 0.2.9 to 0.2.11. - [Release notes](https://github.com/dtolnay/unsafe-libyaml/releases) - [Commits](https://github.com/dtolnay/unsafe-libyaml/compare/0.2.9...0.2.11) --- updated-dependencies: - dependency-name: unsafe-libyaml dependency-version: 0.2.11 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-21 12:46:27 +00:00
Fabiano Fidêncio	35629d0690	Merge pull request #11603 from stevenhorsman/security-updates-21-jul dependencies: More crate bumps to resolve security issues	2025-07-21 14:33:07 +02:00
stevenhorsman	162ba19b85	agent-ctl: Bump rustls Bump rustls to >=0.23.18 to remediate RUSTSEC-2024-0399 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:41:59 +01:00
stevenhorsman	42339e9cdf	dragonball: Update url crate Update url to 2.5.4 to bump idna to 1.0.3 and remediate RUSTSEC-2024-0421 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:35:05 +01:00
stevenhorsman	1795361589	runk: Update rustjail Update the rustjail crate to pull in the latest security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:31:18 +01:00
stevenhorsman	28929f5b3e	runtime: Bump prometheus Bump this crate to remove the old version of protobuf and remediate RUSTSEC-2024-0437 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:29:57 +01:00
stevenhorsman	e66aa1ef8c	runtime: Bump prometheus and ttrpc-codegen Bump these crates to remove the old version of protobuf and remediate RUSTSEC-2024-0437 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-21 10:29:39 +01:00
Fabiano Fidêncio	d60513ece9	Merge pull request #11597 from kata-containers/topic/fix-release-static-tarball-content release: Copy the VERSION file to the tarball	2025-07-20 21:06:40 +02:00
Fabiano Fidêncio	55aae75ed7	shellcheck: Fix issues on kata-deploy-merge-builds.sh As we're already touching the file, let's get those fixed. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-20 09:33:50 +02:00
Fabiano Fidêncio	aaeb3b3221	release: Copy the VERSION file to the tarball For the release itself, let's simply copy the VERSION file to the tarball. To do so, we had to change the logic that merges the build, as at that point the tag is not yet pushed to the repo. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-20 00:06:14 +02:00
Fabiano Fidêncio	21ccaf4a80	Merge pull request #11596 from fidencio/release/v3.19.0 release: Bump version to 3.19.0	2025-07-19 18:27:36 +02:00
Fabiano Fidêncio	60f312b4ae	release: Bump version to 3.19.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-19 09:11:30 +02:00
Fabiano Fidêncio	1351ccb2de	Merge pull request #11576 from Tim-Zhang/update-protobuf-to-fix-CVE-2025-53605 chore: Update protobuf to fix CVE-2025-53605	2025-07-19 07:43:13 +02:00
Fabiano Fidêncio	7f5f032aca	runtime-rs: Update containerd-shim / containerd-shim-protos Let's bump those to their 0.10.0 releases, which contain fixes for the CVE-2025-53605. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-19 00:18:01 +02:00
Fabiano Fidêncio	6dc4c0faae	Merge pull request #11589 from fidencio/topic/fix-tdx-qemu-path-for-non-gpu qemu: tdx: Fix binary path for non-gpu TDX	2025-07-18 17:24:00 +02:00
Tim Zhang	2fe9df16cc	agent-ctl: update Cargo.lock to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/392 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:13:25 +02:00
Tim Zhang	45b44742de	genpolicy: update Cargo.lock to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/394 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:10:52 +02:00
Tim Zhang	fa9ff1b299	kata-ctl: update prometheus/protobuf to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/395 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:05:13 +02:00
Tim Zhang	d0e7a51f7b	dragonball: update prometheus/protobuf to fix CVE-2025-53605 Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/396 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh>	2025-07-18 16:02:29 +02:00
Tim Zhang	222393375a	agent: update ttrpc-codegen to remove dependency on protobuf v2 To fix CVE-2025-53605. Fixes: https://github.com/kata-containers/kata-containers/security/dependabot/397 Fixes: #11570 Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 16:02:07 +02:00
Fabiano Fidêncio	60c3d89767	Merge pull request #11558 from gmintoco/feature/helm-nodeSelector helm: add nodeSelector support to kata-deploy chart	2025-07-18 15:52:19 +02:00
Fabiano Fidêncio	3143787f69	qemu: tdx: Fix binary path for non-gpu TDX On commit `90bc749a19`, we've changed the QEMUTDXPATH in order to get it to work with GPUs, but the change broke the non-GPU TDX use-case, which depends on the distro binary. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 15:26:27 +02:00
Fabiano Fidêncio	497a3620c2	tests: Remove references to qemu-sev As it's been removed from our codebase. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 12:49:54 +02:00
Fabiano Fidêncio	17ce44083c	runtime: Remove reference to sev package Otherwise it'll just break static checks. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher	3b5cd2aad6	helm: remove qemu-sev references qemu-sev support has been removed, but those bits were left behind by mistake. Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>	2025-07-18 12:49:54 +02:00
Gus Minto-Cowcher	41d41d51f7	helm: add nodeSelector support to kata-deploy chart - Add nodeSelector configuration to values.yaml with empty default - Update DaemonSet template to conditionally include nodeSelector - Add documentation and examples for nodeSelector usage in README - Allows users to restrict kata-containers deployment to specific nodes by labeling them Signed-off-by: Gus Minto-Cowcher <gus@basecamp-research.com>	2025-07-18 12:49:54 +02:00
Fabiano Fidêncio	7d709a0759	Merge pull request #11493 from stevenhorsman/agent-ctl-tag-cache ci: cache: Tag agent-ctl cache	2025-07-18 12:12:46 +02:00
Fabiano Fidêncio	4a6c718f23	Merge pull request #11584 from zvonkok/fix-kernel-debug-enabled kernel: fix enable kernel debug	2025-07-18 11:38:36 +02:00
Sumedh Alok Sharma	47184e82f5	Merge pull request #11313 from Ankita13-code/ankitapareek/exec-id-agent-fix agent: update the processes hashmap to use exec_id as primary key	2025-07-18 14:07:15 +05:30
Fabiano Fidêncio	d9daddce28	Merge pull request #11578 from justxuewei/vsock-async runtime-rs: Fix the issue of blocking socket with Tokio	2025-07-18 10:13:03 +02:00
Xuewei Niu	629c942d4b	runtime-rs: Fix the issue of blocking socket with Tokio According to the issue [1], Tokio will panic when we are giving a blocking socket to Tokio's `from_std()` method, the information is as follows: ``` A panic occurred at crates/agent/src/sock/vsock.rs:59: Registering a blocking socket with the tokio runtime is unsupported. If you wish to do anyways, please add `--cfg tokio_allow_from_blocking_fd` to your RUSTFLAGS. ``` A workaround is to set the socket to non-blocking. 1: https://github.com/tokio-rs/tokio/issues/7172 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Xuewei Niu	1508e6f0f5	agent: Bump Tokio to v1.46.1 Tokio now has a newer version, let us bump it. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Xuewei Niu	5a4050660a	runtime-rs: Bump Tokio to v1.46.1 Tokio now has a newer version, let us bump it. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-07-18 10:55:48 +08:00
Zvonko Kaiser	a786dc48b0	kernel: fix enable kernel debug The KERNEL_DEBUG_ENABLED was missing in the outer shell script so overrides via make were not possible. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-18 02:24:19 +00:00
Fabiano Fidêncio	eb2bfbf7ac	Merge pull request #11572 from stevenhorsman/RUSTSEC-2024-0384-remediate More crate bumps for security remediations	2025-07-17 22:35:05 +02:00
Zvonko Kaiser	cef9485634	Merge pull request #11450 from kata-containers/dependabot/cargo/src/agent/nix-0.27.1 build(deps): bump nix to 0.26.4 in agent, libs, runtime-rs	2025-07-17 14:22:40 -04:00
stevenhorsman	41a608e5ce	tools: Bump borsh, liboci-cli and oci-spec Bump these crates to remove the unmaintained dependency proc-macro-error and remediate RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
stevenhorsman	e56f493191	deps: Bump zbus, serial_test & async-std Bump these crates across various components to remove the dependency on unmaintained instant crate and remediate RUSTSEC-2024-0384 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
stevenhorsman	bb820714cb	agent-ctl: Update borsh - Update borsh to remove the unmaintained dependency proc-macro-error and remediate RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 18:23:19 +01:00
Steve Horsman	549fd2a196	Merge pull request #11581 from stevenhorsman/osv-scanner-action-permissions-fix workflow: Fix osv-scanner action	2025-07-17 18:18:16 +01:00
stevenhorsman	a7e27b9b68	workflow: Fix osv-scanner action - The github generated template had an old version which isn't valid for the pr-scan, so update to the latest - The action needs also `actions: read` and `contents:read` to run in kata-containers Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 17:29:35 +01:00
Steve Horsman	8741f2ab3d	Merge pull request #11580 from kata-containers/osv-scanner-action workflow: Add osv-scanner action	2025-07-17 17:00:34 +01:00
stevenhorsman	1a75c12651	workflow: Add osv-scanner action Add action to check for vulnerabilities in the project and on each PR Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 16:41:56 +01:00
stevenhorsman	4c776167e5	trace-forwarder: Add nix features Some of the nix apis we are using are now enabled by features, so add these to resolve the compilation issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:09:21 +01:00
dependabot[bot]	cd79108c77	build(deps): bump nix in /src/tools/trace-forwarder Bumps [nix](https://github.com/nix-rust/nix) from 0.23.1 to 0.30.1. - [Changelog](https://github.com/nix-rust/nix/blob/master/CHANGELOG.md) - [Commits](https://github.com/nix-rust/nix/compare/v0.23.1...v0.30.1) --- updated-dependencies: - dependency-name: nix dependency-version: 0.30.1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-17 15:09:06 +01:00
stevenhorsman	9185ef1a67	runtime-rs: Bump nix to matching version runtime-rs needs the same version as libs, so sync this up as well. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:08:46 +01:00
dependabot[bot]	219ad505c2	build(deps): bump nix from 0.24.3 to 0.26.4 in /src/agent Nix needs to be in sync between libs and agent, so bump the agent to the libs version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-17 15:01:06 +01:00
dependabot[bot]	a4d22fe330	build(deps): bump nix from 0.24.2 to 0.26.4 in /src/libs --- updated-dependencies: - dependency-name: nix dependency-version: 0.26.4 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2025-07-17 15:01:06 +01:00
Fabiano Fidêncio	6dabb3683f	Merge pull request #10961 from zvonkok/shellcheck-zero shellcheck: fix kernel/build.sh	2025-07-17 12:59:00 +02:00
Steve Horsman	405f5283f0	Merge pull request #11573 from arvindskumar99/versions_comment OVMF: Making comment in versions.yaml for SEV-SNP	2025-07-17 10:11:58 +01:00
Fabiano Fidêncio	32d40849fa	Merge pull request #11577 from Xynnn007/bump-gc deps(chore): bump guest-components to candidate v0.14.0	2025-07-17 11:08:36 +02:00
Zvonko Kaiser	ca4f96ed00	shellcheck: fix kernel/build.sh Refactor code to make shellcheck happy Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-17 10:15:41 +02:00
Xynnn007	82b890349d	deps(chore): bump guest-components to candidate v0.14.0 This new version of gc fixes s390x attestation, also introduces registry configuration setting directly via initdata. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-17 10:19:02 +08:00
stevenhorsman	51f41b1669	ci: cache: Tag agent-ctl cache The peer pods project is using the agent-ctl tool in some tests, so tagging our cache will let them more easily identify development versions of kata for testing between releases. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-16 11:32:33 +01:00
Fupan Li	75d23b8884	Merge pull request #11504 from lifupan/fix_fd_leak agent: fix the issue of parent writer pipe fd leak	2025-07-16 18:29:24 +08:00
Fupan Li	83f54eec52	agent: fix the issue of parent writer pipe fd leak Sometimes, containers or execs do not use stdin, so there is no chance to add the parent stdin to the process's writer hashmap, resulting in the parent stdin's fd not being closed when the process is cleaned up later. Therefore, when creating a process, first explicitly add the parent stdin to the writer hashmap. This makes sure that the parent stdin's fd can be closed when the process is cleaned up later. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-16 16:15:31 +08:00
Fupan Li	752c8b611e	Merge pull request #11575 from Tim-Zhang/fix-runk-build runk: Fix build errors	2025-07-16 15:23:58 +08:00
Arvind Kumar	2a52351822	OVMF: Making comment in versions.yaml for SEV-SNP Adding comment to versions.yaml to indicate that the ovmf-sev is also used by AMD SEV-SNP, as per the discussion in https://github.com/kata-containers/kata-containers/pull/11561. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-16 06:35:21 +02:00
Tim Zhang	c8183a2c14	runk: rename imported crate from users to uzers To adapt the new crate name and fix build errors introduced in the commit `39f51b4c6d` Fixes: #11574 Signed-off-by: Tim Zhang <tim@hyper.sh>	2025-07-16 11:35:39 +08:00
Fabiano Fidêncio	9cebbab29d	Merge pull request #11335 from zvonkok/fix-kata-deploy.sh gpu: Fix kata deploy.sh	2025-07-15 19:50:44 +02:00
Fabiano Fidêncio	c8b7a51d72	Merge pull request #11082 from zvonkok/debug-kernel kernel: debug config	2025-07-15 19:04:15 +02:00
Zvonko Kaiser	c56c896fc6	qemu: remove the experimental suffix for qemu-snp We switched to vanilla QEMU for the CPU SNP use-case. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:49:58 +02:00
Zvonko Kaiser	a282fa6865	gpu: Add TDX related runtime adjustments We have the QEMU adjustments for SNP but missing those for TDX Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:49:56 +02:00
Zvonko Kaiser	0d2993dcfd	kernel: bump kernel version Obligatory kernel version bump Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:48:23 +02:00
Zvonko Kaiser	a4597672c0	kernel: Add KERNEL_DEBUG_ENABLED to build scripts We want to be able to build a debug version of the kernel for various use-cases like debugging, tracing and others. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-15 16:48:03 +02:00
Fabiano Fidêncio	b7af7f344b	Merge pull request #11569 from Xynnn007/bump-coco deps(chore): update guest-components and trustee	2025-07-15 16:34:23 +02:00
Fabiano Fidêncio	aac555eeff	Merge pull request #11571 from fidencio/topic/fix-nvidia-gpu-initrd-cache build: Fix cache for nvidia-gpu-initrd builds	2025-07-15 16:28:03 +02:00
Fabiano Fidêncio	4415a47fff	Merge pull request #11557 from Apokleos/fix-initdata runtime-rs: Fix initdata length field missing when create block	2025-07-15 16:22:45 +02:00
Fabiano Fidêncio	11c744c5c3	Merge pull request #11567 from zvonkok/remove-gpu-admin-tools Remove gpu admin tools	2025-07-15 15:11:56 +02:00
Fabiano Fidêncio	fa7598f6ec	Merge pull request #11568 from zvonkok/tdx-qemu-path gpu: Add proper TDX config path	2025-07-15 14:54:13 +02:00
Fabiano Fidêncio	3e86f3a95c	build: Rename rootfs-nvidia-* to fix cache issues The convention for rootfs-* names is: * rootfs-${image_type}-${special_build} If this is not followed, cache will never work as expected, leading to building the initrd / image on every single build, which is especially costly when building the nvidia specific targets. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-15 14:48:45 +02:00
alex.lyn	56c0c172fa	runtime-rs: Fix initdata length field missing when create block The init data could not be read properly within kata-agent because the data length field was omitted, a consequence of a mismatch in the data write format. Fixes #11556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-15 19:22:17 +08:00
Fabiano Fidêncio	b76efa2a25	Merge pull request #11564 from BbolroC/make-qemu-coco-dev-s390x-required ci: Make qemu-coco-dev for s390x (zVSI) required again	2025-07-15 12:04:18 +02:00
Xynnn007	4da31bf2f9	agent: deliver initdata toml to attestation agent AA now supports receiving the initdata toml plaintext and delivering it in the attestation. This patch creates a file under '/run/confidential-containers/initdata' to store the initdata toml and gives it to the AA process. When we have a separate component to handle initdata, we will move the logic to that component. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-15 17:10:56 +08:00
Steve Horsman	d219fc20e1	Merge pull request #11555 from stevenhorsman/rust-advisory-fixes-pre-3.19.0 Rust advisory fixes pre 3.19.0	2025-07-15 09:11:33 +01:00
Hui Zhu	3577e4bb43	Merge pull request #11480 from teawater/update_ma mem-agent: Update to https://github.com/teawater/mem-agent/tree/kata-20250627	2025-07-15 15:22:10 +08:00
Xynnn007	19001af1e2	deps(chore): update guest-components and trustee to the version of pre v0.14.0 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-07-15 09:12:47 +08:00
teawater	028f25ac84	mem-agent: Update to kata-20250627 Update to https://github.com/teawater/mem-agent/tree/kata-20250627. The commit list: 3854b3a Update nix version from 0.23.2 to 0.30.1 d9a4ced Update tokio version from 1.33 to 1.45.1 9115c4d run_eviction_single_config: Simplify check evicted pages after eviction 68b48d2 get_swappiness: Use a rounding method to obtain the swappiness value 14c4508 run_eviction_single_config: Add max_seq and min_seq check with each info 8a3a642 run_eviction_single_config: Move infov update to main loop b6d30cf memcg.rs: run_aging_single_config: Fix error of last_inc_time check 54fce7e memcg.rs: Update anon eviction code 41c31bf cgroup.rs: Fix build issue with musl 0d6aa77 Remove lazy_static from dependencies a66711d memcg.rs: update_and_add: Fix memcg not work after set memcg issue cb932b1 Add logs and change some level of some logs 93c7ad8 Add per-cgroup and per-numa config support 092a75b Remove all Cargo.lock to support different versions of rust 540bf04 Update mem-agent-srv, mem-agent-ctl and mem-agent-lib to v0.2.0 81f39b2 compact.rs: Change default value of compact_sec_max to 300 c455d47 compact.rs: Fix psi_path error with cgroup v2 issue 6016e86 misc.rs: Fix log error ded90e9 Set mem-agent-srv and mem-agent-ctl as bin Fixes: #11478 Signed-off-by: teawater <zhuhui@kylinos.cn>	2025-07-15 08:57:41 +08:00
Zvonko Kaiser	90bc749a19	gpu: Add proper TDX config path This was missed during the GPU TDX experimental enablement Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 23:26:28 +00:00
Zvonko Kaiser	da17b06d28	gpu: Pin toolkit version New versions have incompatibilities, pin toolkit to a working version Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 22:07:21 +00:00
Zvonko Kaiser	97a4a1574e	gpu: Remove gpu-admin-tools NVRC got a new feature reading the CC mode directly from register Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-14 21:59:31 +00:00
stevenhorsman	18597588c0	agent: Bump cdi version Bump cdi version to pick up the fixes for: - RUSTSEC-2025-0024 - RUSTSEC-2025-0023 - RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-14 16:54:30 +01:00
stevenhorsman	661d88b11f	versions: Bump oci-spec Try bumping oci-spec to 0.8.1 as it included fixes for vulnerabilities including RUSTSEC-2024-0370 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-14 16:54:30 +01:00
Fabiano Fidêncio	579d373623	Merge pull request #11521 from stevenhorsman/idna-1.0.4-bump versions: Bump idna crate to >= 1.0.3	2025-07-14 17:39:30 +02:00
Fabiano Fidêncio	f5decea13e	Merge pull request #11550 from stevenhorsman/runtime-rs-bump-chrono-0.4.41 runtime-rs \| trace-forwarder: Bump chrono crate version	2025-07-14 16:45:58 +02:00
Steve Horsman	0fa2cd8202	Merge pull request #11519 from wainersm/tests_teardown_common tests/k8s: instrument some tests for debugging	2025-07-14 13:20:01 +01:00
Hyounggyu Choi	a224b4f9e4	ci: Make qemu-coco-dev for s390x (zVSI) required again As the following job has passed 10 days in a row for the nightly test: ``` kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (nydus, qemu-coco-dev, kubeadm) ``` this commit makes the job required again. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-07-14 11:03:54 +02:00
Wainer dos Santos Moschetta	f0f1974e14	tests/k8s: call teardown_common in k8s-parallel.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	8dfeed77cd	tests/k8s: add handler for Job in set_node() Set the node in the spec template of a Job manifest, allowing to use set_node() on tests like k8s-parallel.bats Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	806d63d1d8	tests/k8s: call teardown_common in k8s-credentials-secrets.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Wainer dos Santos Moschetta	c8f40fe12c	tests/k8s: call teardown_common in k8s-sandbox-vcpus-allocation.bats The teardown_common will print the description of the running pods, kill them all and print the system's syslogs afterwards. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-12 10:13:51 +01:00
Fabiano Fidêncio	4a79c2520d	Merge pull request #11491 from Apokleos/default-blk-driver runtime-rs: Change default block device driver from virtio-scsi to virtio-blk-*	2025-07-11 23:14:13 +02:00
alex.lyn	9cc14e4908	runtime-rs: Update block device driver docs within configuration The previous description for the `block_device_driver` was inaccurate or outdated. This commit updates the documentation to provide a more precise explanation of its function. Fixes #11488 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-11 17:40:58 +02:00
alex.lyn	92160c82ff	runtime-rs: Change block device driver default to virtio-blk-* When we run a kata pod with runtime-rs/qemu and a default configuration toml, it fails with the error "unsupported driver type virtio-scsi". As virtio-scsi within runtime-rs is not so popular, we set the default block device driver to `virtio-blk-*`. Fixes #11488 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-11 17:40:58 +02:00
Ankita Pareek	5f08cc75b3	agent: update the processes hashmap to use exec_id as primary key This patch changes the container process HashMap to use exec_id as the primary key instead of PID, preventing exec_id collisions that could be exploited in Confidential Computing scenarios where the host is less trusted than the guest. Key changes: - Changed `processes: HashMap<pid_t, Process>` to `HashMap<String, Process>` - Added exec_id collision detection in `start()` method - Updated process lookup operations to use exec_id directly - Simplified `get_process()` with direct HashMap access This prevents multiple exec operations from reusing the same exec_id, which could be problematic in CoCo use cases where process isolation and unique identification are critical for security. Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-07-11 10:10:23 +00:00
Steve Horsman	878e50f978	Merge pull request #11554 from fidencio/topic/fix-version-file-on-release gh: Fix released VERSION file	2025-07-11 09:20:06 +01:00
Fabiano Fidêncio	fb22e873cd	gh: Fix released VERSION file The `/opt/kata/VERSION` file, which is created using `git describe --tags`, requires the newly released tag to be updated in order to be accurate. To do so, let's add a `fetch-tags: true` to the checkout action used during the `create-kata-tarball` job. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-11 09:47:11 +02:00
Alex Lyn	87e41e2a09	Merge pull request #11549 from stevenhorsman/bump-remove_dir_all runtime-rs: Switch tempdir to tempfile	2025-07-11 13:46:12 +08:00
Alex Lyn	f22272b8f7	Merge pull request #11540 from Apokleos/coldplug-vfio-clh runtime-rs: Add vfio support with coldplug for cloud-hypervisor	2025-07-11 10:33:59 +08:00
RuoqingHe	7cd4e3278a	Merge pull request #11545 from RuoqingHe/remove-lockfile-for-libs libs: Remove lockfile for libs	2025-07-10 21:56:10 +08:00
stevenhorsman	c740896b1c	trace-forwarder: Bump chrono crate version Bump chrono version to drop time@0.1.43 and remediate vulnerability CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 14:55:20 +01:00
stevenhorsman	3916507553	runtime-rs: Bump chrono crate version Bump chrono version to drop time@0.1.45 and remediate vulnerability CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 13:47:05 +01:00
Wainer dos Santos Moschetta	3ab6a8462d	ci/gatekeeper: make run-k8s-tests-coco-nontee job required The CoCo non-TEE job (run-k8s-tests-coco-nontee) used to be required but we had to withdraw it to fix a problem (#11156). Now the job is back running and stable, so time to make it required again. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-10 12:19:19 +01:00
stevenhorsman	c5ceae887b	runtime-rs: Switch tempdir to tempfile tempdir hasn't been updated for seven years and pulls in remove_dir_all@0.5.3 which has security advisory GHSA-mc8h-8q98-g5hr, so replace this with using tempfile, which the crate got merged into and we use elsewhere in the project Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-10 12:16:35 +01:00
Ruoqing He	4039506740	libs: Ignore Cargo.lock in libs workspace Ignore Cargo.lock in `libs` to prevent developers from accidentally tracking lock files in the `libs` workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-10 09:31:45 +00:00
alex.lyn	3fbe493edc	runtime-rs: Convert host devices within VmConfig for cloud-hypervisor This PR adds support for adding a network device before starting the cloud-hypervisor VM. This commit gets the host devices from NamedHypervisorConfig and assigns them to VmConfig's devices, which is used for vfio devices when clh starts launching. With this, it completes the vfio devices conversion from a generic Hypervisor config to a clh specific VmConfig. Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:33:43 +08:00
alex.lyn	0b5b8f549d	runtime-rs: Introduce a field host_devices within NamedHypervisorConfig This commit introduces `host_devices` to help convert vfio devices from a generic hypervisor config to a cloud-hypervisor specific VmConfig. Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:33:41 +08:00
alex.lyn	d37183d754	runtime-rs: Add vfio support with coldplug for cloud-hypervisor This PR adds support for adding a vfio device before starting the cloud-hypervisor VM (or cold-plug vfio device). This commit changes "pending_devices" for clh implementation via adding DeviceType::Vfio() into pending_devices. And it will get shared host devices after correctly handling vfio devices (Specially for primary device). Signed-off-by: alex.lyn <alex.lyn@antgroup.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-10 16:32:21 +08:00
Ruoqing He	ffa3a5a15e	libs: Remove Cargo.lock crates in `libs` workspace do not ship binaries, they are just libraries for other workspace to reference, the `Cargo.lock` file hence would not take effect. Removing Cargo.lock for `libs` workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-07-10 03:14:55 +00:00
Fabiano Fidêncio	c68eb58f3f	Merge pull request #11529 from fidencio/topic/only-use-fixed-version-of-k0s-for-crio tests: k0s: Always use latest version, apart from CRI-O tests	2025-07-09 18:47:18 +02:00
Hyounggyu Choi	09297b7955	Merge pull request #11537 from BbolroC/set-sharedfs-to-none-for-ibm-sel runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file	2025-07-09 18:30:08 +02:00
Hyounggyu Choi	bca31d5a4d	runtime/runtime-rs: Set shared_fs to none for IBM SEL in config file In line with configuration for other TEEs, shared_fs should be set to none for IBM SEL. This commit updates the value for runtime/runtime-rs. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-07-09 14:22:28 +02:00
Fabiano Fidêncio	5f17e61d11	tests: kata-deploy: Remove --wait from helm uninstall As we're using a `kubectl wait --timeout ...` to check whether the kata-deploy pod's been deleted or not, let's remove the `--wait` from the `helm uninstall ...` call as k0s tests were failing because the `kubectl wait --timeout...` was starting after the pod was deleted, making the test fail. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-09 14:01:30 +02:00
Fabiano Fidêncio	842e17b756	tests: k0s: Always use latest version, apart from CRI-O tests We've been pinning a specific version of k0s for CRI-O tests, which may make sense for CRI-O, but doesn't make sense at all when it comes to testing that we can install kata-deploy on latest k0s (and currently our test for that is broken). Let's bump to the latest, and from this point we start debugging, instead of debugging on an ancient version of the project. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-09 13:27:18 +02:00
Steve Horsman	7bc25b0259	Merge pull request #11494 from katexochen/p/opa-1.6 versions: bump opa 1.5.1 -> 1.6.0	2025-07-09 11:45:54 +01:00
Steve Horsman	967f66f677	Merge pull request #11380 from arvindskumar99/sev-deprecation Sev deprecation	2025-07-09 11:38:13 +01:00
stevenhorsman	f96b8fb690	kata-ctl: Update expected test failure message Update expected error after url crate bump Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-09 11:34:27 +01:00
stevenhorsman	b7bf46fdfa	versions: Bump idna crate to >= 1.0.4 Bump url, reqwest and idna crates in order to move away from idna <1.0.3 and remediate CVE-2024-12224. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-09 11:34:27 +01:00
Xuewei Niu	b8838140d0	Merge pull request #11527 from StevenFryto/fix-runtime-rootless-bugs runtime: Fix rootlessDir not correctly set in rootless VMM mode	2025-07-09 16:40:11 +08:00
Steve Horsman	990c4e68ee	Merge pull request #11523 from wainersm/ci_setup_kubectl workflows: adopting azure/setup-kubectl	2025-07-09 09:09:38 +01:00
stevenfryto	3c7a670129	runtime: Fix rootlessDir not correctly set in rootless VMM mode Previously, the rootlessDir variable in `src/runtime/virtcontainers/pkg/rootless.go` was initialized at package load time using `os.Getenv("XDG_RUNTIME_DIR")`. However, in rootless VMM mode, the correct value of $XDG_RUNTIME_DIR is set later during runtime using os.Setenv(), so rootlessDir remained empty. This patch defers the initialization of rootlessDir until the first call to `GetRootlessDir()`, ensuring it always reflects the current environment value of $XDG_RUNTIME_DIR. Fixes: #11526 Signed-off-by: stevenfryto <sunzitai_1832@bupt.edu.cn>	2025-07-09 09:51:48 +08:00
Wainer dos Santos Moschetta	e4da3b84a3	workflows: adopting azure/setup-kubectl There are workflows that rely on `az aks install-cli` to get kubectl installed. There is a well-known problem with install-cli, related to the API usage rate limit, that has recently caused the command to fail quite often. This replaces install-cli with the azure/setup-kubectl github action, which has no such rate limit problem. While here, removed the install_cli() function from gha-run-k8s-common.sh to avoid developers using it by mistake in the future. Fixes #11463 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-07-08 15:15:54 -03:00
Alex Lyn	294b2c1c10	Merge pull request #11528 from Apokleos/remote-initdata runtime-rs: add initdata annotation for remote hypervisor	2025-07-08 09:13:13 +08:00
Arvind Kumar	afedad0965	kernel: Removing SEV kernel packages Removing kernel config files relating to SEV as part of the SEV deprecation efforts. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:21:11 -05:00
Arvind Kumar	ecac3d2d28	runtime: Removing runtime logic for SEV Removing runtime SEV functionality, such as the kbs, ovmf, VMSA handling, and SEV configs as part of deprecating SEV from kata. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Arvind Kumar	8eebcef8fb	tests: Removing testing framework for SEV Removing files pertaining to SEV from the CI framework. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Arvind Kumar	675ea86aba	kata-deploy: Removing SEV from kata-deploy Removing files related to SEV, responsible for installing and configuring Kata containers. Co-authored-by: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-07-07 11:17:32 -05:00
Paul Meyer	ff7ac58579	versions: bump opa 1.5.1 -> 1.6.0 Bumping opa to latest release. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-07-07 14:19:08 +02:00
alex.lyn	fcaade24f4	runtime-rs: add initdata annotation for remote hypervisor Add init data annotation within preparing remote hypervisor annotations when prepare vm, so that it can be passed within CreateVMRequest. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-07 12:46:05 +01:00
Fabiano Fidêncio	110f68a0f1	Merge pull request #11530 from fidencio/topic/tests-fix-runtime-class-check tests: runtimeclasses: Adjust gpu runtimeclasses	2025-07-07 13:42:46 +02:00
Fabiano Fidêncio	2c2995b7b0	tests: runtimeclasses: Adjust gpu runtimeclasses `679cc9d47c` was merged and bumped the podoverhead for the gpu related runtimeclasses. However, the bump on the `kata-runtimeClasses.yaml` was overlooked, making our tests fail due to that discrepancy. Let's just adjust the values here and move on. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-07 11:43:40 +02:00
Fabiano Fidêncio	ef545eed86	Merge pull request #11513 from lifupan/dragonball_6.12.x tools: port the dragonball kernel patch to 6.12.x	2025-07-07 10:31:49 +02:00
Steve Horsman	d291e9bda0	Merge pull request #11336 from zvonkok/fix-podoverhead gpu: Update runtimeClasses for correct podoverhead	2025-07-07 09:20:07 +01:00
Fabiano Fidêncio	a2faf93211	kernel: Bump to v6.12.36 As that's the latest released LTS. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-07-06 23:48:20 +02:00
Fupan Li	fd21c9df59	tools: port the dragonball kernel patch to 6.12.x Backport the dragonball's kernel patches to 6.12.x kernel version. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-06 23:48:20 +02:00
Zvonko Kaiser	679cc9d47c	gpu: Update runtimeClasses for correct podoverhead We cannot rely only on default_cpu and default_memory in the config, default is 1 and 2Gi but we need some overhead for QEMU and the other related binaries running as the pod overhead. Especially when QEMU is hot-plugging GPUs, CPUs, and memory it can consume more memory. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-04 12:20:15 -04:00
Steve Horsman	1c718dbcdd	Merge pull request #11506 from stevenhorsman/remove-atty-dependency Remove atty dependency	2025-07-04 10:46:28 +01:00
Alex Lyn	362ea54763	Merge pull request #11517 from zvonkok/fix-nvrc-build gpu: NVRC static build	2025-07-04 13:51:03 +08:00
Alex Lyn	2e35a8067d	Merge pull request #11482 from Apokleos/fix-force-guestpull runtime-rs: refactor and fix the implementation of guest-pull	2025-07-04 11:29:33 +08:00
stevenhorsman	6f23608e96	ci: Remove atty group atty is unmaintained, with the last release almost 3 years ago, so we don't need to check for updates, but instead will remove it from our dependency tree. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
stevenhorsman	7ffbdf7b3a	mem-agent: Remove structopts crate structopt features were integrated into clap v3, so it is no longer actively updated, and it pulls in the atty crate, which has a security advisory. Update clap, remove structopts, and update the code that used it to remove the outdated dependencies. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
stevenhorsman	7845129bdc	versions: Bump slog-term to 2.9.1 slog-term 2.9.0 included atty, which is unmaintained and has a security advisory GHSA-g98v-hv3f-hcfr, so bump the version across our components to remove this dependency. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-07-04 09:43:34 +08:00
Aurélien Bombo	fe532f9d04	Merge pull request #11475 from kata-containers/sprt/zizmor-fixes security: ci: Fixes for Zizmor GHA security scanning	2025-07-03 13:29:47 -05:00
Zvonko Kaiser	c3b2d69452	gpu: NVRC static build We had the proper config.toml configuration for static builds but were building the glibc target and not the musl target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-07-03 15:31:00 +00:00
Aurélien Bombo	8723eedad2	gha: Remove path restriction for Zizmor workflow The way GH works, we can only require Zizmor results on ALL PR runs, or none, so remove the path filter. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-03 08:18:34 -05:00
Alex Lyn	c857f59a1a	Merge pull request #11510 from lifupan/sync_resize_vcpu runtime-rs: make the resize_vcpu api support sync	2025-07-03 17:35:08 +08:00
alex.lyn	2b95facc6f	kata-type: Relax Mandatory source Field Check in Guest-Pull Mode Previously, the source field was subject to mandatory checks. However, in guest-pull mode, this field doesn't consistently provide useful information. Our practical experience has shown that relying on this field for critical data isn't always necessary. In addition, not all cases need a mandatory check for KataVirtualVolume. Based on this, it is better to make from_base64 do only one thing and remove the validate() call. We also keep the previous capability, for callers that may rely on it, under the clearer name from_base64_and_validate. This commit relaxes the mandatory checks on the KataVirtualVolume specifically for guest-pull mode, acknowledging its diminished utility in this context. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 17:07:20 +08:00
alex.lyn	8f8b196705	runtime-rs: refactor merging metadata within image_pull refactor implementation for merging metadata. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 17:07:08 +08:00
Fupan Li	fb1c35335a	runtime-rs: make the resize_vcpu sync When hot plugging vcpu in dragonball hypervisor, use the synchronization interface and wait until the hot plug cpu is executed in the guest before returning. This ensures that the subsequent device hot plug will not conflict with the previous call. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-03 15:11:36 +08:00
Fupan Li	72a38457f0	dragonball: make the resize_vcpu api support sync Let dragonball's resize_vcpu api support synchronization, and only return after the hot-plug of the CPU is successfully executed in the guest kernel. This ensures that the subsequent device hot-plug operation can also proceed smoothly. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-07-03 15:11:36 +08:00
Alex Lyn	210844ce6b	Merge pull request #11509 from teawater/agent_test kata-agent: mount.rs: Fix warning of test	2025-07-03 15:05:04 +08:00
Alex Lyn	95d513b379	Merge pull request #11423 from zhaodiaoer/test test: fix broken testing code in libs	2025-07-03 11:15:39 +08:00
teawater	0347698c59	kata-agent: mount.rs: Fix warning of test Got the following warning with make test of kata-agent: Compiling rustjail v0.1.0 (/data/teawater/kata-containers/src/agent/rustjail) Compiling kata-agent v0.1.0 (/data/teawater/kata-containers/src/agent) warning: unused import: `std::os::unix::fs` --> rustjail/src/mount.rs:1147:9 \| 1147 \| use std::os::unix::fs; \| ^^^^^^^^^^^^^^^^^ \| = note: `#[warn(unused_imports)]` on by default This commit fixes it. Fixes: #11508 Signed-off-by: teawater <zhuhui@kylinos.cn>	2025-07-03 10:01:19 +08:00
alex.lyn	7a59d7f937	runtime-rs: Import the public const value from libs The const value `KATA_VIRTUAL_VOLUME_PREFIX` is defined in libs/kata-types, so it is better to import it from there. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-03 09:42:17 +08:00
Aurélien Bombo	8d86bcea4b	Merge pull request #11499 from kata-containers/sprt/fix-commit-check gha: Eliminate use of force-skip-ci label	2025-07-02 10:53:55 -05:00
Aurélien Bombo	8d7d859e30	gha: Eliminate use of force-skip-ci label This was originally implemented as a Jenkins skip and is only used in a few workflows. Nowadays this would be better implemented via the gatekeeper. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-07-02 10:29:50 -05:00
Saul Paredes	e7b9eddced	Merge pull request #11248 from microsoft/archana1/storages genpolicy: add validation for storages	2025-07-01 10:02:10 -07:00
Fabiano Fidêncio	07b41c88de	Merge pull request #11490 from Apokleos/fix-noise runtime-rs: Fix noise with frequently appearing in unstaged changes	2025-07-01 17:43:41 +02:00
Archana Choudhary	6932beb01f	policy: fix parse errors in rules.rego This patch fixes the rules.rego file to ensure that the policy is correctly parsed and applied by opa. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 12:43:41 +00:00
Archana Choudhary	abbe1be69f	tests: enable confidential_guest setting for coco This commit updates the `tests_common.sh` script to enable the `confidential_guest` setting for the coco tests in the Kubernetes integration tests. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	9dd365fdb5	genpolicy: fix mount source check in rules.rego This commit fixes the mount source check in rules.rego. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	1cbea890f1	genpolicy: tests: update testcases for execprocess This patch removes storages from the testcases.json file for execprocess. This is because input storage objects are invalid for two reasons: 1. "io.katacontainers.fs-opt.layer=" is missing option in annotations. 2. by default, we don't have host-tarfs-dm-verity enabled, so the storage objects are not created in policy. Signed-off-by: Archana Choudhary <archana1@microsoft.com> ---	2025-07-01 10:35:20 +00:00
Archana Choudhary	6adec0737c	genpolicy: add rules for image_guest_pull storage This patch introduces some basic checks for the `image_guest_pull` storage type in the genpolicy tool. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	bd2dc1422e	genpolicy: add test for container images having volumes This patch adds a test case to genpolicy for container images that have volumes. Examples of such container images include: - quay.io/opstree/redis - https://github.com/kubernetes/examples/blob/master/cassandra/image/Dockerfile Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	d7f998fbd5	genpolicy: tests: update test for emptydir volumes This patch - updates testcases.json for emptydir volumes/storages Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	68c8c31718	genpolicy: tests: add test for config_map volumes This patch adds test for config_map volumes. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	9ebbc08d70	genpolicy: enable storage checks This patch - adds condition to add container image layers as storages - enable storage checks - fix CI policy test cases - update genpolicy-settings.json to enable storage checks - remove storage object addition in container image parsing Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Archana Choudhary	5b1459e623	genpolicy: test framework: enable config map usage This patch improves the test framework for the genpolicy tool by enabling the use of config maps. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-07-01 10:35:20 +00:00
Alex Lyn	8784cebb84	Merge pull request #10693 from Apokleos/guest-pullimage-timeout runtime-rs: support setting create_container timeout with request_timeout_ms for image pulling in guest	2025-07-01 11:40:19 +08:00
alex.lyn	b7c1d04a47	runtime-rs: Fix noise with frequently appearing in unstaged changes Fixes #11489 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-07-01 10:19:02 +08:00
alex.lyn	9839c17cad	build: add Makefile variable for create_container_timeout Add the definition of variable DEFCREATECONTAINERTIMEOUT into Makefile target with default timeout 30s. Fixes: #485 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	1a06bd1f08	kata-types: Introduce annotation *_RUNTIME_CREATE_CONTAINTER_TIMEOUT It's used to indicate timeout value set for image pulling in guest during creating container. This allows users to set this timeout with annotation according to the size of image to be pulled. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	f886e82f03	runtime-rs: support setting create_container_timeout It allows users to set this create container timeout within configuration.toml according to the size of image to be pulled inside guest. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
alex.lyn	ce524a3958	kata-types: Give a more comprehensive definition of request_timeout_ms To better understand the impact of different timeout values on system behavior, this section provides a more comprehensive explanation of request_timeout_ms: This timeout value is used to set the maximum duration for the agent to process a CreateContainerRequest. It's also used to ensure that workloads, especially those involving large image pulls within the guest, have sufficient time to complete. Based on the explanation above, it is renamed to `create_container_timeout` and exposed in 'configuration.toml'. Fixes #10692 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-30 20:04:56 +08:00
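The timeout commits above describe three sources for the value: a Pod annotation, `configuration.toml`, and a Makefile default of 30s. A minimal sketch of how such a precedence could be resolved follows. The function name, the precedence order (annotation, then configuration, then default), and the use of seconds as the unit are assumptions for illustration, not the runtime-rs implementation.

```rust
use std::time::Duration;

/// Built-in fallback; 30 seconds mirrors the Makefile default mentioned above.
const DEFAULT_CREATE_CONTAINER_TIMEOUT_SECS: u64 = 30;

/// Hypothetical helper: pick the effective create-container timeout.
/// Assumed precedence: a valid annotation value wins over the configured
/// value, which wins over the built-in default. Values are assumed to be
/// whole seconds; zero or unparsable values are ignored.
fn effective_create_container_timeout(
    annotation: Option<&str>,
    configured_secs: Option<u64>,
) -> Duration {
    let from_annotation = annotation
        .and_then(|v| v.trim().parse::<u64>().ok())
        .filter(|secs| *secs > 0);
    let secs = from_annotation
        .or(configured_secs.filter(|secs| *secs > 0))
        .unwrap_or(DEFAULT_CREATE_CONTAINER_TIMEOUT_SECS);
    Duration::from_secs(secs)
}
```

With this sketch, a large guest-pulled image could be given more time per Pod through the annotation, while other Pods keep the configured or default value.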
Steve Horsman	f04bb3f34c	Merge pull request #11479 from stevenhorsman/skip-weekly-coco-stability-tests workflows: Skip weekly coco stability tests	2025-06-30 09:05:14 +01:00
Fabiano Fidêncio	b024d8737c	Merge pull request #11481 from fidencio/topic/fix-passing-image-size-alignment build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var	2025-06-30 09:04:39 +02:00
Alex Lyn	69d2c078d1	Merge pull request #11484 from stevenhorsman/bump-nydus-snapshotter-0.15.2 version: Bump nydus-snapshotter	2025-06-30 14:44:01 +08:00
Alex Lyn	e66baf503b	Merge pull request #11474 from Apokleos/remote-annotation runtime-rs: Add GPU annotations for remote hypervisor	2025-06-30 14:05:15 +08:00
Fabiano Fidêncio	8d4e3b47b1	Merge pull request #11470 from fidencio/topic/runtime-rs-fix-odd-memory-size-calculation runtime-rs: Fix calculation of odd memory sizes	2025-06-30 07:26:30 +02:00
Champ-Goblem	91cadb7bfe	runtime-rs: Fix calculation of odd memory sizes An odd memory size leads to the runtime breaking during its startup, as shown below: ``` Warning FailedCreatePodSandBox 34s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox "708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9": failed to create containerd task: failed to create shim task: Others("failed to handle message start sandbox in task handler\n\nCaused by:\n 0: start vm\n 1: set vm base config\n 2: set vm configuration\n 3: Failed to set vm configuration VmConfigInfo { vcpu_count: 2, max_vcpu_count: 16, cpu_pm: \"on\", cpu_topology: CpuTopology { threads_per_core: 1, cores_per_die: 1, dies_per_socket: 1, sockets: 1 }, vpmu_feature: 0, mem_type: \"shmem\", mem_file_path: \"\", mem_size_mib: 4513, serial_path: Some(\"/run/kata/708c81910f4e67e53b4170b6615083339b220154cb9a0c521b3232cdb40d50f9/console.sock\"), pci_hotplug_enabled: true }\n 4: vmm action error: MachineConfig(InvalidMemorySize(4513))\n\nStack backtrace:\n 0: anyhow::error::<impl anyhow::Error>::msg\n 1: hypervisor::dragonball::vmm_instance::VmmInstance::handle_request\n 2: hypervisor::dragonball::vmm_instance::VmmInstance::set_vm_configuration\n 3: hypervisor::dragonball::inner::DragonballInner::set_vm_base_config\n 4: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}::{{closure}}\n 5: <hypervisor::dragonball::Dragonball as hypervisor::Hypervisor>::start_vm::{{closure}}\n 6: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}::{{closure}}\n 7: <virt_container::sandbox::VirtSandbox as common::sandbox::Sandbox>::start::{{closure}}\n 8: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}::{{closure}}\n 9: runtimes::manager::RuntimeHandlerManager::handler_task_message::{{closure}}\n 10: <service::task_service::TaskService as 
containerd_shim_protos::shim::shim_ttrpc_async::Task>::create::{{closure}}\n 11: <containerd_shim_protos::shim::shim_ttrpc_async::CreateMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}\n 12: <tokio::time::timeout::Timeout<T> as core::future::future::Future>::poll\n 13: ttrpc::asynchronous::server::HandlerContext::handle_msg::{{closure}}\n 14: <core::future::poll_fn::PollFn<F> as core::future::future::Future>::poll\n 15: <ttrpc::asynchronous::server::ServerReader as ttrpc::asynchronous::connection::ReaderDelegate>::handle_msg::{{closure}}::{{closure}}\n 16: tokio::runtime::task::core::Core<T,S>::poll\n 17: tokio::runtime::task::harness::Harness<T,S>::poll\n 18: tokio::runtime::scheduler::multi_thread::worker::Context::run_task\n 19: tokio::runtime::scheduler::multi_thread::worker::Context::run\n 20: tokio::runtime::context::runtime::enter_runtime\n 21: tokio::runtime::scheduler::multi_thread::worker::run\n 22: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll\n 23: tokio::runtime::task::core::Core<T,S>::poll\n 24: tokio::runtime::task::harness::Harness<T,S>::poll\n 25: tokio::runtime::blocking::pool::Inner::run\n 26: std::sys::backtrace::__rust_begin_short_backtrace\n 27: core::ops::function::FnOnce::call_once{{vtable.shim}}\n 28: std::sys::pal::unix::thread::Thread::new::thread_start") ``` As we cannot control what the users will set, let's just round it up to the next acceptable value. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 14:29:18 +02:00
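The fix described in this commit rounds an unacceptable memory size up to the next acceptable value. A minimal sketch of that rounding follows. The function name is hypothetical, and the alignment is left as a parameter because the log only shows that the odd value 4513 MiB was rejected; an alignment of 2 MiB is an assumption.

```rust
/// Hypothetical helper: round `mem_size_mib` up to the next multiple of
/// `align_mib`. Returns `None` if `align_mib` is zero or the rounded value
/// would overflow `u64`.
fn round_up_mem_size_mib(mem_size_mib: u64, align_mib: u64) -> Option<u64> {
    if align_mib == 0 {
        return None;
    }
    match mem_size_mib % align_mib {
        // Already aligned: keep the user's value unchanged.
        0 => Some(mem_size_mib),
        // Otherwise add the distance to the next multiple.
        rem => mem_size_mib.checked_add(align_mib - rem),
    }
}
```

Under the 2 MiB assumption, the failing size from the error message, 4513, would become 4514, while already-even sizes pass through untouched.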
Fabiano Fidêncio	e2b93fff3f	build: Allow passing IMAGE_SIZE_ALIGNMENT_MB as an env var This helps considerably to avoid patching the code, and just adjusting the build environment to use a smaller alignment than the default one. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-28 00:05:20 +02:00
stevenhorsman	fe5d43b4bd	workflows: Skip weekly coco stability tests These tests are not passing or being maintained, so, as discussed at the AC meeting, we will skip running them automatically until they can be reviewed and re-worked, to avoid wasting CI cycles. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 16:51:53 +01:00
stevenhorsman	61b12d4e1b	version: Bump nydus-snapshotter Bump to version v0.15.2 to pick up fix to mount source in https://github.com/containerd/nydus-snapshotter/pull/636 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-27 14:04:00 +01:00
RuoqingHe	a43e06e0eb	Merge pull request #11461 from stevenhorsman/bump-guest-components-4cd62c3 versions: Bump guest-components	2025-06-27 10:45:06 +08:00
Aurélien Bombo	d94085916e	ci: set Zizmor as required test This adds Zizmor GHA security scanning as a PR gate. Note that this does NOT require that Zizmor returns 0 alerts, but rather that Zizmor's invocation completes successfully (regardless of how many alerts it raises). I will set up the former after this commit is merged (through the GH UI). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:36:41 -05:00
Aurélien Bombo	820c1389db	security: ci: remove overly broad permission This removes the permission from the workflow since it's already present at the job level. https://github.com/kata-containers/kata-containers/security/code-scanning/111 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:29:23 -05:00
Aurélien Bombo	bb2a427a8a	security: ci: fix template injection This fixes a Zizmor error where some variables are vulnerable to template injection. https://github.com/kata-containers/kata-containers/security/code-scanning/67 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 12:29:11 -05:00
Saul Paredes	8c57beb943	Merge pull request #11471 from microsoft/saulparedes/fix_kata_monitor_dockerfile tools: kata-monitor: update go version used to build in Dockerfile	2025-06-26 08:37:08 -07:00
Chao Wu	ac928218f3	Merge pull request #11434 from hsiangkao/erofs runtime: improve EROFS snapshotter support	2025-06-26 22:40:48 +08:00
Cameron McDermott	b6cd6e6914	Merge pull request #11469 from fidencio/topic/dragonball-set-default_maxvcpus-to-zero runtime-rs: Set default_maxvcpus to 0	2025-06-26 15:20:21 +01:00
Aurélien Bombo	a1aa3e79d4	Merge pull request #11392 from kata-containers/sprt/zizmor ci: Run zizmor for GHA security analysis	2025-06-26 08:55:22 -05:00
Fupan Li	1ff54a95d2	Merge pull request #11422 from lifupan/memory_hotplug runtime-rs: Add the memory and vcpu hotplug for cloud-hypervisor	2025-06-26 17:56:49 +08:00
Aurélien Bombo	34c8cd810d	ci: Run zizmor for GHA security analysis This runs the zizmor security lint [1] on our GH Actions. The initial workflow uses [2] as a base. [1] https://docs.zizmor.sh/ [2] https://docs.zizmor.sh/usage/#use-in-github-actions Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-26 10:52:28 +01:00
alex.lyn	e6e4cd91b8	runtime-rs: Enable GPU annotations in remote hypervisor configuration Enable GPU annotations by adding `default_gpus` and `default_gpu_model` into the list of valid annotations `enable_annotations`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:29:36 +08:00
alex.lyn	e5f44fae30	runtime-rs: Add GPU annotations during remote hypervisor preparation Add GPU specific annotations used by remote hypervisor for instance selection during `prepare_vm`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
alex.lyn	866d3facba	kata-types: Introduce two GPU annotations for remote hypervisor Two annotations, `default_gpus` and `default_gpu_model`, are introduced as GPU annotations for Kata VM configurations to improve instance selection on remote hypervisors. By adding these annotations: (1) `default_gpus`: Allows users to specify the minimum number of GPUs a VM requires. This ensures that the remote hypervisor selects an instance with at least that many GPUs, preventing resource under-provisioning. (2) `default_gpu_model`: Lets users define the specific GPU model needed for the VM. This is crucial for workloads that depend on particular GPU archs or features, ensuring compatibility and optimal performance. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:41 +08:00
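The two annotations above describe a selection rule: at least `default_gpus` GPUs, and a matching `default_gpu_model` when one is given. The sketch below illustrates that rule. The `InstanceType` struct, the `select_instance` function, and the tie-breaking choice of the smallest matching instance are all hypothetical; the actual selection is done by the remote hypervisor and is not shown in the log.

```rust
/// Hypothetical description of an instance type offered by a remote hypervisor.
#[derive(Debug, Clone, PartialEq)]
struct InstanceType {
    name: String,
    gpus: u32,
    gpu_model: String,
}

/// Return the matching instance with the fewest GPUs: it must have at least
/// `default_gpus` GPUs and, if a model is requested, the same model
/// (compared case-insensitively, which is an assumption).
fn select_instance<'a>(
    candidates: &'a [InstanceType],
    default_gpus: u32,
    default_gpu_model: Option<&str>,
) -> Option<&'a InstanceType> {
    candidates
        .iter()
        .filter(|i| i.gpus >= default_gpus)
        .filter(|i| match default_gpu_model {
            Some(model) => i.gpu_model.eq_ignore_ascii_case(model),
            None => true,
        })
        .min_by_key(|i| i.gpus)
}
```

Choosing the smallest instance that satisfies both constraints avoids under-provisioning without reserving more GPUs than requested.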
alex.lyn	ed0c0b2367	kata-types: Introduce GPU related fields in RemoteInfo To provide the remote hypervisor with the necessary intelligence to select the most appropriate instance for a given GPU instance, leading to better resource allocation, two fields `default_gpus` and `default_gpu_model` are introduced in `RemoteInfo`. Fixes #10484 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-26 17:27:28 +08:00
Alex Lyn	9a1d4fc5d6	Merge pull request #11468 from Apokleos/fix-sharefs-none runtime-rs: Support shared fs with "none" on non-tee platforms	2025-06-26 15:37:44 +08:00
Gao Xiang	9079c8e598	runtime: improve EROFS snapshotter support To better support containerd 2.1 and later versions, remove the hardcoded `layer.erofs` and instead parse `/proc/mounts` to obtain the real mount source (and `/sys/block/loopX/loop/backing_file` if needed). If the mount source doesn't end with `layer.erofs`, it should be marked as unsupported, as it may be a filesystem meta file generated by later containerd versions for the EROFS flattened filesystem feature. Also check whether the filesystem type is `overlay` or not, since the containerd mount manager [1] may change it after being introduced. [1] https://github.com/containerd/containerd/issues/11303 Fixes: `f63ec50ba3` ("runtime: Add EROFS snapshotter with block device support") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-06-26 10:12:12 +08:00
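The EROFS change above replaces a hardcoded `layer.erofs` path with a lookup in `/proc/mounts`, followed by a suffix check on the mount source. A minimal sketch of that lookup follows. It operates on the text of a mounts table passed in as a string, and the struct and function names are hypothetical. It does not resolve loop devices through `/sys/block/loopX/loop/backing_file`, and it does not decode octal escapes such as `\040` in paths.

```rust
/// Subset of one entry of a mounts table in `/proc/mounts` format:
/// `<source> <mount point> <fstype> <options> <dump> <pass>`.
#[derive(Debug, PartialEq)]
struct MountEntry {
    source: String,
    fstype: String,
}

/// Find the entry for `mount_point` in `mounts` (the text of `/proc/mounts`).
/// The last matching line wins, since later mounts shadow earlier ones.
fn find_mount(mounts: &str, mount_point: &str) -> Option<MountEntry> {
    mounts
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let source = fields.next()?;
            let target = fields.next()?;
            let fstype = fields.next()?;
            (target == mount_point).then(|| MountEntry {
                source: source.to_string(),
                fstype: fstype.to_string(),
            })
        })
        .last()
}

/// Treat a mount source as a supported EROFS layer only if it ends with
/// `layer.erofs`; anything else is reported as unsupported.
fn is_supported_erofs_source(source: &str) -> bool {
    source.ends_with("layer.erofs")
}
```

A caller would read `/proc/mounts` with `std::fs::read_to_string`, look up the layer's mount point, and reject the layer when the suffix check fails.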
Saul Paredes	d53c720ac1	tools: kata-monitor: update go version used to build in Dockerfile Current Dockerfile fails when trying to build from the root of the repo docker build -t kata-monitor -f tools/packaging/kata-monitor/Dockerfile . with "invalid go version '1.23.0': must match format 1.23" Using go 1.23 in the Dockerfile fixes the build error Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-25 15:32:41 -07:00
stevenhorsman	290fda9b97	agent-ctl: Bump image-rs version I noticed that agent-ctl is including a 9 month old version of image-rs and the libs crates haven't been updated for potentially many years, so bump all of these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 16:30:58 +01:00
stevenhorsman	c7da62dd1e	versions: Bump guest-components Bump to pick up the new guest-components and matching trustee which use rust 1.85.1 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-25 15:05:07 +01:00
Fabiano Fidêncio	bebe377f0d	runtime-rs: Set default_maxvcpus to 0 Otherwise we just cannot start a container that requests more than 1 vcpu. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-25 14:36:46 +02:00
Steve Horsman	9ff30c6aeb	Merge pull request #11462 from kata-containers/add-scorecard-action ci: Add scorecard action	2025-06-25 12:48:11 +01:00
Fabiano Fidêncio	69c706b570	Merge pull request #11441 from stevenhorsman/protobuf-3.7.2-bump versions: Bump protobuf to 3.7.2	2025-06-25 13:47:28 +02:00
alex.lyn	eae62ca9ac	runtime-rs: Support shared fs with "none" on non-tee platforms This commit introduces the ability to run Pods without a shared fs mechanism in Kata. The default shared fs can lead to unnecessary resource consumption and security risks for certain use cases. Specifically, scenarios where files only need to be copied into the VM once at Pod creation (e.g., non-tee envs) and don't require dynamic updates make the shared fs redundant and inefficient. By explicitly disabling shared fs functionality, we reduce resource overhead and shrink the attack surface. Users will need to employ alternative methods (e.g. guest-pull) to ensure container images are shared into the guest VM for these specific scenarios. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-25 17:36:57 +08:00
Fabiano Fidêncio	4719c08184	Merge pull request #11467 from lifupan/fixblockfile runtime-rs: fix the issue return the wrong volume	2025-06-25 09:56:28 +02:00
Fupan Li	48c8e0f296	runtime-rs: fix the issue return the wrong volume The previous commit, 74eccc54e7b31cc4c9abd8b6e4007c3a4c1d4dd4, did not return the right rootfs volume. In the is_block_rootfs fn, if the rootfs is based on a block device such as devicemapper, it should clear the volume's source and let the device_manager use the dev_id to get the device's host path. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-25 10:02:52 +08:00
Alex Lyn	648fef4f52	Merge pull request #11466 from lifupan/blockfile runtime-rs: add the blockfile based rootfs support	2025-06-25 09:46:54 +08:00
Dan Mihai	2d43b3f9fc	Merge pull request #11424 from katexochen/p/regorus-oras-cache ci/static-checks: use oras cache for regorus	2025-06-24 14:49:00 -07:00
Fupan Li	74eccc54e7	runtime-rs: add the blockfile based rootfs support containerd's Blockfile Snapshotter passes a rootfs mount with a raw file as the mount source and with "loop" embedded in the mount options. To support this type of rootfs, it is necessary to identify it as a blockfile rootfs through the "loop" flag, and then use the volume source of the rootfs as the source of the block device to hot-insert it into the guest. Fixes: #11464 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 22:31:54 +08:00
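The detection rule in this commit is that a rootfs mount carrying the "loop" option is a blockfile rootfs, and its mount source is the file to hot-plug as a block device. A minimal sketch follows. `RootfsMount` and `blockfile_source` are hypothetical names, and the struct is a simplified stand-in for whatever mount type runtime-rs receives from containerd.

```rust
/// Hypothetical, simplified view of a rootfs mount passed by the snapshotter.
struct RootfsMount {
    source: String,
    options: Vec<String>,
}

/// Return the backing file to hot-plug as a block device when the mount
/// carries the "loop" option, or `None` for any other kind of rootfs.
fn blockfile_source(mount: &RootfsMount) -> Option<&str> {
    mount
        .options
        .iter()
        .any(|opt| opt.as_str() == "loop")
        .then(|| mount.source.as_str())
}
```

Returning the source only when the flag is present keeps the decision and the value together, so a caller cannot use the raw file path for a rootfs that was not identified as a blockfile.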
Paul Meyer	43739cefdf	ci/static-checks: use oras cache for regorus Instead of building it every time, we can store the regorus binary in OCI registry using oras and download it from there. This reduces the install time from ~1m40s to ~15s. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-24 13:14:18 +02:00
Fupan Li	9bdbd82690	Merge pull request #11181 from Apokleos/initdata-runtime-rs runtime-rs: Implement Initdata Spec Support in runtime-rs for CoCo	2025-06-24 18:59:34 +08:00
Fupan Li	1c59516d72	runtime-rs: add support resize_vcpu for cloud-hypervisor This commit adds support for resize_vcpu for cloud-hypervisor using its vm resize api. It supports both vcpu hotplug and hot unplug. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	a3671b7a5c	runtime-rs: Add the memory hotplug for cloud-hypervisor For cloud-hypervisor, currently only hot plugging of memory is supported, but hot unplugging of memory is not supported. In addition, by default, cloud-hypervisor uses ACPI-based memory hot-plugging instead of virtio-mem based memory hot-plugging. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	7df29605a4	runtime-rs: add the vm resize and get vminfo api for clh Add API interfaces for get vminfo and resize. get vminfo can obtain the memory size and number of vCPUs from the cloud hypervisor vmm in real time. This interface provides information for the subsequent resize memory and vCPU. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	9a51ade4e2	runtime-rs: impl the Deserialize trait for MacAddr The system's own Deserialize cannot implement parsing from string to MacAddr, so we need to implement this trait ourselves. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
Fupan Li	ceaae3049c	runtime-rs: move the bytes_to_megs and megs_to_bytes to utils Since those two functions would be used by other hypervisors, thus move them into the utils crate. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-24 17:15:05 +08:00
alex.lyn	871465f5d3	kata-agent: Allow unrecognized fields in InitData To improve flexibility and extensibility, this change modifies the Kata Agent's handling of `InitData` to allow for unrecognized key-value pairs. The `InitData` field now directly utilizes `HashMap<String, String>`, enabling it to carry arbitrary metadata and information that may be consumed by other components. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	afcb042c28	runtime-rs: Specify the initdata to mrconfigid correctly During sandbox preparation, initdata should be specified in TdxConfig, specifically as mrconfigid, which is passed to the TDX guest report for measurement. Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	d6d8497b56	runtime-rs: Add host-data property to sev-snp-guest object SEV-SNP guest configuration utilizes a different set of properties compared to the existing 'sev-guest' object. This change introduces the `host-data` property within the sev-snp-guest object. This property allows for configuring an SEV-SNP guest with host-provided data, which is crucial for data integrity verification during attestation. The `host-data` property is specifically valid for SEV-SNP guests running on a capable platform. It is configured as a base64-encoded string when using the sev-snp-guest object. the example cmdline looks like: ```shell -object sev-snp-guest,id=sev-snp0,host-data=CGNkCHoBC5CcdGXir... ``` Fixes #11180 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	4a4361393c	runtime-rs: Introduce host-data in SevSnpConfig for validation To facilitate the transfer of initdata generated during `prepare_initdata_device_config`, a new parameter has been introduced into the `prepare_protection_device_config` function. Furthermore, to specifically pass initdata to SEV-SNP Guests, a `host_data` field has been added to the `SevSnpConfig` structure. However, this field is exclusively applicable to the SEV-SNP platform. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	5c8170dbb9	runtime-rs: Handle initdata block device config during sandbox start Retrieve the Initdata string content from the security_info of the Configuration. Based on the Protection Platform type, calculate the digest of the Initdata. Write the Initdata content to the block device. Subsequently, construct the BlockConfig based on this block device information. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	6ea1494701	runtime-rs: Add InitData Resource type for block device management To correctly manage initdata as a block device, a new InitData Resource type, inherently a block device, has been introduced within the ResourceManager. As a component of the Sandbox's resources, this InitData Resource needs to be appropriately handled by the Device Manager's handler. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	8c1482a221	runtime-rs: Introduce coco_data dir and initdata block Implement resource storage infrastructure with initial initdata support: 1. Create dedicated `coco_data` directory for: - Centralized management of CoCo resources; - Future expansion of CoCo artifacts; 2. Atomic initdata block as foundational component in `coco_data`, it will implement creation of compressed initdata blocks with: - Gzip compression with level customization (0-9) - Sector-aligned (512B) image format with magic header - Adaptive buffering (4KB-128KB) based on payload size - Temp-file atomic writes with 0o600 permissions Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
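Two of the properties listed for the initdata block are plain arithmetic: 512-byte sector alignment and a write buffer bounded between 4KB and 128KB. A minimal sketch of both follows. The function names are hypothetical, and the rule for scaling the buffer (next power of two of the payload size, then clamped) is an assumption; the log only gives the bounds. Gzip compression and the magic header are omitted, since the log does not describe their layout.

```rust
const SECTOR_SIZE: usize = 512;
const MIN_BUFFER: usize = 4 * 1024;
const MAX_BUFFER: usize = 128 * 1024;

/// Round `len` up to a whole number of 512-byte sectors.
fn sector_aligned_len(len: usize) -> usize {
    (len + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE
}

/// Pick a write buffer size that grows with the payload but stays within
/// [4 KiB, 128 KiB]. The power-of-two scaling is an assumed policy.
fn buffer_size_for(payload_len: usize) -> usize {
    payload_len.next_power_of_two().clamp(MIN_BUFFER, MAX_BUFFER)
}

/// Pad `payload` with zero bytes so the image ends on a sector boundary.
fn pad_to_sector(mut payload: Vec<u8>) -> Vec<u8> {
    let padded = sector_aligned_len(payload.len());
    payload.resize(padded, 0);
    payload
}
```

Padding to a sector boundary lets the resulting file be attached as a block device without a partial trailing sector.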
alex.lyn	9b21d062c9	kata-types: Implement InitData retrieval from Pod annotation This commit implements the retrieval and processing of InitData provided via a Pod annotation. Specifically, it enables runtime-rs to: (1) Parse the "io.katacontainers.config.hypervisor.cc_init_data" annotation from the Pod YAML. (2) Perform reverse operations on the annotation value: base64 decoding followed by gzip decompression. (3) Deserialize the decompressed data into the internal InitData structure. (4) Serialize the resulting InitData into a string and store it in the Configuration. This allows users to inject configuration data into the TEE Guest by encoding and compressing it and passing it as an annotation in the Pod configuration. This mechanism supports scenarios where dynamic config is required for Confidential Containers. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
alex.lyn	4ca394f4fc	kata-types: Implement Initdata Spec and Digest Calculation Logic This commit introduces the Initdata Spec and the logic for calculating its digest. It includes: (1) Define a `ProtectedPlatform` enum to represent major TEE platform types. (2) Create an `InitData` struct to support building and serializing initialization data in TOML format. (3) Implement adaptation for SHA-256, SHA-384, and SHA-512 digest algorithms. (4) Provide a platform-specific mechanism for adjusting digest lengths (zero-padding). (5) Supporting the decoding and verification of base64+gzip encoded Initdata. The core functionality ensures the integrity of data injected by the host through trusted algorithms, while also accommodating the measurement requirements of different TEE platforms. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
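Item (4) above mentions adjusting digest lengths by zero-padding so that a digest fits a platform's measurement field. A minimal sketch of that adjustment follows. The function name is hypothetical, the target length is left as a parameter rather than tied to a platform, and both padding on the right and rejecting over-long digests are assumptions, not confirmed behaviour of kata-types.

```rust
/// Hypothetical helper: zero-pad `digest` on the right to `target_len` bytes.
/// Returns `None` when the digest is longer than the target, so that a
/// measurement is never silently truncated.
fn pad_digest(digest: &[u8], target_len: usize) -> Option<Vec<u8>> {
    if digest.len() > target_len {
        return None;
    }
    let mut out = digest.to_vec();
    out.resize(target_len, 0);
    Some(out)
}
```

For example, a 32-byte SHA-256 digest padded to a 48-byte field keeps its original bytes first, followed by 16 zero bytes.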
alex.lyn	2603ee66b8	kata-types: Introduce initdata to SecurityInfo for data injection This commit introduces a new `initdata` field of type String to hypervisor `SecurityInfo`. In accordance with the Initdata Specification, this field will facilitate the injection of well-defined data from an untrusted host into the TEE. To ensure the integrity of this injected data, the TEE evidence's hostdata capability or the (v)TPM dynamic measurement capability will be leveraged, as outlined in the specification. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-24 10:25:57 +08:00
Dan Mihai	89dcc8fb27	Merge pull request #11444 from microsoft/danmihai1/k8s-policy-rc tests: k8s-policy-rc: print pod descriptions	2025-06-23 16:14:56 -07:00
Dan Mihai	0a57e09259	Merge pull request #11426 from charludo/fix/genpolicy-corruption-of-layer-cache-file genpolicy: prevent corruption of the layer cache file	2025-06-23 14:00:45 -07:00
Dan Mihai	8aecf14b34	Merge pull request #11405 from kata-containers/dependabot/cargo/src/agent/clap-77d1155c52 build(deps): bump the clap group across 6 directories with 1 update	2025-06-23 13:05:59 -07:00
Dan Mihai	62c9845623	tests: k8s-policy-rc: print pod descriptions Don't use local launched_pods variable in test_rc_policy(), because teardown() needs to use this variable to print a description of the pods, for debugging purposes. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-23 16:23:26 +00:00
stevenhorsman	649e31340b	doc: Add scorecard badge Add our scorecard badge to our readme for transparency and to help motivate us to update our score Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-23 16:22:59 +01:00
stevenhorsman	6dd025d0ed	workflows: Add scorecard workflow Add a workflow to update our scorecard score on each change Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-23 16:09:14 +01:00
Steve Horsman	4f245df4a0	Merge pull request #11420 from kata-containers/pin-gha-actions workflows: Pin action hashes	2025-06-23 15:26:03 +01:00
charludo	4e57cc0ed2	genpolicy: keep layers cache in-memory to prevent corruption The locking mechanism around the layers cache file was insufficient to prevent corruption of the file. This commit moves the layers cache's management in-memory, only reading the cache file once at the beginning of `genpolicy`, and only writing to it once, at the end of `genpolicy`. In the case that obtaining a lock on the cache file fails, reading/writing to it is skipped, and the cache is not used/persisted. Signed-off-by: charludo <git@charlotteharludo.com>	2025-06-23 16:16:42 +02:00
RuoqingHe	8c1f6e827d	Merge pull request #11448 from RuoqingHe/remove-dup-ignore ci: Remove duplicated `rust-vmm` dependencies	2025-06-23 10:34:30 +08:00
Ruoqing He	1d2d2cc3d5	ci: Remove duplicated `rust-vmm` dependencies `vmm-sys-util` was duplicated while updating the `ignore` list of `rust-vmm` crates in #11431, remove duplicated one and sort the list. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-21 21:02:59 +00:00
stevenhorsman	9685e2aeca	trace-forwarder: Replace removed clap functions When moving from clap v2 to v4 a bunch of functions have been removed, so update the code to handle these replacements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
stevenhorsman	e204847df5	agent-ctl: Replace removed clap functions When moving from clap v2 to v4 a bunch of functions have been removed, so update the code to handle these replacements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
stevenhorsman	e11fc3334e	agent: Clap v4 updates AppSettings was removed, so refactor based on new documentation Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 17:15:12 +01:00
dependabot[bot]	0aa80313eb	build(deps): bump the clap group across 6 directories with 1 update Bumps the clap group with 1 update in the /src/agent directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/agent-ctl directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/genpolicy directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/kata-ctl directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/runk directory: [clap](https://github.com/clap-rs/clap). Bumps the clap group with 1 update in the /src/tools/trace-forwarder directory: [clap](https://github.com/clap-rs/clap). Updates `clap` from 3.2.25 to 4.5.37 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 4.1.8 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 4.4.10 to 4.5.13 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 3.2.25 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` 
from 3.2.25 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 3.2.25 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) Updates `clap` from 2.34.0 to 4.5.40 - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - 
[Commits](https://github.com/clap-rs/clap/compare/v3.2.25...clap_complete-v4.5.37) --- updated-dependencies: - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.37 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production 
update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.13 dependency-type: direct:production update-type: version-update:semver-minor dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major 
dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap - dependency-name: clap dependency-version: 4.5.40 dependency-type: direct:production update-type: version-update:semver-major dependency-group: clap ... Signed-off-by: dependabot[bot] <support@github.com>	2025-06-21 17:15:12 +01:00
RuoqingHe	b22135f4e5	Merge pull request #11431 from RuoqingHe/udpate-rust-vmm-ignore-list ci: Update dependabot ignore list	2025-06-21 18:20:41 +08:00
Ruoqing He	6628ba3208	ci: Update dependabot ignore list Update dependabot ignore list in cargo ecosystem to ignore upgrades from rust-vmm crates, since those crates need to be managed carefully and manually. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-21 08:18:20 +01:00
stevenhorsman	9d3b9fb438	workflows: Pin action hashes Pin GitHub-owned actions to specific hashes, as recommended, because tags are mutable; see https://pin-gh-actions.kammel.dev/. This is one of the recommendations that scorecard gives us. Note this was generated with `frizbee actions` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-21 08:14:13 +01:00
Steve Horsman	4bfa74c2a5	Merge pull request #11331 from stevenhorsman/helm-ghcr-login-update workflow: Remove code injection in helm login	2025-06-21 08:13:40 +01:00
Steve Horsman	353b4bc853	Merge pull request #11440 from stevenhorsman/osbuilder-fedora-42-update osbuilder: Update image-builder base to f42	2025-06-21 08:11:12 +01:00
Steve Horsman	cac1cb75ce	Merge pull request #11378 from kata-containers/dependabot/cargo/src/tools/agent-ctl/rustix-0.37.28 build(deps): bump rustix in various components	2025-06-21 08:05:21 +01:00
stevenhorsman	900d9be55e	build(deps): bump rustix in various components Bumps of rustix 0.36, 0.37 and 0.38 to resolve CVE-2024-43806 Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 14:52:43 -05:00
stevenhorsman	d9defd5102	osbuilder: Update image-builder base to f42 Fedora 40 is EoL, and I've seen the registry pull fail a few times recently, so let's bump to fedora 42 which has 10 months of support left. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 20:52:30 +01:00
stevenhorsman	0f1c326ca0	versions: Bump protobuf to 3.7.2 Now we are decoupled from the image-rs crate, we can bump the protobuf version across our project to resolve the GHSA-2gh3-rmm4-6rq5 advisory Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 20:52:04 +01:00
Saul Paredes	cc27966aa1	Merge pull request #11443 from microsoft/saulparedes/update_image tests: update container image for ci and unit test	2025-06-20 12:50:42 -07:00
Archana Choudhary	e093919b42	tests: update container image for ci and unit test This patch updates the container image for the CI test workloads: - `k8s-layered-sc-deployment.yaml` - `k8s-pod-sc-deployment.yaml` - `k8s-pod-sc-nobodyupdate-deployment.yaml` - `k8s-pod-sc-supplementalgroups-deployment.yaml` - `k8s-policy-deployment.yaml` Also updates unit tests: - `test_create_container_security_context` - `test_create_container_security_context_supplemental_groups` This fixes tests failing due to an image pull error as the previous image is no longer available in the container registry. Signed-off-by: Archana Choudhary <archana1@microsoft.com> Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-20 10:46:56 -07:00
stevenhorsman	776c89453c	workflow: Remove code injection in helm login In theory `github.actor` could be used for code injection, so swap it out. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-20 16:27:52 +01:00
Fabiano Fidêncio	6722ea2fd9	Merge pull request #11439 from stevenhorsman/multi-arch-manifest-permissions-fix release: Add more permissions	2025-06-19 12:45:37 +02:00
stevenhorsman	8da75bf55d	release: Add more permissions Add package: write to the multi-arch manifest upload to ghcr.io Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 11:04:29 +01:00
Fabiano Fidêncio	d0c1ce1367	Merge pull request #11438 from stevenhorsman/helm-upload-fix release: Fix helm push typo	2025-06-19 12:01:04 +02:00
stevenhorsman	eaf42b3e0f	release: Fix helm push typo Switch the hyphen for an underscore, so the ghcr helm publish can work properly. Co-authored-by: Fabiano Fidêncio <fidencio@northflank.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 10:56:50 +01:00
Fabiano Fidêncio	f7d3ea0c55	Merge pull request #11437 from kata-containers/release-flow-permissions-fixes-iii workflows: Release permissions	2025-06-19 11:23:46 +02:00
stevenhorsman	19597b8950	workflows: Release permissions Add more permissions to the release workflow in order to enable `gh release` commands to run Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 10:05:23 +01:00
Fabiano Fidêncio	254ada2f6a	Merge pull request #11436 from kata-containers/release-flow-permission-fix-ii workflows: Add extra permissions	2025-06-19 10:45:26 +02:00
stevenhorsman	7c6c6f3c15	workflows: Add extra permissions Add permissions to the ppc release Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 09:39:01 +01:00
Steve Horsman	00c9e61b60	Merge pull request #11435 from kata-containers/release-flow-permissions-fix(es) workflows: Fix permissions	2025-06-19 09:35:23 +01:00
stevenhorsman	9adf989555	workflows: Fix permissions Add extra permissions for reusable workflow calls that need them later on Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-19 08:44:18 +01:00
Fabiano Fidêncio	e82de65d5d	Merge pull request #11425 from stevenhorsman/release-3.18.0-bump release: Bump version to 3.18.0	2025-06-18 21:39:51 +02:00
stevenhorsman	6fc622ef0f	release: Bump version to 3.18.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 19:09:42 +01:00
Steve Horsman	060faa3d1a	Merge pull request #11433 from kata-containers/cri-containerd-test-fast-fail-false workflows: Add fail-fast: false to cri-containerd tests	2025-06-18 19:08:59 +01:00
Steve Horsman	e0084a958c	Merge pull request #11432 from stevenhorsman/golang-1.23.10 versions: Bump golang to 1.23.10	2025-06-18 17:25:07 +01:00
Steve Horsman	4e3238b9dc	Merge pull request #11337 from zvonkok/fix-module-signing gpu: Fix module signing	2025-06-18 17:23:51 +01:00
Steve Horsman	547b6c5781	Merge pull request #11429 from stevenhorsman/cri-containerd-required-test-rename Cri containerd required test rename	2025-06-18 15:45:14 +01:00
Zvonko Kaiser	e2f18057a4	kernel: Add config option for signing Only sign the kernel if the user has provided KBUILD_SIGN_PIN; otherwise skip signing. While here, let's move the functionality to the common fragments, as it's not GPU-specific functionality. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-06-18 15:32:26 +02:00
stevenhorsman	73d7b4f258	workflows: Add fail-fast: false to cri-containerd tests At the moment, if any of the tests in the matrix fails, the rest of the jobs are cancelled, so we have to re-run everything. Add `fail-fast: false` to stop this behaviour. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 14:20:16 +01:00
stevenhorsman	aedbaa1545	versions: Bump golang to 1.23.10 Bump golang to fix CVEs GO-2025-3751 and GO-2025-3563 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 11:11:32 +01:00
stevenhorsman	b20f89b775	ci: required-tests: Remove test skip Remove the rule that causes gatekeeper to skip tests if we've only updated the required-tests.yaml list. Although an update to just required-tests.yaml doesn't change the outcome of any of the CI tests, it does change whether gatekeeper will still pass with the new rules. Although it's a bit of a hit to run the CI, it's probably worth it to keep gatekeeper validated. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 10:52:03 +01:00
stevenhorsman	d68b09a4f0	ci: required-tests: cri-containerd rename Update the names of the required jobs based on the changes done in #11019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-18 10:52:03 +01:00
Steve Horsman	0aca20986b	Merge pull request #11400 from miz060/mitchzhu/add-govulncheck ci: Add optional govulncheck security scanning to static checks	2025-06-18 10:34:56 +01:00
Steve Horsman	d754e3939b	Merge pull request #11427 from BbolroC/bump-rootfs-confidential-s390x rootfs: Bump rootfs-{image,initrd} to 24.04	2025-06-18 09:06:58 +01:00
Mitch Zhu	292c27130d	ci: Add optional govulncheck security scanning to static checks This adds govulncheck vulnerability scanning as a non-blocking check in the static checks workflow. The check scans Go runtime binaries for known vulnerabilities while filtering out verified false positives. Signed-off-by: Mitch Zhu <mitchzhu@microsoft.com>	2025-06-17 20:43:00 -07:00
Alex Lyn	b61b20eef3	Merge pull request #11394 from mythi/tdx-kata-deploy-bump kata-deploy: accept 25.04 as supported distro for TDX	2025-06-18 08:52:46 +08:00
Hyounggyu Choi	4be261f248	rootfs: Bump rootfs-{image,initrd} to 24.04 Since #11197 was merged, all confidential k8s e2e tests for s390x have been failing with the following errors: ``` attestation-agent: error while loading shared libraries: libcurl.so.4: cannot open shared object file libnghttp2.so.14: cannot open shared object file ``` In line with the update on x86_64, we need to upgrade the OS used in rootfs-{image,initrd} on s390x. This commit also bumps all 22.04 to 24.04 for all architectures. For s390x, this ensures the missing packages listed above are installed. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-06-17 22:03:26 +02:00
Steve Horsman	fd93e83a4f	Merge pull request #11019 from seungukshin/cri-containerd-tests-for-arm64 Enable cri-containerd-tests for arm64	2025-06-17 11:53:49 +01:00
Fupan Li	15b24b5be1	Merge pull request #10698 from Apokleos/kata-volume-rs runtime-rs: Support Pull Image in Guest with Kata Volume for CoCo	2025-06-17 15:00:02 +08:00
Lei Liu	71d1cdf40a	test: fix broken testing code in libs After commit `a3f973db3b` was merged, protection::GuestProtection::[Snp,Sev] changed to tuple variants and can no longer be used in the assert_eq macro without tuple values; otherwise errors like the following are raised: ``` assert_eq!(actual.unwrap(), GuestProtection::Snp); \| ^^^^^^^^^^^^^^^^^^^^ expected \ `GuestProtection`, found enum constructor ``` Signed-off-by: Lei Liu <liulei.pt@bytedance.com>	2025-06-17 12:38:39 +08:00
Steve Horsman	a00f39e272	Merge pull request #11419 from katexochen/p/gitignore-direnv gitignore: ignore direnv	2025-06-16 17:26:10 +01:00
Seunguk Shin	4f9b7e4d4f	ci: Enable cri-containerd-tests for arm64 This change enables cri-containerd-test for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-06-16 15:12:17 +01:00
Paul Meyer	822f54c800	ci/static-checks: add dispatch trigger This simplifies executing the workflow on a fork during testing. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-16 16:12:10 +02:00
Seunguk Shin	203e3af94b	ci: Disable run-containerd-sandboxapi containerd-sandboxapi fails with `containerd v2.0.x` and passes with `containerd v1.7.x` regardless of kata-containers. It was not tested with `containerd v2.0.x` because `containerd v2.0.x` could not recognize `[plugins.cri.containerd]` in `config.toml`. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com>	2025-06-16 15:02:07 +01:00
Mikko Ylinen	825b1cd233	kata-deploy: accept 25.04 as supported distro for TDX The latest Canonical TDX release supports 25.04 / Plucky as well. Users experimenting with the latest goodies in the 25.04 TDX enablement won't get Kata deployed properly. This change accepts 25.04 as supported distro for TDX. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-06-16 13:42:08 +01:00
Xuewei Niu	9b4518f742	Merge pull request #11359 from pawelbeza/fix-logs-on-virtiofs-shutdown Fix logging on virtiofs shutdown	2025-06-16 17:06:29 +08:00
Paul Meyer	b629b11ba0	gitignore: ignore direnv This allows contributors to setup direnv without having it detected by git. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-16 11:02:00 +02:00
Steve Horsman	64c95cb996	Merge pull request #11389 from kata-containers/checkout-persist-credentials-false workflows: Set persist-credentials: false on checkout	2025-06-16 09:58:22 +01:00
alex.lyn	cebb259e51	runtime-rs: Introduce force guest pulling image Container image integrity protection is a critical practice involving a multi-layered defense mechanism. While container images inherently offer basic integrity verification through Content-Addressable Storage (CAS) (ensuring pulled content matches stored hashes), a combination of other measures is crucial for production environments. These layers include: Encrypted Transport (HTTPS/TLS) to prevent tampering during transfer; Image Signing to confirm the image originates from a trusted source; Vulnerability Scanning to ensure the image content is "healthy"; and Trusted Registries with stringent access controls. In certain scenarios, such as when container image confidentiality requirements are not stringent, and integrity is already ensured via the aforementioned mechanisms (especially CAS and HTTPS/TLS), adopting "force guest pull" can be a viable option. This implies that even when pulling images from a container registry, their integrity remains guaranteed through content hashes and other built-in mechanisms, without relying on additional host-side verification or specialized transfer methods. Since this feature is already available in runtime-go and offers synergistic benefits with guest pull, we have chosen to support force guest pull. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	2157075140	kata-types: Introduce a helper method to adjust rootfs mounts This commit introduces the `adjust_rootfs_mounts` function to manage root filesystem mounts for guest-pull scenarios. When the force guest-pull mechanism is active, this function ensures that the rootfs is exclusively configured via a dedicated `KataVirtualVolume`. It disregards any provided input mounts, instead generating a single, default `KataVirtualVolume`. This volume is then base64-encoded and set as the sole mount option for a new, singular `Mount` entry, which is returned as the only item in the `Vec<Mount>`. This change guarantees consistent and exclusive rootfs configuration when utilizing guest-pull for container images. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	c9ffbaf30d	runtime-rs: Support handling Kata Virtual Volume in handle_rootfs In CoCo scenarios, no images are pulled on the host side and such operations are disabled; that is, no files are shared between host and guest, especially for the container rootfs. We introduce Kata Virtual Volume to help handle such cases: (1) Introduce is_kata_virtual_volume to check whether the volume is a Kata virtual volume. (2) Introduce VirtualVolume handling logic in handle_rootfs for when the mount is a Kata virtual volume. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	2600fc6f43	runtime-rs: Add Spec annotation to help pass image information We need to get the relevant image ref from the OCI runtime Spec, specifically from its annotations. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
alex.lyn	d4e9369d3d	runtime-rs: Implement guest-pull rootfs via virtual volumes This commit introduces comprehensive support for rootfs mount management through Kata Virtual Volumes, specifically enabling the guest-pull mechanism. It enhances the runtime's ability to: (1) Extract image references from container annotations (CRI/CRI-O). (2) Process `KataVirtualVolume` objects, configuring them for guest-pull operations. (3) Set up the agent's storage for guest-pulled images. This functionality streamlines the process of pulling container images directly within the guest for rootfs, aligning with guest-side image management strategies. Fixes #10690 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-16 16:49:17 +08:00
Alex Lyn	a966d1be50	Merge pull request #11197 from Xynnn007/move-image-pull Move image pull abilities to CDH	2025-06-16 16:43:59 +08:00
Xynnn007	e0b4cd2dba	initrd/image: update x86_64 base to ubuntu 24.04 The Multistrap issue has been fixed in noble thus we can use the LTS. Also, this will fix the error reported by CDH ``` /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found ``` Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	0b3a8c0355	initdata: delete coco_as token section in initdata The new version of AA allows the config to omit the coco_as token config. If it is not provided, it is set to None. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	5bab460224	chore(deps): update guest-components This patch updates the guest-components to a new version with better error logging for CDH. It also allows the AA config to omit the coco_as token config. Also, the new version of CDH needs to build aws-lc-sys, so cmake must be installed for the build. See https://github.com/kata-containers/kata-containers/actions/runs/15327923347/job/43127108813?pr=11197#step:6:1609 for details. Besides, the new version of guest-components has some fixes for the SNP stack, which require updates on the trustee side. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	aae64fa3d6	agent: add agent.image_pull_timeout parameter This new parameter for kata-agent is used to control the timeout for a guest pull request. Note that sometimes an image can be really big, so we set default timeout to 1200 seconds (20 minutes). Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	93826ff90c	tests: update negative test log assertions After moving image pulling from kata-agent to CDH, the error messages for failed image pulls have changed slightly. This commit updates the assertions to match. Note that in both the original and the current image-rs implementation, a missing key and a wrong key result in the same error message. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:54:15 +08:00
Xynnn007	7420194ea8	build: abandon PULL_TYPE build env Now kata-agent by default supports both guest pull and host pull abilities, thus we do not need to specify the PULL_TYPE env when building kata-agent. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 13:53:55 +08:00
Xynnn007	44a6d1a6f7	docs: update guest pull document After moving guest pull abilities to CDH, the guest pull document needs to be updated to reflect the new workflow. Also, replace the PNG diagram with a mermaid one for better maintenance. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	105cb47991	agent: always try to override oci process spec In previous versions, the agent tried to override the OCI process only when the `guest-pull` feature was enabled at build time, the storage had a guest pull volume, and it was a sandbox. After getting rid of the feature, whether guest pull is used is determined at runtime, so we can always attempt this override by checking whether there is a kata guest pull volume in storages and it's a sandbox. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	6b1249186f	agent: embed ocicrypt config in rootfs by default Now the ocicrypt configuration used by CDH is always the same, and it's not a good practice for kata-agent to write it into the rootfs at runtime. Thus we now move it to the coco-guest-components build script. The config will be embedded into the guest image/initrd together with the CDH binary. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	22e65024ce	agent: get rid of pull-type option The features `guest-pull` and `default-pull` are both removed, because both guest pull and host pull are now supported at build time without the extra dependencies, such as image-rs, that were needed before. Guest pull will depend on the CDH process, not on a build-time feature. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	0e15b49369	agent: get rid of init_image_service We do not need to initialize the image service in kata-agent now, as it's initialized in CDH. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	22c50cae7b	agent: let image_pull_handler call cdh to pull image This is a higher-level call to pull an image inside the guest. Now it should call confidential_data_hub's API. The previous pull_image API did two things inside its own logic: 1. check whether it is a sandbox 2. generate bundle_path. The new API does neither, to keep the API semantics clean, so we explicitly do both before calling the API. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	39cd430994	agent: add ocicrypt_config envs for CDH process Now that the image pull ability is moved to CDH, the CDH process needs the ocicrypt environment variables to help find the keyprovider (cdh) to decrypt images. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	f67f5c2b69	agent: remove image pull configs As the image pull ability is moved to CDH, kata-agent does not need the image pulling configurations anymore. Reading all these configurations from the kernel cmdline is now implemented by CDH. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:13:20 +08:00
Xynnn007	4436fe6d99	agent: move guest pull abilities to Confidential Data Hub Image pull abilities are all moved to the separate component Confidential Data Hub (CDH); only the auxiliary functions, excluding pull_image, are left in confidential_data_hub/image.rs Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xynnn007	5067aafd56	agent: move cdh.rs and image.rs to a separate module confidential_data_hub This is a small refactoring commit that moves the mods `cdh.rs` and `image.rs` to a directory module `confidential_data_hub`. This is because the image pull ability will be moved into confidential data hub, thus it is better to handle image pull things in the confidential data hub submodule. Also, this commit makes some changes to the original code. It gets rid of a static variable for the CDH timeout config and directly uses the global config variable's member. Also, this changes the `is_cdh_client_initialized` function to a sync version as it does not need to be async. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xynnn007	997a1f35ab	agent: add PullImage to CDH proto file CDH provides the image pull api. This commit adds the declaration of the API in the CDH proto file. This will be used in following commits. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-16 11:10:09 +08:00
Xuewei Niu	c27116fa8e	Merge pull request #11416 from lifupan/prealloc runtime-rs: add the memory prealloc support for qemu/ch	2025-06-15 11:01:05 +08:00
Xuewei Niu	b43a61e2c8	Merge pull request #11418 from microsoft/saulparedes/flag_secure_mount agent: add feature flag to secure_mount method	2025-06-15 10:59:20 +08:00
Saul Paredes	cdfc9fd2d9	agent: add feature flag to secure_mount method This method is not used when guest-pull is not used. Add a flag that prevents a compile error when building with rust version > 1.84.0 and not using guest-pull Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-06-13 11:25:58 -07:00
Fabiano Fidêncio	6f0ea595b7	Merge pull request #11402 from microsoft/danmihai1/disable-nvdimm runtime: build variable for disable_image_nvdimm=true	2025-06-13 16:35:57 +02:00
Dan Mihai	0f8e453518	Merge pull request #11412 from katexochen/rego-v1 genpolicy: fix rules syntax issues, rego v1 compatibility; ci: checks for rego parsing	2025-06-13 07:30:34 -07:00
Paweł Bęza	91db41227f	runtime: Fix logging on virtiofs shutdown Fixes a confusing log message shown when Virtio-FS is disabled. Previously we logged “The virtiofsd had stopped” regardless of whether Virtio-FS was actually enabled or not. Signed-off-by: Paweł Bęza <pawel.beza99@gmail.com>	2025-06-13 15:59:52 +02:00
Fupan Li	5163156676	runtime-rs: add the memory prealloc support for cloud-hypervisor Add the memory prealloc support for cloud hypervisor too. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-13 16:26:11 +08:00
Fupan Li	fb7cfcd2fb	runtime-rs: add the memory prealloc support for qemu Add the memory prealloc support for the qemu hypervisor. When it is enabled, all of the memory will be allocated and locked. This is useful when you want to reserve all the memory upfront or in cases where you want memory latencies to be very predictable. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-06-13 16:26:03 +08:00
Steve Horsman	707b8b8a98	Merge pull request #11374 from kata-containers/dependabot/cargo/src/dragonball/tracing-1900da1d01 build(deps): bump the tracing group across 7 directories with 1 update	2025-06-13 08:30:37 +01:00
dependabot[bot]	1e6962e4a8	build(deps): bump the tracing group across 7 directories with 1 update Bumps the tracing group with 1 update in the /src/dragonball directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/libs directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/agent-ctl directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/genpolicy directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/kata-ctl directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/runk directory: [tracing](https://github.com/tokio-rs/tracing). Bumps the tracing group with 1 update in the /src/tools/trace-forwarder directory: [tracing](https://github.com/tokio-rs/tracing). Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.34 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.37 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.40 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.40 to 0.1.41 - [Release 
notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) Updates `tracing` from 0.1.29 to 0.1.41 - [Release notes](https://github.com/tokio-rs/tracing/releases) - [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.37...tracing-0.1.41) --- updated-dependencies: - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: indirect update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing - dependency-name: tracing dependency-version: 0.1.41 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: tracing ... Signed-off-by: dependabot[bot] <support@github.com>	2025-06-12 15:45:35 +00:00
Steve Horsman	6bdc0cf495	Merge pull request #11417 from kata-containers/sprt/revert-validate-ok-to-test Revert "ci: gha: Remove ok-to-test label on every push"	2025-06-12 15:04:44 +01:00
Aurélien Bombo	5200034642	Revert "ci: gha: Remove ok-to-test label on every push" This reverts commit `2ee3470627`. This is mostly redundant given we already have workflow approval for external contributors. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-12 08:40:06 -05:00
Paul Meyer	64906e6973	tests/static-checks: parse rego with opa and regorus Ensure rego policies in tree can be parsed using opa and regorus. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 14:59:39 +02:00
Paul Meyer	107e7dfdf6	ci/static-checks: install regorus Make regorus available for static checks as prerequisite for rego checks. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 14:59:39 +02:00
Steve Horsman	843655c352	Merge pull request #11411 from stevenhorsman/runk-users-crate-switch runk: Switch users crate	2025-06-12 10:35:31 +01:00
Paul Meyer	71796f7b12	ci/static-checks: install opa Make open-policy-agent available for static checks as prerequisite for rego checks. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 10:46:43 +02:00
Paul Meyer	5baea34fff	genpolicy/rules: rego v1 compatibility Migrate policy to rego v1. See https://www.openpolicyagent.org/docs/v0-upgrade#changes-to-rego-in-opa-v10 Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-12 10:46:43 +02:00
Fupan Li	7c1f8c9009	Merge pull request #10697 from Apokleos/no-sharefs runtime-rs: Support shared_fs = "none" for CoCo	2025-06-12 11:48:00 +08:00
Fupan Li	a495dec9f4	Merge pull request #11305 from RuoqingHe/bump-rust-1.85.1 versions: Bump Rust from 1.80.0 to 1.85.1	2025-06-12 10:21:38 +08:00
Ruoqing He	26c7f941aa	versions: Bump rust to 1.85.1 As discussed in 2025-05-22's AC call, bump rust toolchain to 1.85.1. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	5011253818	agent-ctl: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	ba75b3299f	dragonball: Fix clippy `elided_named_lifetimes` Manually fix `elided_named_lifetimes` clippy warning reported by rust 1.85.1. ```console error: elided lifetime has a name --> src/vm/aarch64.rs:113:10 \| 107 \| fn get_fdt_vm_info<'a>( \| -- lifetime `'a` declared here ... 113 \| ) -> FdtVmInfo { \| ^^^^^^^^^ this elided lifetime gets resolved as `'a` \| = note: `-D elided-named-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(elided_named_lifetimes)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	1bbedb8def	dragonball: Fix clippy `repr_packed_without_abi` Fix `repr_packed_without_abi` clippy warning as suggested by rust 1.85.1. ```console error: item uses `packed` representation without ABI-qualification --> dbs_pci/src/msi.rs:468:1 \| 466 \| #[repr(packed)] \| ------ `packed` representation set here 467 \| #[derive(Clone, Copy, Default, PartialEq)] 468 \| / pub struct MsiState { 469 \| \| msg_ctl: u16, 470 \| \| msg_addr_lo: u32, 471 \| \| msg_addr_hi: u32, 472 \| \| msg_data: u16, 473 \| \| mask_bits: u32, 474 \| \| } \| \|_^ \| = warning: unqualified `#[repr(packed)]` defaults to `#[repr(Rust, packed)]`, which has no stable ABI = help: qualify the desired ABI explicity via `#[repr(C, packed)]` or `#[repr(Rust, packed)]` = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#repr_packed_without_abi = note: `-D clippy::repr-packed-without-abi` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::repr_packed_without_abi)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	e8be3c13fb	dragonball: Fix clippy `missing_docs` Fix `missing_docs` clippy warning as suggested by rust 1.85.1. ```console error: missing documentation for an associated function --> src/device_manager/mod.rs:1299:9 \| 1299 \| pub fn new_test_mgr() -> Self { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `-D missing-docs` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(missing_docs)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	ceff1ed98d	dragonball: Fix clippy `needless_lifetimes` Fix `needless_lifetimes` clippy warning as suggested by rust 1.85.1. ```console error: the following explicit lifetimes could be elided: 'a --> dbs_virtio_devices/src/vhost/vhost_user/connection.rs:137:6 \| 137 \| impl<'a, AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'a, AS, Q, R> { \| ^^ ^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes = note: `-D clippy::needless-lifetimes` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_lifetimes)]` help: elide the lifetimes \| 137 - impl<'a, AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'a, AS, Q, R> { 137 + impl<AS: GuestAddressSpace, Q: QueueT, R: GuestMemoryRegion> EndpointParam<'_, AS, Q, R> { \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	c04f1048d5	dragonball: Fix clippy `unnecessary_lazy_evaluations` Fix `unnecessary_lazy_evaluations` clippy warning as suggested by rust 1.85.1. ```console error: unnecessary closure used to substitute value for `Option::None` --> dbs_virtio_devices/src/vhost/vhost_user/block.rs:225:28 \| 225 \| let vhost_socket = config_path \| ____________________________^ 226 \| \| .strip_prefix("spdk://") 227 \| \| .ok_or_else(\|\| VirtIoError::InvalidInput)? \| \|_____________________________________________________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_lazy_evaluations = note: `-D clippy::unnecessary-lazy-evaluations` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unnecessary_lazy_evaluations)]` help: use `ok_or` instead \| 227 \| .ok_or(VirtIoError::InvalidInput)? \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> unnecessary_lazy_evaluations Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	16b45462a1	dragonball: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console error: using `map_err` over `inspect_err` --> dbs_virtio_devices/src/net.rs:753:52 \| 753 \| self.device_info.read_config(offset, data).map_err(\|e\| { \| ^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect = note: `-D clippy::manual-inspect` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_inspect)]` help: try \| 753 ~ self.device_info.read_config(offset, data).inspect_err(\|e\| { 754 ~ self.metrics.cfg_fails.inc(); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	5e80293bfc	dragonball: Fix clippy `empty_line_after_doc_comments` Fix `empty_line_after_doc_comments` clippy warning as suggested by rust 1.85.1. ```console error: empty line after doc comment --> dbs_boot/src/x86_64/layout.rs:11:1 \| 11 \| / /// Magic addresses externally used to lay out x86_64 VMs. 12 \| \| \| \|_^ 13 \| /// Global Descriptor Table Offset 14 \| pub const BOOT_GDT_OFFSET: u64 = 0x500; \| ------------------------------ the comment documents this constant \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it help: if the documentation should include the empty line include it in the comment \| 12 \| /// \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	bb13b6696e	dragonball: Fix clippy `manual_div_ceil` Fix `manual_div_ceil` clippy warning as suggested by rust 1.85.1. ```console error: manually reimplementing `div_ceil` --> dbs_interrupt/src/kvm/mod.rs:202:24 \| 202 \| let elem_cnt = (total_sz + elem_sz - 1) / elem_sz; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `total_sz.div_ceil(elem_sz)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil = note: `-D clippy::manual-div-ceil` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_div_ceil)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	e58bd52dd8	dragonball: Fix clippy `precedence` Fix `precedence` clippy warning as suggested by rust 1.85.1. ```console error: operator precedence can trip the unwary --> dbs_interrupt/src/kvm/mod.rs:169:6 \| 169 \| (u64::from(type1) << 48 \| u64::from(entry.type_) << 32) \| u64::from(entry.gsi) \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider parenthesizing your expression: `(u64::from(type1) << 48) \| (u64::from(entry.type_) << 32)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#precedence = note: `-D clippy::precedence` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::precedence)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	44142b13d3	genpolicy: Fix clippy `unstable_name_collisions` Manually fix `unstable_name_collisions` clippy warning reported by rust 1.85.1. ```console error: a method with this name may be added to the standard library in the future --> src/registry.rs:646:10 \| 646 \| file.unlock()?; \| ^^^^^^ \| = warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior! = note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919> = help: call with fully qualified syntax `fs2::FileExt::unlock(...)` to keep using the current method = note: `-D unstable-name-collisions` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unstable_name_collisions)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	366d293141	genpolicy: Fix clippy `manual_unwrap_or_default` Manually fix `manual_unwrap_or_default` clippy warning reported by rust 1.85.1. ```console error: if let can be simplified with `.unwrap_or_default()` --> src/registry.rs:619:37 \| 619 \| let mut data: Vec<ImageLayer> = if let Ok(vec) = serde_json::from_reader(read_file) { \| _____________________________________^ 620 \| \| vec 621 \| \| } else { ... \| 624 \| \| }; \| \|_____^ help: replace it with: `serde_json::from_reader(read_file).unwrap_or_default()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_unwrap_or_default = note: `-D clippy::manual-unwrap-or-default` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_unwrap_or_default)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	a71a77bfa3	genpolicy: Fix clippy `manual_div_ceil` Manually fix `manual_div_ceil` clippy warning reported by rust 1.85.1. ```console error: manually reimplementing `div_ceil` --> src/verity.rs:73:25 \| 73 \| let count = (data_size + entry_size - 1) / entry_size; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `data_size.div_ceil(entry_size)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil = note: `-D clippy::manual-div-ceil` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_div_ceil)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	5d491bd4f4	genpolicy: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	965f1d799c	kata-ctl: Fix clippy `empty_line_after_outer_attr` Manually fix `empty_line_after_outer_attr` clippy warning reported by rust 1.85.1. ```console error: empty line after outer attribute --> src/check.rs:515:9 \| 515 \| / #[allow(dead_code)] 516 \| \| \| \|_^ 517 \| struct TestData<'a> { \| ------------------- the attribute applies to this struct \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_outer_attr = note: `-D clippy::empty-line-after-outer-attr` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_outer_attr)]` = help: if the empty line is unintentional remove it ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	3d64b11454	kata-ctl: Fix clippy `question_mark` Manually fix `question_mark` clippy warning reported by rust 1.85.1. ```console error: this `match` expression can be replaced with `?` --> src/ops/check_ops.rs:49:13 \| 49 \| let f = match get_builtin_check_func(check) { \| _____________^ 50 \| \| Ok(fp) => fp, 51 \| \| Err(e) => return Err(e), 52 \| \| }; \| \|_____^ help: try instead: `get_builtin_check_func(check)?` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#question_mark = note: `-D clippy::question-mark` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::question_mark)]` ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	702ba4033e	kata-ctl: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	f70c17660a	runtime-rs: Fix clippy `unnecessary_map_or` Fix `unnecessary_map_or` clippy warning as suggested by rust 1.85.1. error: this `map_or` can be simplified --> crates/hypervisor/src/ch/inner_hypervisor.rs:1054:24 \| 1054 \| let have_tdx = fs::read(TDX_KVM_PARAMETER_PATH) \| ________________________^ 1055 \| \| .map_or(false, \|content\| !content.is_empty() && content[0] == b'Y'); \| \|_______________________________________________________________________________^ help: use is_ok_and instead: `fs::read(TDX_KVM_PARAMETER_PATH).is_ok_and(\|content\| !content.is_empty() && content[0] == b'Y')` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_map_or = note: `-D clippy::unnecessary-map-or` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unnecessary_map_or)]` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	d7dfab92be	runtime-rs: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console error: using `map` over `inspect` --> crates/resource/src/cdi_devices/container_device.rs:50:10 \| 50 \| .map(\|device\| { \| ^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect = note: `-D clippy::manual-inspect` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::manual_inspect)]` help: try \| 50 ~ .inspect(\|device\| { 51 \| // push every device's Device to agent_devices 52 ~ devices_agent.push(device.device.clone()); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	4c467f57de	runtime-rs: Fix clippy `needless_return` Fix `needless_return` clippy warning as suggested by rust 1.85.1. ```console error: unneeded `return` statement --> crates/resource/src/rootfs/nydus_rootfs.rs:199:5 \| 199 \| return Some(prefetch_list_path.display().to_string()); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_return = note: `-D clippy::needless-return` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::needless_return)]` help: remove `return` \| 199 - return Some(prefetch_list_path.display().to_string()); 199 + Some(prefetch_list_path.display().to_string()) \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	23365fc7e2	runtime-rs: Bump `ttrpc-codegen` related dependencies Bump `ttrpc-codegen` related dependencies in response to `ttrpc-codegen` bump in `libs/protocol`. Relates: #11376 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Ruoqing He	bd4d9cf67c	agent: Fix clippy `empty_line_after_doc_comments` Manually fix `empty_line_after_doc_comments` clippy warning reported by rust 1.85.1. ```console error: empty line after doc comment --> src/linux_abi.rs:8:1 \| 8 \| / /// Linux ABI related constants. 9 \| \| \| \|_^ 10 \| #[cfg(target_arch = "aarch64")] 11 \| use std::fs; \| ------- the comment documents this import \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 13:50:10 +00:00
Paul Meyer	d488c998c7	genpolicy/rules: fix syntax issue Policy wasn't parsable with OPA due to surplus whitespace. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-06-11 14:48:36 +02:00
Steve Horsman	c8fcda0d73	Merge pull request #11407 from Champ-Goblem/fix/nvidia-rootfs-only-copy-opa-when-agent-policy-enabled nvidia-rootfs: only copy `kata-opa` if `AGENT_POLICY` is enabled	2025-06-11 13:39:07 +01:00
stevenhorsman	39f51b4c6d	runk: Switch users crate The users@0.11.0 has a high severity CVE-2025-5791 and doesn't seem to be maintained, so switch to uzers which forked it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-11 12:03:28 +01:00
Champ-Goblem	d6c45027f5	nvidia-rootfs: only copy `kata-opa` if `AGENT_POLICY` is enabled In the nvidia rootfs build, only copy in `kata-opa` if `AGENT_POLICY` is enabled. This fixes builds when `AGENT_POLICY` is disabled and opa is not built. Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-06-11 11:25:10 +02:00
Ruoqing He	2ccb306c0b	agent: Fix clippy `precedence` Fix `precedence` clippy warning as suggested by rust 1.85.1. ```console warning: operator precedence can trip the unwary --> src/pci.rs:54:19 \| 54 \| Ok(SlotFn(ss8 << FUNCTION_BITS \| f8)) \| ^^^^^^^^^^^^^^^^^^^^^^^^^ help: consider parenthesizing your expression: `(ss8 << FUNCTION_BITS) \| f8` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#precedence ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	048178bc5e	agent: Fix clippy `unnecessary_get_then_check` Manually fix `unnecessary_get_then_check` clippy warning as suggested by rust 1.85.1. ```console warning: unnecessary use of `get(&shared_mount.src_ctr).is_none()` --> src/sandbox.rs:431:25 \| 431 \| if src_ctrs.get(&shared_mount.src_ctr).is_none() { \| ---------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| \| \| help: replace it with: `!src_ctrs.contains_key(&shared_mount.src_ctr)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_get_then_check ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	54ec432178	agent: Fix clippy `partialeq_to_none` Fix `partialeq_to_none` clippy warning as suggested by rust 1.85.1. ```console warning: binary comparison to literal `Option::None` --> src/sandbox.rs:431:16 \| 431 \| if src_ctrs.get(&shared_mount.src_ctr) == None { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use `Option::is_none()` instead: `src_ctrs.get(&shared_mount.src_ctr).is_none()` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#partialeq_to_none ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	95dca31ecc	agent: Fix clippy `question_mark` Fix `question_mark` clippy warning as suggested by rust 1.85.1. ```console warning: this `match` expression can be replaced with `?` --> rustjail/src/cgroups/fs/mod.rs:1327:20 \| 1327 \| let dev_type = match DeviceType::from_char(d.typ().as_str().chars().next()) { \| ____________________^ 1328 \| \| Some(t) => t, 1329 \| \| None => return None, 1330 \| \| }; \| \|_____^ help: try instead: `DeviceType::from_char(d.typ().as_str().chars().next())?` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#question_mark ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	5a95a65604	agent: Fix clippy `unnecessary_map_or` Fix `unnecessary_map_or` clippy warning as suggested by rust 1.85.1. ```console warning: this `map_or` can be simplified --> rustjail/src/container.rs:1424:20 \| 1424 \| if namespace \| ____________________^ 1425 \| \| .path() 1426 \| \| .as_ref() 1427 \| \| .map_or(true, \|p\| p.as_os_str().is_empty()) \| \|_______________________________________________________________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_map_or help: use is_none_or instead \| 1424 ~ if namespace 1425 + .path() 1426 + .as_ref().is_none_or(\|p\| p.as_os_str().is_empty()) \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	f9c76edd23	agent: Fix clippy `manual_inspect` Manually fix `manual_inspect` clippy warning reported by rust 1.85.1. ```console warning: using `map_err` over `inspect_err` --> rustjail/src/mount.rs:881:6 \| 881 \| .map_err(\|e\| { \| ^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_inspect help: try \| 881 ~ .inspect_err(\|&e\| { 882 ~ log_child!(cfd_log, "mount error: {:?}", e); \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Ruoqing He	7ff34f00c2	agent: Fix clippy `single_match` Fix `single_match` clippy warning as suggested by rust 1.85.1. ```console warning: you seem to be trying to use `match` for destructuring a single pattern. Consider using `if let` --> src/image.rs:241:9 \| 241 \| / match oci.annotations() { 242 \| \| Some(a) => { 243 \| \| if ImageService::is_sandbox(a) { 244 \| \| return ImageService::get_pause_image_process(); ... \| 247 \| \| None => {} 248 \| \| } \| \|_________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#single_match help: try \| 241 ~ if let Some(a) = oci.annotations() { 242 + if ImageService::is_sandbox(a) { 243 + return ImageService::get_pause_image_process(); 244 + } 245 + } \| ``` Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-11 07:18:09 +00:00
Alex Lyn	e99070afb4	Merge pull request #11343 from Apokleos/cc-blk-sharefs Enables block device and disable virtio-fs	2025-06-11 11:52:52 +08:00
Alex Lyn	2d570db08b	Merge pull request #11179 from Apokleos/tdx-qemu-rs runtime-rs: Add TDX Support to runtime-rs for Confidential Containers (CoCo)	2025-06-11 10:27:36 +08:00
alex.lyn	2e9d27c500	runtime-rs: Enables block device and disable virtio-fs via capabilities Kata runtime employs a CapabilityBits mechanism for VMM capability governance. Fundamentally, this mechanism utilizes predefined feature flags to manage the VMM's operational boundaries. To meet demands for storage performance and security, it's necessary to explicitly enable capability flags such as `BlockDeviceSupport` (basic block device support) and `BlockDeviceHotplugSupport` (block device hotplug) which ensures the VMM provides the expected caps. In CoCo scenarios, due to the potential risks of sensitive data leaks or side-channel attacks introduced by virtio-fs through shared file systems, the `FsSharingSupport` flag must be forcibly disabled. This disables the virtio-fs feature at the capability set level, blocking insecure data channels. Fixes #11341 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-11 10:19:13 +08:00
alex.lyn	23340b6b5f	runtime-rs: Support cold plug of block devices via virtio-blk for Qemu Two key scenarios: (1) Support the `virtio-blk-pci` cold plug capability for confidential guests instead of an nvdimm device in the CVM, due to security constraints in CoCo cases. (2) Push the initdata payload into a compressed raw block device and insert it into the CVM through the `virtio-blk-pci` cold plug mechanism. Fixes #11341 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-11 10:19:13 +08:00
RuoqingHe	7916db9613	Merge pull request #11345 from Apokleos/fix-noise protocols: Fix the noise caused by non-formatted codes in protocols	2025-06-11 09:50:02 +08:00
Aurélien Bombo	66ae9473cb	Merge pull request #11397 from kata-containers/sprt/validate-ok-to-test ci: gha: Remove ok-to-test label on every push	2025-06-10 16:42:54 -05:00
Aurélien Bombo	31288ea7fc	Merge pull request #11398 from kata-containers/sprt/undo-mariner-hotfix Revert "ci: Fix Mariner rootfs build failure"	2025-06-10 16:09:08 -05:00
Aurélien Bombo	f34010cc94	Merge pull request #11388 from kata-containers/sprt/azure-oidc ci: Use OIDC to log into Azure	2025-06-10 13:08:44 -05:00
Steve Horsman	6424055eeb	Merge pull request #11393 from stevenhorsman/bump-chrono-0.4.41 libs: Bump chrono package	2025-06-10 16:47:18 +01:00
stevenhorsman	99e70100c7	workflows: Set persist-credentials: false on checkout By default the checkout action leaves the credentials in the checked-out repo's `.git/config`, which means they could get exposed. Use persist-credentials: false to prevent this from happening. Note: static-checks.yaml does use git diff after the checkout, but the git docs state that git diff is just local, so doesn't need authentication. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-10 10:33:41 +01:00
RuoqingHe	5b8f7b2e3c	Merge pull request #11391 from RuoqingHe/disable-runtime-rs-test-on-riscv runtime-rs: Skip test on RISC-V architecture	2025-06-10 17:28:12 +08:00
Xuewei Niu	ac6779428f	Merge pull request #11377 from justxuewei/hvsock-logging	2025-06-10 16:45:59 +08:00
alex.lyn	c8433c6b70	kata-sys-util: Update TDX platform detection for newer TDX platforms On newer TDX platforms, checking `/sys/firmware/tdx` for `major_version` and `minor_version` is no longer necessary. Instead, we only need to verify that `/sys/module/kvm_intel/parameters/tdx` is set to `'Y'`. This commit addresses the following: (1) Removes the outdated check and corrects related code, primarily impacting `cloud-hypervisor`. (2) Refines the TDX platform detection logic within `arch_guest_protection`. Fixes #11177 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	8652aa7417	kata-types: Enable QGS port via configuration Currently, the TDX Quote Generation Service (QGS) connection in QEMU uses the default vsock port 4050 for TD attestation. To let users change the QGS port, this commit builds on the introduced qgs_port and allows the QGS port to be set via configuration. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	f8d1ee8b1c	kata-types: Introduce QGS port for TD attestation in Hypervisor config Currently, the TDX Quote Generation Service (QGS) connection in QEMU is hardcoded to vsock port 4050, which limits flexibility for TD attestation; users should be able to modify the QGS port. To address this inflexibility, this commit introduces a new qgs_port field within security info and makes it default to 4050. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	49ced4d43c	runtime-rs: Prepare Tdx protection device in start sandbox During the prepare for `start sandbox` phase, this commit ensures the correct `ProtectionDeviceConfig` is prepared based on the `GuestProtection` type in a TEE platform. Specifically, for the TDX platform, this commit sets the essential parameters within the ProtectionDeviceConfig, including the TDX ID, firmware path, and the default QGS port (4050). This information is then passed to the underlying VMM for further processing using the existing ResourceManager and DeviceManager infrastructure. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	bab77e2d65	runtime-rs: Introduce Tdx Protection Device and add it into cmdline This patch introduces TdxConfig with key fields: firmware, qgs_port, mrconfigid, and other useful things. With this config, a new ProtectionDeviceConfig type `Tdx(TdxConfig)` is added. With this new type supported, we finally add the tdx protection device to the cmdline to launch a TDX-based CVM. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	09fddac2c4	runtime-rs: Introduce 'tdx-guest' object and its builder for TDX CVMs This commit introduces the `tdx-guest` object, designed to facilitate the launch of CVMs leveraging Intel's TDX. Launching a TDX-based CVM requires various properties, including `quote-generation-socket`, `mrconfigid`, `sept-ve-disable`, etc. (1) The `quote-generation-socket` property is added to the `tdx-guest` object; it is of type `SocketAddress` and specifies the address of the Quote Generation Service (QGS). (2) The `mrconfigid` property, representing the SHA384 hash for non-owner-defined configurations of the guest TD, is introduced as a runtime or OS configuration parameter. (3) The `sept-ve-disable` property allows control over whether EPT violation conversions to #VE exceptions are disabled when the guest TD accesses PENDING pages. With the introduction of the `tdx-guest` object and its associated properties, launching TDX-based CVMs is now supported. For example, a TDX guest can be configured via the command line as follows: ```shell -object {"qom-type":"tdx-guest", "id":"tdx", "sept-ve-disable":true,\ "mrconfigid":"vHswGkzG4B3Kikg96sLQ5vPCYx4AtuB4Ubfzz9UOXvZtCGat8b8ok7Ubz4AxDDHh",\ "quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}} \ -machine q35,accel=kvm,confidential-guest-support=tdx ``` Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	1d4ffe6af3	runtime-rs: Implement serializable SocketAddress with Serde This enables consistent JSON representation of socket addresses across system components: (1) Add serde serialization/deserialization with standardized field naming convention. (2) Enforce string-based port/cid and unix/path representation for protocol compatibility. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:31:25 +08:00
alex.lyn	65931fb75f	protocols: Fix the noise caused by non-formatted codes in protocols ``` - decoded.strip_prefix("CAP_").unwrap_or(decoded) + decoded + .strip_prefix("CAP_") + .unwrap_or(decoded) .parse::<oci::Capability>() .unwrap_or_else(\|_\| panic!("Failed to parse {:?} to Enum Capability", cap)) }) @@ -1318,8 +1320,6 @@ mod tests { #[test] #[should_panic] fn test_cap_vec2hashset_bad() { - cap_vec2hashset(vec![ - "CAP_DOES_NOT_EXIST".to_string(), - ]); + cap_vec2hashset(vec!["CAP_DOES_NOT_EXIST".to_string()]); ``` Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:30:33 +08:00
alex.lyn	f3c8ef9200	kata-types: Support disabled sharefs with config of shared_fs = "none" For CoCo, shared_fs is prohibited as we cannot guarantee the security of guest/host sharing. Therefore, this PR enables administrators to configure shared_fs = "none" via the configuration.toml file, thereby enforcing the disablement of sharing. Fixes #10677 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-06-10 11:30:01 +08:00
Dan Mihai	d37feac679	tests: test mariner with disable_image_nvdimm=true Run the k8s tests on mariner with annotation disable_image_nvdimm=true, to use virtio-blk instead of nvdimm for the guest rootfs block device. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 02:03:31 +00:00
Dan Mihai	1aeef52bae	clh: runtime: add disable_image_nvdimm support Allow users to build using DEFDISABLEIMAGENVDIMM=true if they want to set disable_image_nvdimm=true in configuration-clh.toml. disable_image_nvdimm=false is the default config value. Also, use virtio-blk instead of nvdimm if disable_image_nvdimm=true in configuration-clh.toml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 02:00:52 +00:00
Dan Mihai	0dd9325264	qemu: runtime: build variable for disable_image_nvdimm=true Allow users to build using DEFDISABLEIMAGENVDIMM=true if they want to set disable_image_nvdimm=true in configuration-qemu*.toml. disable_image_nvdimm=false is the default configuration value. Note that the value of disable_image_nvdimm gets ignored for platforms using "confidential_guest = true". Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 01:57:42 +00:00
Dan Mihai	d51e0c9875	snp: gpu: comment out disable_image_nvdimm config Comment out "disable_image_nvdimm = true" in: - configuration-qemu-snp.toml - configuration-qemu-nvidia-gpu-snp.toml for consistency with the other configuration-qemu*.toml files. Those two platforms are using "confidential_guest = true", and therefore the value of disable_image_nvdimm gets ignored. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-10 01:44:51 +00:00
stevenhorsman	ac9d3eb7be	libs: Bump chrono package Bump chrono package to 0.4.41 and thereby remove the time 0.1.43 dependency and remediate CVE-2020-26235 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-09 21:01:27 +01:00
Aurélien Bombo	004c1a4595	Revert "ci: Fix Mariner rootfs build failure" This reverts commit `dfa25a42ff`. The original issue was fixed: https://github.com/microsoft/azurelinux/issues/13971#issuecomment-2956384627	2025-06-09 14:06:07 -05:00
Aurélien Bombo	2ee3470627	ci: gha: Remove ok-to-test label on every push This removes the ok-to-test label on every push, except if the PR author has write access to the repo (ie. permission to modify labels). This protects against attackers who would initially open a genuine PR, then push malicious code after the initial review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-09 12:37:06 -05:00
Aurélien Bombo	9488ce822d	Merge pull request #11396 from kata-containers/sprt/fix-mariner-image ci: Fix Mariner rootfs build failure	2025-06-09 12:32:14 -05:00
Aurélien Bombo	dfa25a42ff	ci: Fix Mariner rootfs build failure This implements a workaround for microsoft/azurelinux#13971 to unblock the CI. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-09 10:56:10 -05:00
Alex Lyn	2979312f7b	Merge pull request #11381 from RuoqingHe/log-instead-of-format runtime-rs: Log error instead of format	2025-06-09 11:54:13 +08:00
Ruoqing He	e290587f9c	runtime-rs: Skip test on RISC-V architecture Full set test on RISC-V architecture is not yet supported, skip it for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-09 01:49:47 +00:00
Ruoqing He	781510202a	runtime-rs: Log error instead of format Log on error condition when the `umount` operation fails instead of building a `format!` error message. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-08 08:28:22 +00:00
Xuewei Niu	17b2daf0a7	Merge pull request #11357 from justxuewei/nxw/remove-dcode dragonball: Remove a useless dead_code attribute	2025-06-08 16:07:03 +08:00
Dan Mihai	e067a1be64	Merge pull request #11358 from burgerdev/gid-warning genpolicy: improvements to /etc/passwd checks	2025-06-06 17:04:27 -07:00
Aurélien Bombo	9dd3807467	ci: Use OIDC to log into Azure This completely eliminates the Azure secret from the repo, following the below guidance: https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-azure The federated identity is scoped to the `ci` environment, meaning: * I had to specify this environment in some YAMLs. I don't believe there's any downside to this. * As previously, the CI works seamlessly both from PRs and in the manual workflow. I also deleted the tools/packaging/kata-deploy/action folder as it doesn't seem to be used anymore, and it contains a reference to the secret. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-06-06 15:26:10 -05:00
Steve Horsman	31a8944da1	Merge pull request #11334 from kata-containers/remove-inherit-secrets workflows: Replace secrets: inherit	2025-06-06 16:41:13 +01:00
Steve Horsman	9555f2ce08	Merge pull request #11387 from burgerdev/riscv-artifact-name ci: fix artifact name of RISC-V tarball	2025-06-06 15:50:21 +01:00
stevenhorsman	66ef1c1198	workflows: Replace secrets: inherit Having secrets unconditionally being inherited is bad practice, so update the workflows to only pass through the minimal secrets that are needed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:56:46 +01:00
stevenhorsman	89d038d2b4	workflows: Switch QUAY_DEPLOYER_USERNAME to var QUAY_DEPLOYER_USERNAME isn't sensitive, so update the secret for a var to simplify the workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:49:14 +01:00
stevenhorsman	2eda21180a	workflows: Switch AUTHENTICATED_IMAGE_USER to var AUTHENTICATED_IMAGE_USER isn't sensitive, so update the secret for a var to simplify the workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-06 09:49:14 +01:00
Markus Rudy	9ffed463a1	ci: fix artifact name of RISC-V tarball The artifact name accidentally referred to ARM64, which caused a clash in CI runs. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-06 08:29:48 +02:00
RuoqingHe	567296119d	Merge pull request #11317 from kimullaa/remove-obsolete-parameters runtime: remove hotplug_vfio_on_root_bus from config.toml	2025-06-06 04:03:03 +02:00
Steve Horsman	9ff650b641	Merge pull request #11383 from stevenhorsman/remove-docker-hub-publish Switch docker hub mirroring to ghcr.io	2025-06-05 17:16:18 +01:00
Shunsuke Kimura	5193cfedca	runtime: remove hotplug_vfio_on_root_bus from toml In this commit, the hotplug_vfio_on_root_bus parameter is removed. <`dd422ccb69`> The pcie_root_port parameter description (`This value is valid when hotplug_vfio_on_root_bus is true and machine_type is "q35"`) no longer has any meaning and is not completely valid, since virt or DB also support root ports, and so does CLH, so it is removed. Fixes: #11316 Co-authored-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-06-05 21:53:06 +09:00
Steve Horsman	0f8104a2df	Merge pull request #11376 from RuoqingHe/upgrade-ttrpc-0.5.0 Upgrade `ttrpc-codegen` and `protobuf` to kill `#![allow(box_pointers)]`	2025-06-05 13:02:13 +01:00
stevenhorsman	6c6e16eef3	workflows: Remove docker hub registry publishing As docker hub has rate limiting issues, mirror quay.io to ghcr.io instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-05 11:46:51 +01:00
Markus Rudy	1c240de58d	genpolicy: don't parse /etc/passwd in a loop Instead of looping over the users per group and parsing passwd for each user, we can do the reverse lookup uid->user up front and then compare the names directly. This has the nice side-effect of silencing warnings about non-existent users mentioned in /etc/group, which is not relevant for policy decisions. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-04 17:54:57 +02:00
Markus Rudy	a1baaf6fe2	genpolicy: ignore groups with same name as user containerd does not automatically add groups to the list of additional GIDs when the groups have the same name as the user: https://github.com/containerd/containerd/blob/f482992/pkg/oci/spec_opts.go#L852-L854 This is a bug and should be corrected, but it has been present since at least 1.6.0 and thus affects almost all containerd deployments in existence. Thus, we adopt the same behavior and ignore groups with the same name as the user when calculating additional GIDs. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-04 10:29:49 +02:00
Xuewei Niu	77ca2fe88b	runtime-rs: Reduce the number of duplicate log entries being printed When connecting to guest through vsock, a log is printed for each failure. The failure comes from two main reasons: (1) the guest is not ready or (2) some real errors happen. Printing logs for the first case leads to log clutter, and your logs will look like this: ``` Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... Feb 07 02:47:24 ubuntu containerd[520]: {"msg":"connect uds \"/run/kata/... ``` To avoid this, the sock implementations save the last error and return it after all retries are exhausted. Users are able to check all errors by setting the log level to trace. Reorganize the log format to "{sock type}: {message}" to make it clearer. Apart from that, errors returned by the socks use `self`, instead of `ConnectConfig`, since the `ConnectConfig` doesn't provide any useful information. Disable infinite loop for the log forwarder. There is retry logic in the sock implementations. We can consider the agent-log unavailable if `sock.connect()` encounters an error. Fixes: #10847 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-06-04 12:25:32 +08:00
Xuewei Niu	3f8dd821e6	dragonball: Remove a useless dead_code attribute The vhost-user-fs has been added to Dragonball, so we can remove `update_memory`'s dead_code attribute. Fixes: #8691 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2025-06-04 11:34:16 +08:00
Ruoqing He	77e68b164e	agent: Upgrade `ttrpc-codegen` to 0.5.0 Propagate `ttrpc-codegen` upgrade from `libs/protocols` to `agent`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-04 01:16:46 +00:00
Ryan Savino	1e686dbca7	agent: Remove casting and fix Arc declaration Removed unnecessary dynamic dispatch for services. Properly dereferenced service Box values and stored in Arc. Co-authored-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-06-04 01:16:46 +00:00
Ruoqing He	0471f01074	libs: Bump `ttrpc-codegen` and `protobuf` Previous version of `ttrpc-codegen` is generating outdated `#![allow(box_pointers)]` which was deprecated. Bump `ttrpc-codegen` from v0.4.2 to v0.5.0 and `protobuf` from vx to v3.7.1 to get rid of this. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-06-04 01:16:18 +00:00
Markus Rudy	eeb3d1384b	genpolicy: compare additionalGIDs as sets The additional GIDs are handled by genpolicy as a BTreeSet. This set is then serialized to an ordered JSON array. On the containerd side, the GIDs are added to a list in the order they are discovered in /etc/group, and the main GID of the user is prepended to that list. This means that we don't have any guarantees that the input GIDs will be sorted. Since the order does not matter here, comparing the list of GIDs as sets is close enough. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 20:18:35 +02:00
Aurélien Bombo	8c3f8f8e21	Merge pull request #11339 from kata-containers/sprt/require-agent-ctl ci: Require agent-ctl tests	2025-06-03 11:58:33 -04:00
Steve Horsman	74e47382f8	Merge pull request #11016 from stevenhorsman/dependabot-configuration workflows: Add dependabot config	2025-06-03 15:12:32 +01:00
Steve Horsman	8176eefdac	Merge pull request #10748 from zvonkok/helm-doc doc: Add Helm Chart entry	2025-06-03 14:48:19 +01:00
Markus Rudy	02ad39ddf1	genpolicy: push down warning about missing passwd file The warning used to trigger even if the passwd file was not needed. This commit moves it down to where it actually matters. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 11:19:29 +02:00
Markus Rudy	ec969e4dcd	genpolicy: remove redundant group check https://github.com/kata-containers/kata-containers/pull/11077 established that the GID from the image config is never used for deriving the primary group of the container process. This commit removes the associated logic that derived a GID from a named group. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-06-03 10:59:10 +02:00
Zvonko Kaiser	985e965adb	doc: Added Helm Chart README.md We need more and accurate documentation. Let's start by providing a Helm Chart install doc and as a second step remove the kustomize steps. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Co-authored-by: Steve Horsman <steven@uk.ibm.com>	2025-06-02 23:26:16 +00:00
Dan Mihai	dc0da567cd	Merge pull request #11340 from microsoft/danmihai1/image-size-alignment image: custom guest rootfs image file size alignment	2025-06-02 14:33:21 -07:00
Dan Mihai	c2c194d860	kata-deploy: smaller guest image file for mariner Align up the mariner Guest image file size to 2M instead of the default 128M alignment. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-02 16:15:17 +00:00
Dan Mihai	65385a5bf9	image: custom guest rootfs image file size alignment The Guest rootfs image file size is aligned up to 128M boundary, since commit `2b0d5b2`. This change allows users to use a custom alignment value - e.g., to align up to 2M, users will be able to specify IMAGE_SIZE_ALIGNMENT_MB=2 for image_builder.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-06-02 16:15:17 +00:00
Steve Horsman	c575048aa7	Merge pull request #11329 from Xynnn007/fix-initdata-snp Fix \| Support initdata for SNP	2025-06-02 15:24:12 +01:00
stevenhorsman	ae352e7e34	ci: Add dependabot groups - Create groups for commonly seen cargo packages so that rather than getting up to 9 PRs for each rust components, bumps to the same package are grouped together. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-02 14:45:31 +01:00
stevenhorsman	a94388cf61	ci: Add dependabot config - Create a dependabot configuration to check for updates to our rust and golang packages each day and our github actions each month Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-06-02 14:45:31 +01:00
Xynnn007	8750eadff2	test: turn SNP on for initdata tests After the last commit, the initdata test on SNP should be ok. Thus we turn on this flag for CI. Fixes #11300 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-02 20:33:19 +08:00
Xynnn007	39aa481da1	runtime: fix initdata support for SNP The QEMU command line for SNP should start with `sev-snp-guest`, followed by the other parameters separated by ','. This patch fixes the parameter order. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-06-02 20:33:19 +08:00
Fabiano Fidêncio	57f3cb8b3b	Merge pull request #11344 from fidencio/topic/kernel-add-tuntap-move-memagent-stuff kernel: Add CONFIG_TUN (needed for VPNs) and move mem-agent related configs to common	2025-06-01 21:32:07 +02:00
RuoqingHe	51cc960cdd	Merge pull request #11346 from fidencio/topic/bump-cgroups-rs rust: Update cgroups-rs to its v0.3.5 release	2025-05-31 04:13:05 +02:00
Fabiano Fidêncio	48f8496209	Merge pull request #11327 from Champ-Goblem/agent/increase-limit-nofile agent: increase LimitNOFILE in the systemd service	2025-05-30 21:56:01 +02:00
Fabiano Fidêncio	02c46471fd	rust: Update cgroups-rs to its v0.3.5 release We're switching to using a rev as it may take some time for the package to be updated on crates.io. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:50 +02:00
Fabiano Fidêncio	dadbfd42c8	kernel: Move mem-agent configs to the common kernel build There's no benefit on keeping those restricted to the dragonball build, when they can be used with other VMMs as well (as long as they support the mem-agent). Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:22 +02:00
Champ-Goblem	a37080917d	kernel: Add CONFIG_TUN for VPN services TUN/TAP is a must for VPN related services. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-30 21:49:22 +02:00
Fabiano Fidêncio	b8a7350a3d	Merge pull request #11324 from Champ-Goblem/runtime/fix-cgroup-deletion runtime: fix cgroupv2 deletion when sandbox_cgroup_only=false	2025-05-30 21:23:07 +02:00
Champ-Goblem	ef642fe890	runtime: fix cgroupv2 deletion when sandbox_cgroup_only=false Currently, when a new sandbox resource controller is created with cgroupsv2 and sandbox_cgroup_only is disabled, the cgroup management falls back to cgroupfs. During deletion, `IsSystemdCgroup` checks if the path contains `:` and tries to delete the cgroup via systemd. However, the cgroup was originally set up via cgroupfs and this process fails with `lstat /sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/....scope: no such file or directory`. This patch updates the deletion logic to take into account the sandbox_cgroup_only=false option and in this case uses the cgroupfs delete. Fixes: #11036 Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-05-30 17:51:31 +02:00
Champ-Goblem	f4007e5dc1	agent: increase LimitNOFILE in the systemd service Increase the NOFILE limit in the systemd service, this helps with running databases in the Kata runtime. Signed-off-by: Champ-Goblem <cameron@northflank.com>	2025-05-30 17:49:29 +02:00
Fabiano Fidêncio	3f5dc87284	Merge pull request #11333 from stevenhorsman/csi-driver-permissions-fix workflow: add packages: write to csi-driver publish	2025-05-30 17:45:47 +02:00
Zvonko Kaiser	4586511c01	doc: Add Helm Chart entry Since 3.12 we're shipping the helm-chart per default with each release. Update the documentation to use helm rather than the kata-deploy manifests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-05-30 14:45:01 +00:00
Aurélien Bombo	c03b38c7e3	ci: Require agent-ctl tests This adds `run-kata-agent-apis` to the list of required tests. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-05-29 14:09:42 -05:00
stevenhorsman	586d9adfe5	workflow: add packages: write to csi-driver publish This one was missed in the earlier PR Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-29 15:57:07 +01:00
Steve Horsman	3da213a8c8	Merge pull request #11326 from kata-containers/top-level-workflow-permissions Top level workflow permissions	2025-05-29 10:03:06 +01:00
stevenhorsman	c34416f53a	workflows: Add explicit permissions where needed We have a number of jobs that either need, or nest workflows that need, gh permissions, such as for pushing to ghcr, or doing attest build provenance. This means they need write permissions on things like `packages`, `id-token` and `attestations`, so we need to set these permissions at the job-level (along with `contents: read`), so they are not restricted by our safe defaults. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 19:34:28 +01:00
stevenhorsman	088e97075c	workflow: Add top-level permissions Set: ``` permissions: contents: read ``` as the default top-level permissions explicitly to conform to recommended security practices e.g. https://github.com/ossf/scorecard/blob/main/docs/checks.md#token-permissions	2025-05-28 19:34:28 +01:00
Dan Mihai	353d0822fd	Merge pull request #11314 from katexochen/p/svc-name-regex genpolicy: fix svc_name regex	2025-05-28 10:08:38 -07:00
Steve Horsman	7a9d919e3e	Merge pull request #11322 from kata-containers/workflow-permissions workflows: Add explicit permissions for attestation	2025-05-28 17:28:22 +01:00
Steve Horsman	2667d4a345	Merge pull request #11323 from stevenhorsman/gatekeeper-workflow-permissions-ii workflow: Update gatekeeper permissions	2025-05-28 17:05:24 +01:00
stevenhorsman	4d4fb86d34	workflow: Update gatekeeper permissions I shortsightedly forgot that gatekeeper would need to read more than just the commit content in its python scripts, so add read permissions to actions and issues, which it uses in its processing Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 15:58:27 +01:00
Steve Horsman	fed63e0801	Merge pull request #11319 from stevenhorsman/remove-old-workflows workflows: Delete workflows	2025-05-28 15:38:19 +01:00
Steve Horsman	49f86aaa0d	Merge pull request #11320 from stevenhorsman/gatekeeper-workflow-permissions workflows: gatekeeper: Update permissions	2025-05-28 15:38:06 +01:00
stevenhorsman	3ff602c1e8	workflows: Add explicit permissions for attestation We have a number of jobs that nest the build-static-tarball workflows later on. Due to these doing attest build provenance, and pushing to ghcr.io, they need write permissions on `packages`, `id-token` and `attestations`, so we need to set these permissions on the top-level jobs (along with `contents: read`), so they are not blocked. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 12:56:52 +01:00
stevenhorsman	2f0dc2ae24	workflows: gatekeeper: Update permissions Restrict the permissions of gatekeeper flow to read contents only for better security Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 09:57:19 +01:00
stevenhorsman	f900b0b776	workflows: Delete workflows Some legacy workflows require write access to github which is a security weakness and don't provide much value, so lets remove them. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-28 09:45:42 +01:00
Alex Lyn	aab6caa141	Merge pull request #10362 from Apokleos/vfio-hotplug-runtime-rs runtime-rs: add support hotplugging vfio device for qemu-rs	2025-05-28 13:21:58 +08:00
Fabiano Fidêncio	ac934e001e	Merge pull request #11244 from katexochen/p/guest-pull-config runtime: add option to force guest pull	2025-05-27 16:00:09 +02:00
alex.lyn	e69a4d203a	runtime-rs: Increase QMP read timeout to mitigate failures The original 250ms read timeout frequently causes "Resource Temporarily Unavailable (OS Error 11)" when passing through devices via VFIO in QEMU. The root cause lies in synchronization timeout windows failing to accommodate inherent delays during critical hardware init phases in kernel space. This commit increases the timeout to 5000ms, which was determined through some tests. While not guaranteeing complete resolution for all hardware combinations, this change significantly reduces timeout failures. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-27 21:06:57 +08:00
Paul Meyer	c4815eb3ad	runtime: add option to force guest pull This enables guest pull via config, without the need of any external snapshotter. When the config enables runtime.experimental_force_guest_pull, instead of relying on annotations to select the way to share the root FS, we always use guest pull. Co-authored-by: Markus Rudy <mr@edgeless.systems> Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 12:42:00 +02:00
Fabiano Fidêncio	d3f81ec337	Merge pull request #11240 from Apokleos/copydir runtime-rs: Propagate k8s configs correctly when sharedfs is disabled	2025-05-27 12:41:21 +02:00
Paul Meyer	8de8b8185e	genpolicy: rename svc_name to svc_name_downward_env Just to be more explicit what this matches. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 10:13:43 +02:00
Paul Meyer	78eb65bb0b	genpolicy: fix svc_name regex The service name is specified as RFC 1035 label name [1]. The svc_name regex in the genpolicy settings is applied to the downward API env variables created based on the service name. So it tries to match RFC 1035 labels after they are transformed to downward API variable names [2]. So the set of lower case alphanumerics and dashes is transformed to upper case alphanumerics and underscores. The previous regex wrongly permitted use of numbers, but did allow dot and dash, which shouldn't be allowed (dots because they don't conform to RFC 1035, dashes because they are transformed to underscores). We have to take care not to also try to use the regex in places where we actually want to check for RFC 1035 label instead of the downward API transformed version of it. Further, we should consider using a format like JSON5/JSONC for the policy settings, as these are far from trivial and would highly benefit from proper documentation through comments. [1]: https://kubernetes.io/docs/concepts/services-networking/service/#defining-a-service [2]: `b2dfba4151/pkg/kubelet/envvars/envvars.go (L29-L70)` Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-05-27 08:43:25 +02:00
RuoqingHe	139dc13bdc	Merge pull request #11301 from lifupan/fix_cgroup runtime-rs: fix the issue of delete cgroup failed	2025-05-27 05:05:32 +02:00
Wainer Moschetta	d77e33babf	Merge pull request #11266 from ldoktor/ci-pp-retry ci.ocp: A couple of peer-pods setup improvements	2025-05-26 14:22:11 -03:00
Wainer Moschetta	c249769bb8	Merge pull request #11270 from ldoktor/gk tools.testing: Add methods to simplify gatekeeper development	2025-05-26 12:04:07 -03:00
Fabiano Fidêncio	20d3bc6f37	Merge pull request #10964 from hsiangkao/drop_outdated_patches Drop outdated erofs patches for 6.1.y kernels & fix a dragonball vsock issue	2025-05-26 13:00:25 +02:00
Gao Xiang	b441890749	kernel: drop outdated erofs patches for 6.1.y kernels Patches 0001..0004 have been included upstream as dependencies since Linux 6.1.113. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-26 15:48:24 +08:00
Xingru Li	71b6acfd7e	dragonball: vsock: support single descriptor Since kernel v6.3 the vsock packet is not split over two descriptors and is instead included in a single one. Therefore, we currently decide the specific method of obtaining BufWrapper based on the length of descriptor. Refer: `a2752fe04f` https://git.kernel.org/torvalds/c/71dc9ec9ac7d Signed-off-by: Xingru Li <lixingru.lxr@linux.alibaba.com> [ Gao Xiang: port this patch from the internal branch to address Linux 6.1.63+. ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-26 15:48:19 +08:00
RuoqingHe	b6cafba5f6	Merge pull request #11308 from hsiangkao/enable_tmpfs_xattr kernel: support `CONFIG_TMPFS_XATTR=y`	2025-05-26 05:00:26 +02:00
Gao Xiang	b681dfb594	kernel: support `CONFIG_TMPFS_XATTR=y` Currently, Kata EROFS support needs it, otherwise it will: [ 0.564610] erofs: (device sda): mounted with root inode @ nid 36. [ 0.564858] overlayfs: failed to set xattr on upper [ 0.564859] overlayfs: ...falling back to index=off,metacopy=off. [ 0.564860] overlayfs: ...falling back to xino=off. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2025-05-24 20:43:35 +08:00
RuoqingHe	a9ffdfc2ae	Merge pull request #11294 from wainersm/delint_confidential_kbs tests/k8s: delint confidential_kbs.sh	2025-05-23 17:00:28 +02:00
Fupan Li	e9b45126fc	Merge pull request #11254 from sampleyang/main runtime-rs: fix vfio pci address domain 0001 problem	2025-05-23 18:13:10 +08:00
yangsong	06c7c5bccb	runtime-rs: fix vfio pci address domain 0001 problem Some NVIDIA GPUs have a PCI address in domain 0001, but the runtime currently assumes 0000:bdf by default, which causes address errors during device initialization and address conflicts during device registration. Fixes #11252 Signed-off-by: yangsong <yunya.ys@antgroup.com>	2025-05-23 14:33:06 +08:00
Wainer dos Santos Moschetta	ddf333feaf	tests/k8s: fix shellcheck SC1091 in confidential_kbs.sh Fixed "note: Not following: ./../../../tools/packaging/guest-image/lib_se.sh: openBinaryFile: does not exist (No such file or directory) [SC1091]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 15:38:27 -03:00
Wainer dos Santos Moschetta	c9fb0b9c85	tests/k8s: fix shellcheck SC2154 in confidential_kbs.sh Fixed "warning: HKD_PATH is referenced but not assigned. [SC2154]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 15:02:20 -03:00
Wainer dos Santos Moschetta	68d91d759a	tests/k8s: add `set -e` to confidential_ksh.sh Although the script inherits that setting from the caller scripts, setting it explicitly in the file silences the shellcheck "warning: Use 'pushd ... || exit' or 'pushd ... || return' in case pushd fails. [SC2164]" Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 14:55:24 -03:00
Wainer dos Santos Moschetta	b4adfcb3cb	tests/k8s: apply shellcheck tips to confidential_kbs.sh Addressed the following shellcheck advices: SC2046 (warning): Quote this to prevent word splitting. SC2248 (style): Prefer double quoting even when variables don't contain special characters SC2250 (style): Prefer putting braces around variable references even when not strictly required. SC2292 (style): Prefer [[ ]] over [ ] for tests in Bash/Ksh Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-22 14:52:38 -03:00
alex.lyn	043bab3d3e	runtime-rs: Handle port allocation in PCIe topology for vfio devices It's important to handle port allocation in a PCIe topology before VFIO device hotplug via QMP. The code ensures that VFIO devices are properly allocated to available ports (either root ports or switch ports) and updates the device's bus and port information accordingly. It first retrieves the PCIe port type from the topology using pcie_topo.get_pcie_port(). It then searches for an available node in the PCIe topology with RootPort or SwitchPort type and allocates the VFIO device to the found available port. Finally, it updates the device's bus with the allocated port's ID and type. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:41 +08:00
alex.lyn	01b822de16	runtime-rs: Get available port node in the PCIe topology This commit implements the `find_available_node` function, which searches the PCIe topology for the first available `TopologyPortDevice` or `SwitchDownPort`. If no available node is found in either the `pcie_port_devices` or the connected switches' downstream ports, the function returns `None`. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:41 +08:00
alex.lyn	533d07a2c3	runtime-rs: Introduce qemu-rs vfio device hotplug handler This commit notes a restriction of the current implementation: 'multifunction=on' is temporarily unsupported. While the feature isn't available in the present version, we explicitly acknowledge this limitation and commit to addressing it in future iterations to enhance functional completeness. Tracking issue #11292 has been created to monitor progress towards full multifunction support. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 18:58:06 +08:00
Steve Horsman	91f2e97aae	Merge pull request #11267 from Rtoax/p001-fix-osbuilder-lib.sh-indent osbuilder: lib.sh: Fix indent	2025-05-22 09:54:18 +01:00
alex.lyn	f1796fe9ba	runtime-rs: Add more fields in VfioDevice to express vfio devices To support port devices for vfio devices, more fields need to be introduced to help pass port type, bus and other information. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-22 16:00:40 +08:00
Fupan Li	15cbc545ca	runtime-rs: fix the issue of delete cgroup failed When deleting a cgroup, all of the tasks/procs in the cgroup must first be moved into the root cgroup. Since cgroup v2 doesn't support moving threads into the root cgroup, moving the processes instead of the tasks fixes this issue. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-22 12:15:02 +08:00
Steve Horsman	9356ed59d5	Merge pull request #11130 from wainersm/tests-better-report tests/k8s: better tests reporting for CI	2025-05-21 17:21:35 +01:00
Steve Horsman	b519e9fdff	Merge pull request #11293 from wainersm/tests_increase_kbs_timeout tests/k8s: increase wait time of KBS service ingress	2025-05-21 17:14:52 +01:00
Steve Horsman	a897bce29f	Merge pull request #11298 from stevenhorsman/release-3.17.0-bump release: Bump version to 3.17.0	2025-05-21 12:06:24 +01:00
stevenhorsman	7b90ff3c01	release: Bump version to 3.17.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-21 12:04:39 +01:00
Fabiano Fidêncio	5378e581d8	Merge pull request #11144 from Apokleos/hotplug-block-qemu-rs Support hot-plug block device in qemu-rs with QMP	2025-05-21 11:31:48 +02:00
Lukáš Doktor	67ee9f3425	ci.ocp: Improve logging of extra new resources This script relies on temporary subscriptions and won't clean up any resources. Let's improve the logging to better describe what resources were created and how to clean them up, if the user needs to do so. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 11:02:36 +02:00
Lukáš Doktor	32dbc5d2a9	ci.ocp: Use SCRIPT_DIR to allow execution from any folder We used hardcoded "ci/openshift-ci/cluster" location which expects this script to be only executed from the root. Let's use SCRIPT_DIR instead to allow execution from elsewhere eg. by user bisecting a failed CI run. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 10:30:03 +02:00
Lukáš Doktor	0e4fb62bb4	ci.ocp: Retry first az command as login takes time to propagate In CI we hit a problem where just after `az login` the first `az network vnet list` command fails due to permissions. We see "insufficient permissions" or "pending permissions", suggesting we should retry later. Manual tests and successful runs indicate we do have the permissions, but not immediately after login. Azure docs suggest using an extra `az account set`, but the propagation might still take some time. Add a loop retrying the first command a few times before declaring failure. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-21 10:28:01 +02:00
Fabiano Fidêncio	6c9b199ef1	Merge pull request #11289 from BbolroC/fix-vfio-coldplug runtime: Preserve hotplug devices for vfio-coldplug mode	2025-05-21 09:48:25 +02:00
Wainer dos Santos Moschetta	fdcf11d090	tests/k8s: increase wait time of KBS service ingress kbs_k8s_svc_host() returns the ingress IP when the KBS service is exposed via an ingress. In Azure AKS the ingress can take a while to be fully ready, and recently we have noticed on CI that kbs_k8s_svc_host() has returned an empty value. Maybe the problem is that the current timeout is too low, so let's increase it to 50 seconds to see if the situation improves. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 15:20:08 -03:00
Wainer dos Santos Moschetta	80a816db9d	workflows/run-k8s-tests-coco-nontee: add step to report tests Run `gha-run.sh report-tests` to generate the report of the tests. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Wainer dos Santos Moschetta	8c4637d629	tests/k8s: print tests report Added 'report-tests' command to gha-run.sh to print to stdout a report of the tests executed. For example: ``` SUMMARY (2025-02-17-14:43:53): Pass: 0 Fail: 1 STATUSES: not_ok foo.bats OUTPUTS: ::group::foo.bats 1..3 not ok 1 test 1 not ok 2 test 2 ok 3 test 3 1..2 not ok 1 test 1 not ok 2 test 2 ::endgroup:: ``` Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Wainer dos Santos Moschetta	5e3b8a019a	tests/k8s: split and save bats outputs in files Currently run_kubernetes_tests.sh sends all the bats outputs to stdout, which can be very difficult to browse to find a problem, mainly on CI. With this change, each bats execution has its output sent to 'reports/yyy-mm-dd-hh:mm:ss/<status>-<bats file>.log' where <status> is either 'ok' (tests passed) or 'not_ok' (some tests failed). Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-05-20 14:43:38 -03:00
Steve Horsman	f8c5aa6df6	Merge pull request #11259 from fitzthum/bump-gc-0140 Update Trustee and Guest Components for CoCo v0.14.0	2025-05-20 18:05:17 +01:00
Lukáš Doktor	c203d7eba6	ci.ocp: Set peer-pods-azure license We forgot to add the license header when introducing this test. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-20 17:03:48 +02:00
Steve Horsman	b4aa1e3fbd	Merge pull request #11279 from skazi0/repo-components osbuilder: ubuntu: Add REPO_COMPONENTS setting	2025-05-20 16:03:48 +01:00
Lukáš Doktor	b97b20295b	ci.ocp: Make peer-pods setup executable set permissions of the peer-pods-azure.sh script to executable Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-20 17:03:48 +02:00
Sumedh Alok Sharma	9a4432d197	Merge pull request #11233 from Ankita13-code/ankitapareek/execprocess-additional-input-validation genpolicy: validate input process fields for ExecProcessRequest	2025-05-20 20:11:41 +05:30
Jacek Tomasiak	91fb4353f6	osbuilder: ubuntu: Add REPO_COMPONENTS setting Added variable REPO_COMPONENTS (default: "main") which sets components used by mmdebstrap for rootfs building. This is useful for custom image builders who want to include EXTRA_PKGS from components other than the default "main" (e.g. "universe"). Fixes: #11278 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-20 14:01:48 +02:00
Fabiano Fidêncio	29099d139b	Merge pull request #11280 from kata-containers/dependabot/cargo/src/tools/kata-ctl/ring-0.17.14 build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl	2025-05-20 13:47:22 +02:00
Fabiano Fidêncio	0bc0623037	Merge pull request #11277 from skazi0/repo-url osbuilder: ubuntu: Expose REPO_URL variables	2025-05-20 13:46:01 +02:00
Ankita Pareek	ad75595dc8	genpolicy: Add tests for various input validations for ExecProcessRequest These additional tests cover edge cases specific to- - Terminal validation - Capabilities validation - Working directory (Cwd) validation - NoNewPrivileges validation - User validation - Environment variables validation Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-05-20 11:19:55 +00:00
Saul Paredes	1e466bf39c	genpolicy: fix validation of env variables sourced from metadata.namespace Use $(sandbox-namespace) wildcard in case none is specified in yaml. If wildcard is present, compare input against annotation value. Fixes regression introduced in https://github.com/microsoft/kata-containers/pull/273 where samples that use metadata.namespace env var were no longer working. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-05-20 11:19:46 +00:00
Dan Mihai	a113b9eefd	genpolicy: validate probe process fields Validate more process fields for k8s probe commands - e.g., livenessProbe, readinessProbe, etc. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-05-20 11:15:30 +00:00
Dan Mihai	c0b8c6ed5e	genpolicy: validate process for commands from settings Validate more process fields for commands enabled using the ExecProcessRequest "commands" and/or "regex" fields from the settings file. Add function to get the container from state based on container_id matching instead of matching it against every policy container data Signed-off-by: Dan Mihai <dmihai@microsoft.com> Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-05-20 11:15:30 +00:00
Dan Mihai	6f78aaa411	genpolicy: use process inputs for allow_process() Using process data inputs for allow_process() is easier to read/understand compared with the older OCI data inputs. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-05-20 11:15:30 +00:00
Steve Horsman	2871c31162	Merge pull request #11273 from mythi/tdx-qemu-params config: update QEMU TDX configuration	2025-05-20 10:22:59 +01:00
Steve Horsman	4b317dddfa	Merge pull request #11271 from stevenhorsman/gatekeeper-truncate-names ci: gatekeeper: Require names update	2025-05-20 10:20:05 +01:00
alex.lyn	4b27ca9233	runtime-rs: Implement volume copy allowlist check For security reasons, we have restricted directory copying. Introduces the `is_allowlisted_copy_volume` function to verify if a given volume path is present in an allowed copy directory. This enhances security by ensuring only permitted volumes are copied. Currently, only directories under the path `/var/lib/kubelet/pods/<uid>/volumes/{kubernetes.io~configmap, kubernetes.io~secret, kubernetes.io~downward-api, kubernetes.io~projected}` are allowed to be copied into the guest. Copying of other directories is prohibited. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:57:10 +08:00
alex.lyn	8910bddce8	kata-types: Introduce k8s special volumes for projected and downward-api Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	6fa409df1a	kata-agent: Improve file sync handling and address symlink issues When synchronizing file changes on the host, a "symlink AlreadyExists" issue occurs, primarily due to improper handling of symbolic links (symlinks). Additionally, there are other related problems. This patch will try to address these problems. (1) Handle symlink target existence (files, dirs, symlinks) during host file sync. Use appropriate removal methods (unlink, remove_file, remove_dir_all). (2) Enhance temporary file handling for safer operations and implement truncate only at offset 0 for resume support. (3) Set permissions and ownership for parent directories. (4) Check and clean target path for regular files before rename. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	654e6db91f	runtime-rs: Add inotify-based real-time directory synchronization Introduce an event-driven file sync mechanism between host and guest when sharedfs is disabled, which monitors the host path and syncs file changes promptly: 1. Introduce FsWatcher to monitor directory changes via inotify; 2. Support recursive watching with configurable filters; 3. Add debounce logic (default 500ms cooldown) to handle burst events; 4. Trigger `copy_dir_recursively` on stable state; 5. Handle CREATE/MODIFY/DELETE/MOVED/CLOSE_WRITE events. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	79b832b2f5	runtime-rs: Propagate k8s configs correctly when sharedfs is disabled In Kubernetes (k8s), while Kata Pods often use virtiofs for injecting Service Accounts, Secrets, and ConfigMaps, security-sensitive environments like CoCo disable host-guest sharing. Consequently, when SharedFs is disabled, we propagate these configurations into the guest via file copy and bind mount for correct container access. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	8da7cd1611	runtime-rs: Impl recursive directory copy with metadata preservation Add async directory traversal using BFS algorithm: (1) Support file type handling: Regular files (S_IFREG) with content streaming; Directories (S_IFDIR) with mode preservation; Symbolic links (S_IFLNK) with target recreation; (2) Maintain POSIX metadata: UID/GID preservation, file mode bits, and directory permissions; (3) Implement async I/O operations for: directory enumeration, file reading, symlink target resolution. Fixes #11237 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:55:49 +08:00
alex.lyn	378d04bdf0	runtime-rs: Add hotplug block device type with QMP There are several cases where block devices play very important roles: 1. Direct Volume: In Kata cases, to achieve high-performance I/O, raw files on the host are typically passed directly to the Guest via virtio-blk, and then bond/mounted within the Guest for container usage. 2. Trusted Storage: In CoCo scenarios, particularly in Guest image pull mode, images are typically pulled directly from the registry within the Guest. However, due to constrained memory resources (prioritized for containers), CoCo leverages externally attached encrypted storage to store images, requiring hot-plug capability for block devices. As other VMMs, like dragonball and cloud-hypervisor in runtime-rs or qemu in kata-runtime, already support such capabilities, we need to support block devices with the hot-plug method (QMP) in qemu-rs. Let's do it. Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
alex.lyn	2405301e2e	runtime-rs: Support hotplugging block device via QMP This commit introduces block device hotplugging capability using QMP commands. The implementation enables attaching raw block devices to a running VM through the following steps: 1. Block Device Configuration: Uses the `blockdev-add` QMP command to define a raw block backend with (1) Direct I/O mode (2) Configurable read-only flag (3) Host file/block device path (`/path/to/block`) 2. PCI Device Attachment: Attaches the block device via the `device_add` QMP command as a `virtio-blk-pci` device: (1) Dynamically allocates PCI slots using `find_free_slot()` (2) Binds to user-specified PCIe bus (e.g., `pcie.1`) (3) Returns PCI path for further management Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
alex.lyn	80bd71bfcc	runtime-rs: Iterates through PCI devices to find a match with qdev_id The get_pci_path_by_qdev_id function is designed to search for a PCI device within a given list of devices based on a specified qdev_id. It tracks the device's path in the PCI topology by recording the slot values of the devices traversed during the search. If the device is located behind a PCI bridge, the function recursively explores the bridge's device list to find the target device. The function returns the matching device along with its updated path if found, otherwise, it returns None. Fixes #11143 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-20 16:46:54 +08:00
Fupan Li	9a03815f18	Merge pull request #11095 from lifupan/ephemeral_volume runtime-rs: add the ephemeral memory based volume support	2025-05-20 09:18:34 +08:00
RuoqingHe	5b5c71510e	Merge pull request #11093 from kimullaa/fix-err-when-containerd-conf-does-not-exist kata-deploy: fix bug when config does not exist	2025-05-19 18:12:50 +02:00
Steve Horsman	cfdccaacb3	Merge pull request #11283 from Rtoax/p002-fix-typo config: Fix typos	2025-05-19 14:59:37 +01:00
RuoqingHe	93b44f920c	Merge pull request #11287 from bpradipt/remote-hyp-logging runtime: Fix logging for remote hypervisor	2025-05-19 15:49:15 +02:00
Shunsuke Kimura	9a8d64d6b1	kata-deploy: execute in the host environment `containerd` command should be executed in the host environment. (To generate the config that matches the host's containerd version.) Fixes: #11092 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-19 21:42:21 +09:00
Shunsuke Kimura	d3edc90d80	kata-deploy: Fix condition always true If config.toml does not exist, `[ -x $(command -v containerd) ]` will always be true (because it is not enclosed in ""). ``` // current code $ [ -x $(command -v containerd_notfound) ] $ echo $? 0 // maybe expected code $ [ -x "$(command -v containerd_notfound)" ] $ echo $? 1 ``` Fixes: #11092 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-19 21:42:21 +09:00
Hyounggyu Choi	2fd2cd4a9b	runtime: Preserve hotplug devices for vfio-coldplug mode Fixes: #11288 This commit appends hotplug devices (e.g., persistent volume) to deviceInfos when `vfio_mod` is `vfio` and `cold_plug_vfio` is set to any value other than `no-port`. For details, please visit the issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-05-19 13:46:49 +02:00
Pradipta Banerjee	9f9841492e	runtime: Fix logging for remote hypervisor Need to use hvLogger Fixes: #11286 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-05-19 07:01:59 -04:00
Jacek Tomasiak	da6860a632	osbuilder: ubuntu: Expose REPO_URL variables This exposes REPO_URL and adds REPO_URL_X86_64 which can be set to use custom Ubuntu repo for building rootfs. If only one architecture is built, REPO_URL can be set. Otherwise, REPO_URL_X86_64 is used for x86_64 arch and REPO_URL for others. Fixes: #11276 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-19 12:41:49 +02:00
Rong Tao	914730d948	config: Fix typos devie should be device Signed-off-by: Rong Tao <rongtao@cestc.cn>	2025-05-19 14:19:22 +08:00
Alex Lyn	305a5f5e41	Merge pull request #10578 from Apokleos/pcie-port-devices runtime-rs: Introduce PCIe Port devices in runtime-rs for qemu-rs	2025-05-18 21:10:25 +08:00
Dan Mihai	b9651eadab	Merge pull request #11214 from microsoft/cameronbaird/address-gid-mismatch-additionalgids genpolicy: Enable AdditionalGids checks in rules.rego	2025-05-16 10:15:53 -07:00
dependabot[bot]	a2c7e48e0e	build(deps): bump ring from 0.17.5 to 0.17.14 in /src/tools/kata-ctl Bumps [ring](https://github.com/briansmith/ring) from 0.17.5 to 0.17.14. - [Changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md) - [Commits](https://github.com/briansmith/ring/commits) --- updated-dependencies: - dependency-name: ring dependency-version: 0.17.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-16 14:51:20 +00:00
Fabiano Fidêncio	9e11b2e577	Merge pull request #11274 from fidencio/topic/arm-ci-k8s-enable-hotplug-tests ci: k8s: arm: Enable skipped tests	2025-05-16 13:19:18 +02:00
Fabiano Fidêncio	219d6e8ea6	Merge pull request #11257 from mythi/coco-guest-hardening confidential guest kernel hardening changes	2025-05-16 08:52:36 +02:00
Fabiano Fidêncio	86d2d96d4a	ci: k8s: arm: Enable skipped tests Now that memory hotplug should work, as we're using a firmware that supports that, let's re-enable the tests that rely on hotplug. Fixes: #10926, #10927 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-16 03:02:32 +02:00
Fabiano Fidêncio	02ce395a69	Merge pull request #11272 from seungukshin/enable-edk2-for-arm64 Enable edk2 for arm64	2025-05-15 20:59:56 +02:00
Cameron Baird	7bba7374ec	genpolicy: Add retries to policy generation As the genpolicy from_files call makes network requests to container registries, it has a chance to fail. Harden us against flakes due to network by introducing a 6x retry loop in genpolicy tests. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-15 18:12:50 +00:00
Steve Horsman	d21d2a0657	Merge pull request #11265 from chathuryaadapa/bumpalo-crate-bump Bump: libz-sys crate to address CVE	2025-05-15 16:18:00 +01:00
Mikko Ylinen	ff851202e6	config: update QEMU TDX configuration Drop '-vmx-rdseed-exit' from '-cpu host' QEMU options. The history of it is unknown but it's likely related to early TDX enablement. TD pods start up fine without it (tested by manually editing the configuration file) and it's also not used elsewhere. Keep TDXCPUFEATURES for now in case a need for it shows up later. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-15 15:43:24 +03:00
Fabiano Fidêncio	676e66ae49	Merge pull request #11246 from skazi0/mmdebstrap osbuilder: ubuntu: Switch from multistrap to mmdebstrap	2025-05-15 14:15:37 +02:00
alex.lyn	07533522b8	runtime-rs: Handle PortDevice devices when invoke start_vm with Qemu Extract PortDevice relevant information, and then invoke different processing methods based on the device type. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	c109328097	runtime-rs: Introduce pcie root port and switch port in qemu-rs cmdline. Some data structures and methods are introduced to help handle vfio devices. The methods add_pcie_root_ports and add_pcie_switch_ports follow runtime's related implementations of vfio devices. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	47c7ba8672	runtime-rs: Prepare pcie port devices before start sandbox Prepare pcie port devices before starting VM with the help of device manager and PCIe Topology. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	d435712ccb	runtime-rs: Introduce PortDevice in resource manager in sandbox A new resource type `PortDevice` is introduced which is dedicated for handling root ports/switch ports during sandbox creation(VM). Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	1d670bb46c	runtime-rs: handle useless Device match arms in dragonball vmm case Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	f08fdd25d8	runtime-rs: Introduce device type of PortDevice in device manager PortDevice is for handling root ports or switch ports in PCIe Topology. It makes it easy to pass the root ports/switch ports information when creating a VM that requires PCIe devices. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	694a849eaa	runtime-rs: Add PCIe topology mgmt for Root Port and Switch Port This commit introduces an implementation for managing PCIe topologies, focusing on the relationship between Root Ports and Switch Ports. The design supports two strategies for generating Switch Ports. Let's take the requirement of 4 switch ports as an example. There are two possible solutions: (1) Single Root Port + Single PCIe Switch: Uses 1 Root Port and 1 Switch with 4 Downstream Ports. (2) Multiple Root Ports + Multiple PCIe Switches: Uses 2 Root Ports and 2 Switches, each with 2 Downstream Ports. The recommended strategy is Option 1 due to its simplicity, efficiency, and scalability. The implementation includes data structures (PcieTopology, RootPort, PcieSwitch, SwitchPort) and operations (add_pcie_root_port, add_switch_to_root_port, add_switch_port_to_switch) to manage the topology effectively. Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	2f5ee0ec6d	kata-types: Support switch port config via annotation and configuration Support setting switch ports with annotation or configuration.toml Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
alex.lyn	a42d16a6a4	kata-types: Introduce pcie_switch_port in configuration (1) Introduce new field `pcie_switch_port` for switch ports. (2) Add related checking logics in vmms(dragonball, qemu) Fixes #10361 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-05-15 20:10:49 +08:00
Fabiano Fidêncio	af3c601a92	Merge pull request #11258 from fidencio/topic/second-try-fix-multi-install-prefix kata-deploy: Avoid changing any component path in case of restart	2025-05-15 11:21:15 +02:00
Seunguk Shin	560e718979	runtime: Add edk2 to configuration-qemu.toml for arm64 The edk2 is required for memory hot plug on qemu for arm64. This adds the edk2 to configuration-qemu.toml for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-05-15 10:12:31 +01:00
Seunguk Shin	5cabce1a25	packaging: Build edk2 for arm64 The edk2 is required for memory hot plug on qemu for arm64. This adds the edk2 to static tarball for arm64. Signed-off-by: Seunguk Shin <seunguk.shin@arm.com> Reviewed-by: Nick Connolly <nick.connolly@arm.com>	2025-05-15 10:12:24 +01:00
stevenhorsman	c09291a9c7	ci: gatekeeper: Require names update The GitHub REST API truncates job names that are >100 characters (which doesn't seem to be documented). There doesn't seem to be a way to easily make gatekeeper handle this automatically, so let's update the required-tests to expect the truncated job names Fixes: #11176 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-15 10:07:41 +01:00
Steve Horsman	95e5e0ec49	Merge pull request #11264 from fidencio/topic/helm-to-ci helm: release: Publish our helm charts to the OCI registries	2025-05-15 09:47:33 +01:00
Lukáš Doktor	9f8c8ea851	tools.testing: Add way to re-play recorded queries in gatekeeper To simplify gatekeeper development, add support for DEBUG_INPUT, which can be used to report content from files gathered in a DEBUG run. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-15 10:32:10 +02:00
Lukáš Doktor	1a15990ee1	tools.testing: Add DEBUG support for gatekeeper To avoid manual curling when analyzing GK issues, let's add a way to dump all GK requests in a directory when the user specifies the "DEBUG" env variable. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-05-15 10:32:10 +02:00
Fabiano Fidêncio	71e8c1b4f0	helm: release: Publish our helm charts to the OCI registries Let's take advantage of the fact that helm can use an OCI registry for charts, and upload our charts to the OCI registries we've been using so far. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-14 20:20:35 +02:00
RuoqingHe	393cc61153	Merge pull request #11241 from kata-containers/dependabot/cargo/src/tools/agent-ctl/ring-0.17.14 build(deps): bump ring from 0.17.8 to 0.17.14 in /src/tools/agent-ctl	2025-05-14 16:20:33 +02:00
Adapa Chathurya	3d284d3b4e	versions: Bump libz-sys version Bump libz-sys version to update and remediate CVE-2025-1744. Signed-off-by: Adapa Chathurya <adapa.chathurya1@ibm.com>	2025-05-14 19:48:10 +05:30
Fabiano Fidêncio	82928d1480	kata-deploy: Avoid changing any component path in case of restart The previous attempt to fix this issue only took the QEMU binary into consideration, as I completely forgot that there were other pieces of the config that we also adjusted. Now, let's just check one of the configs before trying to adjust anything else, and only make the changes if the multi-install suffix has not yet been added. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-14 15:41:13 +02:00
Jacek Tomasiak	e20fb377fc	osbuilder: ubuntu: Switch from multistrap to mmdebstrap Multistrap requires usrmerge package which was dropped in Ubuntu 24.04 (Noble). Based on details from [0], the rootfs build process was switched to mmdebstrap. Some additional minor tweaks were needed around chrony as the version from Noble has very strict systemd sandboxing configured and it doesn't work with readonly root by default. [0] https://lists.debian.org/debian-dpkg/2023/05/msg00080.html Fixes: #11245 Signed-off-by: Jacek Tomasiak <jtomasiak@arista.com> Signed-off-by: Jacek Tomasiak <jacek.tomasiak@gmail.com>	2025-05-14 11:46:19 +02:00
Steve Horsman	711fcd8f51	Merge pull request #11251 from stevenhorsman/rust-vulns-9th-may-2025 Rust vulns 9th may 2025	2025-05-14 09:58:12 +01:00
Tobin Feldman-Fitzthum	be708f410e	tests: fixup error assert in pull image test Guest components is now less verbose with its error messages. This will be fixed after the release but for now switch to a more generic error message that is still found in the logs. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 20:17:02 -05:00
Tobin Feldman-Fitzthum	806abeefb9	tests: fixup error asserts in init-data test Guest components is less verbose with its error message now. This will be fixed after the release, but for now, update the tests with the new more general message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 20:16:50 -05:00
Tobin Feldman-Fitzthum	e2e503eb33	tests: fixup error string for signature tests Guest components is less verbose with its error messages. This will be fixed after the release, but for now let's replace this with a more generic message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 16:54:06 -05:00
Cameron Baird	090497f520	genpolicy: Add test cases for fsGroup and supplementalGroup fields Fix up genpolicy test inputs to include required additionalGids Include a test for the pod_container container in security_context tests as these containers follow slightly different paths in containerd. Introduce a test for fsGroup/supplementalGroups fields in the security context. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Cameron Baird	19d502de76	ci: Add test cases for fsGroup and supplementalGroup fields Introduce new test case to the security context bats file which verifies that policy works properly for a deployment yaml containing fsGroup and supplementalGroup configuration. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Cameron Baird	d3cd1af593	genpolicy: Enable AdditionalGids checks in rules.rego With added support for parsing these fields in genpolicy, we can now enable policy verification of AdditionalGids. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:48:58 +00:00
Tobin Feldman-Fitzthum	ef98f39b6d	tests: update error message for authenticated guest pull Some changes in guest components have obscured the error message that we show when we fail to get the credentials for an authenticated image. The new error message is a little bit misleading since it references decrypting an image. This will be updated in a future release, but for now look for this message. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 16:46:32 -05:00
Cameron Baird	29ee46c186	genpolicy: Handle PodSecurityContext.fsGroup|supplementalGroups Policy enforcement for additionalGids, a list of groups applied to the first process run in each container. It manifests in the OCI struct as additionalGids, which consists of the container's GID, fsGroup, and supplementalGroups. https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#PodSecurityContext-v1-core Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-05-13 21:44:51 +00:00
Tobin Feldman-Fitzthum	e10aa4e49c	tests: update error message for encrypted image test Guest components prints out a different error when failing to decrypt an image. Update the test to look for this new error. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-13 12:33:37 -05:00
RuoqingHe	cd4c3e89e1	Merge pull request #11243 from kata-containers/dependabot/go_modules/src/runtime/github.com/opencontainers/runc-1.2.0 build(deps): bump github.com/opencontainers/runc from 1.1.12 to 1.2.0 in /src/runtime	2025-05-13 17:02:35 +02:00
RuoqingHe	268197957d	Merge pull request #11253 from stevenhorsman/golang.org/x/oauth2v0.27.0-bump versions: Bump golang.org/x/oauth2	2025-05-13 15:03:24 +02:00
stevenhorsman	b3825829d8	versions: Bump golang.org/x/oauth2 Update module to remediate [CVE-2025-22868](https://www.cve.org/CVERecord?id=CVE-2025-22868) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-13 11:00:35 +01:00
Rong Tao	37a16c19d1	osbuilder: lib.sh: Fix indent Replace 4 spaces with [tab]. Signed-off-by: Rong Tao <rongtao@cestc.cn>	2025-05-13 16:56:54 +08:00
Steve Horsman	299fb3b77b	Merge pull request #11255 from stevenhorsman/skip-docker-tests ci: gatekeeper: skip docker tests	2025-05-13 09:18:09 +01:00
Zvonko Kaiser	842ec6a32e	Merge pull request #11262 from BbolroC/add-vfio-config-for-sel-runtime runtime/config: Add VFIO config for IBM SEL	2025-05-12 10:59:09 -04:00
Zvonko Kaiser	5cc098ae43	Merge pull request #11242 from houstar/qing/safe-path agent: use safe-path to replace secure_join	2025-05-12 10:58:19 -04:00
Mikko Ylinen	ab29c8c979	runtime: do not add virtio-rng-pci device for confidential guests Adding: "-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0" for confidential guests is not necessary as the RNG source cannot be trusted and the guest kernel already has the driver disabled as well. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:14:51 +03:00
Mikko Ylinen	a44dfb8d37	versions: bump LTS kernel 6.12.28 has been released, let's bump to it. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:14:51 +03:00
Mikko Ylinen	eb326477fc	kernel: disable virtio RNG for confidential guests Linux CoCo x86 guest is hardened to ensure RDRAND provides enough entropy to initialize Linux RNG. A failure will panic the guest. For confidential guests any other RNG source is untrusted so disable them. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-12 17:12:44 +03:00
Hyounggyu Choi	4fac1293bd	runtime/config: Add VFIO config for IBM SEL With #11076 merged, a VFIO configuration is needed in the runtime when IBM SEL is involved (e.g., qemu-se or qemu-se-runtime-rs). For the Go runtime, we already have a nightly test (e.g., https://github.com/kata-containers/kata-containers/actions/runs/14964175872/job/42031097043) in which this change has been applied. For the Rust runtime, the feature has not yet been migrated. Thus, this change serves as a placeholder and a reminder for future implementation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-05-12 14:58:29 +02:00
Qingyuan Hou	c0ceaf661a	agent: use safe-path to replace secure_join This patch uses the safe-path library to safely handle filesystem paths. Signed-off-by: Qingyuan Hou <qingyuan.hou@linux.alibaba.com>	2025-05-12 09:06:55 +00:00
Tobin Feldman-Fitzthum	de6f4ae99c	versions: update Trustee version for CoCo v0.14.0 This hash will be tagged as Trustee v0.13.0 after the CoCo release is finished. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:40:28 -05:00
Tobin Feldman-Fitzthum	f9a9967e21	versions: update guest-components for CoCo v0.14.0 Pick up changes to guest components. This hash is right before the changes to GC to support image pull via the CDH. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:40:28 -05:00
Tobin Feldman-Fitzthum	d714eb2472	agent: update image-rs for CoCo v0.14.0 We might be able to eliminate this dependency soon, but for now let's update image-rs. I massaged the dependencies with: cargo update idna_adapter@1.2.1 --precise 1.2.0 cargo update litemap@0.7.5 --precise 0.7.4 cargo update zerofrom@0.1.6 --precise 0.1.5 cargo update astral-tokio-tar@0.5.2 --precise 0.5.1 cargo update base64ct@1.7.3 --precise 1.6.0 cargo update generic-array@1.2.0 --precise 1.1.1 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-05-09 13:39:52 -05:00
stevenhorsman	35ed3a2a3a	versions: Bump bumpalo version Bump bumpalo version to remediate RUSTSEC-2022-0078 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
stevenhorsman	fcc60b514b	versions: Bump hyper version Bump hyper version to update and remediate CVE-2023-26964 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
stevenhorsman	7807e6c29a	versions: Bump byte-unit and rust_decimal Bump the crates to update them and pull in a newer version of borsh to remediate RUSTSEC-2023-0033 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 16:09:22 +01:00
Mikko Ylinen	96d922fc27	kernel: disable virtio MMIO for confidential guests As the comment in the fragment suggests, this is for the firecracker builds and not relevant for confidential guests, for example. Exclude mmio.conf fragment by adding the new !confidential tag to drop virtio MMIO transport for the confidential guest kernel (as virtio PCI is enough for the use cases today). Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-09 17:53:22 +03:00
Mikko Ylinen	31d6839eb5	tools: let confidential guest kernel builds exclude fragments build-kernel.sh supports excluding fragments from the common base set based on the kernel target architecture. However, there are also cases where the base set must be stripped down for other reasons. For example, confidential guest builds want to exclude some drivers for devices that the untrusted host may try to add (e.g., virtio-rng). Make build-kernel.sh skip fragments tagged with '!confidential' when confidential guest kernels are built. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-05-09 17:53:22 +03:00
Zvonko Kaiser	78ff72a386	Merge pull request #11199 from fidencio/topic/kata-deploy-fix-multiInstallSufix-behaviour-during-restarts helm: Avoid appending the multiInstallSuffix several times	2025-05-09 10:32:23 -04:00
Zvonko Kaiser	26a3cb4fd1	Merge pull request #11250 from stevenhorsman/tempfile-3.19.1-bump versions: Update tempfile crate	2025-05-09 09:51:49 -04:00
stevenhorsman	a09a76a4f5	ci: gatekeeper: skip docker tests It looks like the 22.04 image got updated and broke the docker tests (see #11247), so make these un-required until we can get a resolution Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 13:57:23 +01:00
Markus Rudy	835f59df2f	Merge pull request #10986 from 3u13r/euler/feat/genpolicy/env-from-secret genpolicy: support secrets to be referenced for pod envs	2025-05-09 13:29:35 +02:00
stevenhorsman	787198f8bb	versions: Update tempfile crate Update the tempfile crate to resolve security issue [WS-2023-0045](`7247a8b6ee`) that came with the remove_dir_all dependency in prior versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-09 09:57:28 +01:00
Leonard Cohnen	b23ff6fc68	genpolicy: refactor policy test workdir setup This aligns the workdir preparation more closely with the workdir preparation for the generate integration test. Most notably, we clean up the temporary directory before we execute the tests in it. This way we better isolate different runs. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	bad0cd0003	genpolicy: add cli integration tests Add a new type of integration test to genpolicy. Now we can test flag handling and how the CLI behaves with certain yaml inputs. The first tests cover the case when a Pod references a Kubernetes secret or config map in another file. Those need to be explicitly added via the --config-files flag. In the future we can easily add test suites that verify that all yaml fields of all resources are understood by genpolicy. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	61ee330029	genpolicy: move policy enforcement integration test to separate folder In preparation for adding more types of integration tests, moving the policy enforcements test into a separate folder. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-05-09 09:45:28 +02:00
Leonard Cohnen	2ea57aefbc	genpolicy: remove unused function Remove function that became unused in the last commit. Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com>	2025-05-09 09:41:43 +02:00
Aurélien Bombo	4bb441965f	genpolicy: support arbitrary resources with -c This allows passing config maps and secrets (as well as any other resource kinds relevant in the future) using the -c flag. Fixes: #10033 Co-authored-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Leonard Cohnen <leonard.cohnen@gmail.com> Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-05-09 09:41:43 +02:00
Hyounggyu Choi	a286a5aee8	Merge pull request #11076 from Jakob-Naucke/ap-bind-assoc Bind/associate for VFIO-AP	2025-05-09 09:32:46 +02:00
Saul Paredes	1e09dfb0df	Merge pull request #11127 from microsoft/archana1/mount-tc genpolicy: improve validation for mounts	2025-05-08 15:41:23 -07:00
stevenhorsman	17843e50bb	runtime: Switch userns packages Switch imports to resolve: ``` SA1019: "github.com/opencontainers/runc/libcontainer/userns" is deprecated: use github.com/moby/sys/userns ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-08 11:04:11 +01:00
dependabot[bot]	2c80a3edce	build(deps): bump github.com/opencontainers/runc in /src/runtime Bumps [github.com/opencontainers/runc](https://github.com/opencontainers/runc) from 1.1.12 to 1.2.0. - [Release notes](https://github.com/opencontainers/runc/releases) - [Changelog](https://github.com/opencontainers/runc/blob/main/CHANGELOG.md) - [Commits](https://github.com/opencontainers/runc/compare/v1.1.12...v1.2.0) --- updated-dependencies: - dependency-name: github.com/opencontainers/runc dependency-version: 1.2.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-08 11:02:16 +01:00
Steve Horsman	e3e0007bf7	Merge pull request #11141 from stevenhorsman/k8s-cpu-ns-exec-retry tests: k8s: Retry output of kubectl exec in k8s-cpu-ns	2025-05-07 17:11:25 +01:00
Fabiano Fidêncio	f981e8a904	Merge pull request #10833 from stevenhorsman/crio-annotations-update Crio annotations update	2025-05-07 16:05:24 +02:00
dependabot[bot]	96885a8449	build(deps): bump ring from 0.17.8 to 0.17.14 in /src/tools/agent-ctl Bumps [ring](https://github.com/briansmith/ring) from 0.17.8 to 0.17.14. - [Changelog](https://github.com/briansmith/ring/blob/main/RELEASES.md) - [Commits](https://github.com/briansmith/ring/commits) --- updated-dependencies: - dependency-name: ring dependency-version: 0.17.14 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-07 12:18:56 +00:00
RuoqingHe	be75391953	Merge pull request #11235 from kata-containers/dependabot/cargo/src/tools/kata-ctl/openssl-0.10.72 build(deps): bump openssl from 0.10.60 to 0.10.72 in /src/tools/kata-ctl	2025-05-07 20:17:42 +08:00
RuoqingHe	d4d737a73e	Merge pull request #10512 from ncppd/riscv64-agent agent: Support RISC-V 64-bit architecture	2025-05-07 10:56:10 +08:00
RuoqingHe	7bdfea0041	Merge pull request #11123 from kimullaa/add-path-for-kata-deploy runtime: Add Path for kata-deploy	2025-05-07 00:25:12 +08:00
RuoqingHe	b5e45601f6	Merge pull request #11116 from kimullaa/more-robust-script-path-resolution kata-debug: Make path resolution more robust	2025-05-07 00:19:04 +08:00
stevenhorsman	5472662b33	runtime: Fix Incorrect conversion between integer types Fix the high severity codeql issue by checking the value is in bounds before converting Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	4de79b9821	runtime: Ignoring deprecated warning. In the latest oci-spec, the prestart hook is deprecated. However, the docker & nerdctl tests failed when I switched to one of the newer hooks which don't run at quite the same time, so ignore the deprecation warnings for now to unblock the security fix Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	37dda6060c	runtime: Re-vendor Re-run `make vendor` after the podman -> crio annotations change Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
stevenhorsman	3740ce6e7b	runtime: Update crio annotations We've been using the github.com/containers/podman/v4/pkg/annotations module to get cri-o annotations, which has some major CVEs in, but in v5 most of the annotations were moved into crio (from 1.30) (see https://github.com/cri-o/cri-o/pull/7867). Let's switch to use the cri-o annotations module instead and remediate CVE-2024-3056. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 15:18:37 +01:00
dependabot[bot]	70b481e1ee	build(deps): bump openssl from 0.10.60 to 0.10.72 in /src/tools/kata-ctl Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.60 to 0.10.72. - [Release notes](https://github.com/sfackler/rust-openssl/releases) - [Commits](https://github.com/sfackler/rust-openssl/compare/openssl-v0.10.60...openssl-v0.10.72) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.72 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-05-06 13:56:33 +00:00
RuoqingHe	4f97e5fed3	Merge pull request #11226 from kata-containers/dependabot/cargo/src/agent/tokio-1.44.2 build(deps): bump tokio from 1.44.0 to 1.44.2	2025-05-06 21:55:18 +08:00
Fabiano Fidêncio	78bf9d7500	Merge pull request #11232 from lifupan/mtu runtime: add the mtu support for updating routes	2025-05-06 15:55:04 +02:00
Shunsuke Kimura	7177ab3827	runtime: execute using abs path Fixes: #11123 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-06 21:17:06 +09:00
Shunsuke Kimura	ddccbd4764	runtime: Add Path for kata-deploy When installing with kata-deploy, `/opt/kata/bin` is usually not in the PATH, so execution fails. Add it to the PATH. Fixes: #11122 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-05-06 21:17:06 +09:00
Shunsuke Kimura	5c156a24e8	kata-debug: Make path resolution more robust Enabled to run from other scripts as source, etc. Fixes: #11115 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-06 21:16:25 +09:00
stevenhorsman	6030a64f0c	build(deps): bump tokio to 1.44.2 Bumps [tokio](https://github.com/tokio-rs/tokio) to 1.44.2 in all components to resolve the security vuln throughout our repo Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-06 11:38:52 +01:00
RuoqingHe	89685c0cd0	Merge pull request #11225 from kata-containers/dependabot/cargo/src/dragonball/openssl-0.10.72 build(deps): bump openssl from 0.10.57 to 0.10.72	2025-05-06 18:27:45 +08:00
Fabiano Fidêncio	fb5f3eae3b	Merge pull request #11172 from ChengyuZhu6/erofs EROFS Snapshotter Support in Kata	2025-05-06 11:14:19 +02:00
Ruoqing He	384d335419	ci: Enable build-check for agent on riscv64 Enable build-check for `agent` component for riscv64 platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-06 01:48:37 +00:00
Ruoqing He	7f9b2c0af1	ci: Enable `install_libseccomp.sh` for riscv64 `musl` target is not yet available for riscv64 as of 1.80.0 rust toolchain, set `FORTIFY_SOURCE` to 1 on riscv64 platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-06 01:48:37 +00:00
Nikos Ch. Papadopoulos	0f2c0d38f5	agent: Create pci_root_bus_path for riscv64 `create_pci_root_bus_path` needs to be enabled on riscv64 for agent to compile and work on those platforms. Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>	2025-05-06 01:48:37 +00:00
Fupan Li	29f9015caf	runtime-rs: rm the obsoleted ephemeral volume processing Since the ephemeral volume already has a separate volume type for processing, the processing in the virtiofs share volume can be deleted. Moreover, it is not appropriate to process the ephemeral in the share fs. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-06 09:45:35 +08:00
Fupan Li	6e5f3cbbeb	runtime-rs: add the ephemeral memory based volume support For k8s, there are two types of volumes based on ephemeral memory: one is the emptydir volume based on ephemeral memory, and the other is used for shm devices such as /dev/shm. Thus add a new volume type, ephemeral volume, to support those two types of volumes and remove the legacy shm volume. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-06 09:45:24 +08:00
ChengyuZhu6	d07b279bf1	agent:storage: Add directory creation support Implementing directory creation logic in the OverlayfsHandler to process driver options with the KATA_VOLUME_OVERLAYFS_CREATE_DIR prefix Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>	2025-05-05 23:51:44 +02:00
ChengyuZhu6	f63ec50ba3	runtime: Add EROFS snapshotter with block device support - Detection of EROFS options in container rootfs - Creation of necessary EROFS devices - Sharing of rootfs with EROFS via overlayfs Fixes: #11163 Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>	2025-05-05 23:51:13 +02:00
Archana Choudhary	fb815b77c1	genpolicy: add test for volumeMounts This patch: - adds a count check on mounts - adds various test scenarios for mounts with emptyDir volume source Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2025-05-05 15:17:50 +00:00
RuoqingHe	1cb34c4d0a	Merge pull request #11202 from RuoqingHe/2025-04-28-upgrade-rtnetlink runtime-rs: Upgrade `rust-netlink` crates	2025-05-05 21:35:45 +08:00
Fupan Li	492329fc02	runtime: add the mtu support for updating routes Some cni plugins will set the MTU of some routes, such as cilium will modify the MTU of the default route. If the mtu of the route is not set correctly, it may cause excessive fragmentation or even packet loss of network packets. Therefore, this PR adds the setting of the MTU of the route. First, when obtaining the route, if the MTU is set, the MTU will also be obtained and set to the route in the guest. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-04 23:12:57 +02:00
Ruoqing He	2d0f32ff96	runtime-rs: Upgrade crates from `rust-netlink` Bump `netlink-sys` to v0.8, `netlink-packet-route` to v0.22 and `rtnetlink` to v0.16 to reach a consistent state of `rust-netlink` dependencies. `bitflags` is bumped to v2.9.0 since those crates requires it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-03 02:31:02 +00:00
Ruoqing He	09700478eb	runtime-rs: Group Dependencies from `rust-netlink` `rtnetlink`, `netlink-sys` and `netlink-packet-route` are from the same organization, and some of them are depending on the others, which implies the version of those crates should be chosen and dealt with carefully, group them to provide better management. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-03 02:29:43 +00:00
Fabiano Fidêncio	fbf7faa9f4	Merge pull request #11227 from fidencio/topic/agent-only-try-ipv6-if-stack-is-supported agent: netlink: Only add an ipv6 address if ipv6 is enabled	2025-05-02 12:31:40 +02:00
Xuewei Niu	a9b3c6a5a5	Merge pull request #11209 from lifupan/fix_slog shimv2: fix the issue logger write failed	2025-05-02 17:25:44 +08:00
Fabiano Fidêncio	79ad68cce5	Merge pull request #11230 from kimullaa/remove-wrong-qemu-option runtime: remove wrong qemu-system-x86_64 option	2025-05-02 11:18:45 +02:00
stevenhorsman	21498d401f	build(deps): bump openssl to 0.10.72 Bumps [openssl](https://github.com/sfackler/rust-openssl) to 0.10.72. - [Release notes](https://github.com/sfackler/rust-openssl/releases) --- updated-dependencies: - dependency-name: openssl dependency-version: 0.10.72 dependency-type: indirect ... Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-05-02 09:36:50 +01:00
Fabiano Fidêncio	4ce00ea434	agent: netlink: Only add an ipv6 address if ipv6 is enabled When running Kata Containers on CSPs, the CSPs may enforce their clusters to be IPv4-only. Checking the OCI spec passed down to container, on a GKE cluster, we can see: ``` "sysctl": { ... "net.ipv6.conf.all.disable_ipv6": "1", "net.ipv6.conf.default.disable_ipv6": "1", ... }, ``` Even with ipv6 being explicitly disabled (behind our back ;-)), we've noticed that IPv6 addresses would be received, but then as IPv6 was disabled we'd break on CreatePodSandbox with the following error: ``` Warning FailedCreatePodSandBox 4s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: "update interface: Failed to add address fe80::c44c:1cff:fe84:f6b7: NetlinkError(ErrorMessage { code: Some(-13), header: [64, 0, 0, 0, 20, 0, 5, 5, 19, 0, 0, 0, 0, 0, 0, 0, 10, 64, 0, 0, 2, 0, 0, 0, 20, 0, 1, 0, 254, 128, 0, 0, 0, 0, 0, 0, 196, 76, 28, 255, 254, 132, 246, 183, 20, 0, 2, 0, 254, 128, 0, 0, 0, 0, 0, 0, 196, 76, 28, 255, 254, 132, 246, 183] })\n\nStack backtrace:\n 0: <unknown>\n 1: <unknown>\n 2: <unknown>\n 3: <unknown>\n 4: <unknown>\n 5: <unknown>\n 6: <unknown>\n 7: <unknown>\n 8: <unknown>\n 9: <unknown>\n 10: <unknown>": unknown ``` A huge shoutout to Fupan Li for helping with the debug on this one! Fixes: #11200 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-05-02 09:10:45 +02:00
Shunsuke Kimura	3dba8ddd98	runtime: remove wrong qemu-system-x86_64 option qemu-system-x86_64 does not support "-machine virt". (this is only supported by arm,aarch64) <https://people.redhat.com/~cohuck/2022/01/05/qemu-machine-types.html> Fixes: #11229 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-02 04:37:12 +09:00
Fabiano Fidêncio	7e404dd13f	Merge pull request #11228 from zvonkok/fix-kernel-modules-build gpu: Set the ARCH explicitly for driver builds	2025-05-01 21:07:20 +02:00
Zvonko Kaiser	445cad7754	gpu: Set the ARCH explicitly for driver builds Kernel Makefiles changed how they deduce the right arch, so let's set it explicitly to enable arm and amd builds. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-05-01 17:13:20 +00:00
RuoqingHe	049a4ef3a8	Merge pull request #11146 from RuoqingHe/2025-04-14-dragonball-centralize-dbs dragonball: Put local dependencies into workspace	2025-05-01 22:06:51 +08:00
RuoqingHe	bd1071aff8	Merge pull request #11174 from kata-containers/dependabot/cargo/src/mem-agent/crossbeam-channel-0.5.15 build(deps): bump crossbeam-channel from 0.5.13 to 0.5.15 in /src/mem-agent	2025-05-01 16:53:42 +08:00
Ruoqing He	61f2b6a733	dragonball: Put local dependencies into workspace Put local dependencies (mostly `dbs` crates) into workspace to avoid complex path dependencies all over the workspace. Simplify path dependency referencing. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-05-01 08:40:22 +00:00
RuoqingHe	33c69fc8bf	Merge pull request #11204 from stevenhorsman/go-security-bump-april-25 versions: Bump golang.org/x/net	2025-05-01 16:36:24 +08:00
Fabiano Fidêncio	bc66d75fe9	Merge pull request #11217 from stevenhorsman/runtime-rs-centralise-workspace-config Runtime rs centralise workspace config	2025-05-01 10:36:07 +02:00
Fupan Li	9924fbbc70	shimv2: fix the issue logger write failed It's better to open the log pipe file with the read & write option; otherwise, once containerd restarts and closes the read endpoint, the kata shim's writes to the log pipe fail with a broken pipe error. Fixes: #11207 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-05-01 16:15:18 +08:00
Fabiano Fidêncio	3dfabd42c2	Merge pull request #11206 from kimullaa/fix-xfs-rootfs-type runtime: remove wrong xfs options	2025-05-01 09:05:17 +02:00
Fabiano Fidêncio	a2fbc598b8	Merge pull request #11223 from microsoft/cameronbaird/revert-aks-extension-pin ci: revert temp: ci: Fix AKS cluster creation	2025-05-01 08:33:12 +02:00
Shunsuke Kimura	62639c861e	runtime: remove wrong xfs options "data=ordered" and "errors=remount-ro" are wrong options in xfs. (they are ext4 options) <https://manpages.ubuntu.com/manpages/focal/man5/xfs.5.html> Fixes: #11205 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-05-01 07:56:39 +09:00
Cameron Baird	6e21d14334	Revert "temp: ci: Fix AKS cluster creation" This reverts commit `1de466fe84`. The latest release of the az aks extension fixes the issue https://github.com/Azure/azure-cli-extensions/blob/main/src/aks-preview/HISTORY.rst#1400b5 Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-30 21:24:42 +00:00
stevenhorsman	a126884953	runtime-rs: Share workspace config Update the runtime-rs workspace packages to use workspace package versions where applicable to centralise the config and reduce maintenance when updating these Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:40:47 +01:00
stevenhorsman	f8fcd032ef	workflow: Set RUST_LIB_BACKTRACE=0 As discussed in #9538, with anyhow >=1.0.77 we have test failures due to backtrace behaviour changing, so set RUST_LIB_BACKTRACE=0, so that we only have backtrace on panics Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:38:13 +01:00
stevenhorsman	ffbaa793a3	versions: Update crossbeam-channel Update all crossbeam-channel for all non-agent packages (it was done separately in #11175) to 0.5.15 to get them on latest version and remove the versions with a vulnerability Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 19:36:40 +01:00
Steve Horsman	b97bc03ecb	Merge pull request #11211 from stevenhorsman/dragonball-lockfiles dragonball: Remove package lockfiles	2025-04-30 19:34:58 +01:00
stevenhorsman	f910c7535a	ci: Workaround cargo deny issue When a PR has no new files the cargo deny runner fails with: ``` [cargo-deny-generator.sh:17] ERROR: changed_files_status= ``` so add `|| true` to try and help this Co-authored-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 16:27:25 +01:00
stevenhorsman	f2a2117252	tests: k8s: Retry output of kubectl exec in k8s-cpu-ns We are seeing failures in this test, where the output of the kubectl exec command seems to be blank, so try retrying the exec like #11024 Fixes: #11133 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 15:01:08 +01:00
stevenhorsman	97f7d49e8e	dragonball: Remove package lockfiles Since #10780 the dbs crates are managed as members of the dragonball workspace, so we can remove the lockfiles as they are now workspace managed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-30 09:14:07 +01:00
Steve Horsman	8045cb982c	Merge pull request #11208 from kata-containers/dependabot/cargo/src/runtime-rs/tokio-1.38.2 build(deps): bump tokio from 1.38.0 to 1.38.2 in /src/runtime-rs	2025-04-30 08:44:51 +01:00
Aurélien Bombo	46af7cf817	Merge pull request #11077 from microsoft/cameronbaird/address-gid-mismatch genpolicy: Align GID behavior with CRI and enable GID policy checks.	2025-04-29 22:23:23 +01:00
Aurélien Bombo	19371e2d3b	Merge pull request #11164 from wainersm/fix_kbs_on_aks tests/k8s: fix kbs installation on Azure AKS	2025-04-29 18:25:14 +01:00
Steve Horsman	6c1fafb651	Merge pull request #11210 from kata-containers/dependabot/cargo/src/tools/runk/tokio-1.44.2 build(deps): bump tokio from 1.38.0 to 1.44.2 in /src/tools/runk	2025-04-29 16:43:58 +01:00
Steve Horsman	3c8cc0cdbf	Merge pull request #11212 from BbolroC/add-cc-vfio-ap-test-s390x GHA: Add VFIO-AP to s390x nightly tests for CoCo	2025-04-29 16:15:00 +01:00
Steve Horsman	a6d1dc7df3	Merge pull request #10940 from ldoktor/peer-pods ci.ocp: Add peer-pods setup script	2025-04-29 15:57:30 +01:00
Hyounggyu Choi	63b9ae3ed0	GHA: Add VFIO-AP to s390x nightly tests for CoCo As #11076 introduces VFIO-AP bind/associate functions for IBM Secure Execution (SEL), a new internal nightly test has been established. This PR adds a new entry `cc-vfio-ap-e2e-tests` to the existing matrix to share the test result. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-29 16:06:12 +02:00
Steve Horsman	8b32846519	Merge pull request #10882 from stevenhorsman/kbs-logging-on-failure tests: confidential: Add KBS logging	2025-04-29 13:29:21 +01:00
dependabot[bot]	7163d7d89b	build(deps): bump tokio from 1.38.0 to 1.38.2 in /src/runtime-rs Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.38.0 to 1.38.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.38.0...tokio-1.38.2) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.38.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-29 12:21:58 +00:00
dependabot[bot]	2992a279ab	build(deps): bump tokio from 1.38.0 to 1.44.2 in /src/tools/runk Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.38.0 to 1.44.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.38.0...tokio-1.44.2) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.44.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-29 12:14:41 +00:00
Fabiano Fidêncio	e5cc9acab8	Merge pull request #11175 from kata-containers/dependabot/cargo/src/agent/crossbeam-channel-0.5.15 build(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 in /src/agent	2025-04-29 14:13:25 +02:00
Fabiano Fidêncio	a9893e83b8	Merge pull request #11203 from stevenhorsman/high-severity-security-bumps-april-25 rust: High severity security bumps april 25	2025-04-29 14:10:05 +02:00
stevenhorsman	52b2662b75	tests: confidential: Add KBS logging To help with debugging, add logging of the KBS, like the container system logs, if the confidential test fails Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-29 09:48:18 +01:00
stevenhorsman	bcffe938ca	versions: Bump golang.org/x/net Bump golang.org/x/net to 0.38.0 as dependabot isn't doing it for these packages to remediate CVE-2025-22872 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-29 09:46:48 +01:00
Steve Horsman	57527c1ce4	Merge pull request #11161 from kata-containers/dependabot/go_modules/src/runtime/golang.org/x/net-0.38.0 build(deps): bump golang.org/x/net from 0.33.0 to 0.38.0 in /src/runtime	2025-04-29 09:39:30 +01:00
Cameron Baird	70ef0376fb	genpolicy: Introduce special handling for clusters using nydus Nydus+guest_pull has specific behavior where it improperly handles image layers on the host, causing the CRI to not find /etc/passwd and /etc/group files on container images which have them. This unfortunately causes different outcomes w.r.t. GID used which we are trying to enforce with policy. This behavior is observed/explained in https://github.com/kata-containers/kata-containers/issues/11162 Handle this exception with a config.settings.cluster_config.guest_pull field. When this is true, simply ignore the /etc/* files in the container image as they will not be parsed by the CRI. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 20:18:42 +00:00
Cameron Baird	d3b652014a	genpolicy: Introduce genpolicy tests for security contexts Add security context testcases for genpolicy, verifying that UID and GID configurations controlled by the kubernetes security context are enforced. Also, fix the other CreateContainerRequest tests' expected contents to reflect our new genpolicy parsing/enforcement of GIDs. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	fc75aee13a	ci: Add CI tests for runAsGroup, GID policy Introduce tests to check for policy correctness on a redis deployment with 1. a pod-level securityContext 2. a container-level securityContext which shadows the pod-level securityContext 3. a pod-level securityContext which selects an existing user (nobody), causing a new GID to be selected. Redis is an interesting container image to test with because it includes a /etc/passwd file with existing user/group configuration of 1000:1000 baked in. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	938ddeaf1e	genpolicy: Enable GID checks in rules.rego With fixes to align policy GID parsing with the CRI behavior, we can now enable policy verification of GIDs. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	eb2c7f4150	genpolicy: Integrate /etc/passwd from OCI container when setting GIDs The GID used for the running process in an OCI container is a function of 1. The securityContext.runAsGroup specified in a pod yaml, 2. The UID:GID mapping in /etc/passwd, if present in the container image layers, 3. Zero, even if the userstr specifies a GID. Make our policy engine align with this behavior by: 1. At the registry level, always obtain the GID from the /etc/passwd file if present. Ignore GIDs specified in the userstr encoded in the OCI container. 2. After an update to UID due to securityContexts, perform one final check against the /etc/passwd file if present. The GID used for the running process is the mapping in this file from UID->GID. 3. Override everything above with the GID of the securityContext configuration if provided Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	c13d7796ee	genpolicy: Parse secContext runAsGroup and allowPrivilegeEscalation Our policy should cover these fields for securityContexts at the pod or container level of granularity. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:31 +00:00
Cameron Baird	349ce8c339	genpolicy: Refactor registry user/group parsing to account for all cases The get_process logic in registry.rs did not account for all cases (username:groupname), did not defer to contents of /etc/group, /etc/passwd when it should, and was difficult to read. Clean this implementation up, factoring the string parsing for user/group strings into their own functions. Enable the registry::Container class to query /etc/passwd and /etc/group, if they exist. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-04-28 16:28:29 +00:00
Wainer dos Santos Moschetta	460c3394dd	gha: run CoCo non-TEE tests on "all" host type By running on "all" host type there are two consequences: 1) run the "normal" tests too (until now, it's only "small" tests), so increasing the coverage 2) create AKS cluster with larger VMs. This is a new requirement due to the current ingress controller for the KBS service consuming too many vCPUs and leaving only a few for the tests (resulting in failures) Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	945482ff6e	tests: make _print_instance_type() to handle "all" host type _print_instance_type() returns the instance type of the AKS nodes, based on the host type. Tests are grouped per host type in "small" and "normal" sets based on the CPU requirements: "small" tests require few CPUs and "normal" more. There is a third case: "all" host type maps to the union of "small" and "normal" tests, which should be handled by _print_instance_type() properly. In this case, it should return the largest instance type possible because "normal" tests will be executed too. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	a66aac0d77	tests/k8s: optimize nginx ingress for AKS small VM We use an AKS managed ingress controller which keeps two nginx pod replicas, each requesting 500m of CPU. On small VMs like the ones we use in CI for running the CoCo non-TEE tests, this leaves only a small amount of CPU for the tests. Actually, one of these pod replicas won't even get started. So let's patch the ingress controller to have only one replica of nginx. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Wainer dos Santos Moschetta	14e74b8fc9	tests/k8s: fix kbs installation on Azure AKS The Azure AKS addon-http-application-routing add-on is deprecated and cannot be enabled on new clusters which has caused some CI jobs to fail. Migrated our code to use approuting instead. Unlike addon-http-application-routing, this add-on doesn't configure a managed cluster DNS zone, but the created ingress has a public IP. To avoid having to deal with DNS setup, we will be using that address from now on. Thus, some functions no longer used are deleted. Fixes #11156 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-28 12:08:31 -03:00
Fabiano Fidêncio	03ab774ed5	helm: Avoid appending the multiInstallSuffix several times Once the multiInstallSuffix has been taken into account, we should not keep appending it on every re-run/restart, as that would lead to a path that does not exist. Fixes: #11187 Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-28 16:36:38 +02:00
stevenhorsman	c938c75af0	versions: kata-ctl: Bump rustls Bump rustls version to > 0.21.11 to remediate high severity CVE-2024-32650 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:59 +01:00
stevenhorsman	2ee7ef6aa3	versions: agent-ctl: Bump hashbrown Bump hashbrown to >= 0.15.1 to remediate the high severity security alert that was in v0.15.0 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:46 +01:00
stevenhorsman	e3d3a2843f	versions: Bump mio to at least 0.8.11 Ensure that all the versions of mio we use are at least 0.8.11 to remediate CVE-2024-27308 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:55:46 +01:00
stevenhorsman	973bd7c2b6	build(deps): bump golang.org/x/net from 0.33.0 to 0.38.0 in /src/runtime Bumps [golang.org/x/net](https://github.com/golang/net) from 0.33.0 to 0.38.0. - [Commits](https://github.com/golang/net/compare/v0.33.0...v0.38.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-version: 0.38.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-28 14:09:54 +01:00
Steve Horsman	9248634baa	Merge pull request #11098 from stevenhorsman/golang-1.23.7 versions: Bump golang version	2025-04-28 13:46:11 +01:00
Fabiano Fidêncio	ee344aa4e9	Merge pull request #11185 from fidencio/topic/reclaim-guest-freed-memory-backport-from-runtime-rs runtime: clh: Add reclaim_guest_freed_memory [BACKPORT]	2025-04-28 12:32:33 +02:00
Steve Horsman	4f703e376b	Merge pull request #11201 from BbolroC/remove-non-tee-from-required-tests ci: Remove run-k8s-tests-coco-nontee from required tests	2025-04-28 10:05:07 +01:00
Hyounggyu Choi	9fe70151f7	ci: Remove run-k8s-tests-coco-nontee from required tests In #11044, `run-k8s-tests-coco-nontee` was set as required by mistake. This PR disables the test again. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-28 10:48:08 +02:00
Steve Horsman	83d31b142b	Merge pull request #11044 from Jakob-Naucke/basic-s390x-ci ci: Extend basic s390x tests	2025-04-28 09:14:00 +01:00
Fupan Li	3457572130	Merge pull request #10579 from Apokleos/pcilibs-rs kata-sys-utils: Introduce pcilibs for getting pci devices info	2025-04-27 16:39:40 +08:00
Alex Lyn	43b5a616f6	Merge pull request #11166 from Apokleos/memcfg-adjust kata-types: Optimize memory adjusting by only gathering memory info	2025-04-27 15:57:45 +08:00
Fabiano Fidêncio	b747f8380e	clh: Rework CreateVM to reduce the amount of cycles Otherwise the static checks will whip us as hard as possible. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 21:30:47 +02:00
Champ-Goblem	9f76467cb7	runtime: clh: Add reclaim_guest_freed_memory [BACKPORT] We're bringing to Cloud Hypervisor only the reclaim_guest_freed_memory option already present in the runtime-rs. This allows us to use virtio-balloon for the hypervisor to reclaim memory freed by the guest. The reason we're not touching other hypervisors is because we're very much aware of avoiding to clutter the go code at this point, so we'll leave it for whoever really needs this on other hypervisor (and trust me, we really do need it for Cloud Hypervisor right now ;-)). Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 21:05:53 +02:00
Fabiano Fidêncio	1c72d22212	Merge pull request #11186 from fidencio/topic/kernel-add-taskstats-to-the-config kernel: Add CONFIG_TASKSTATS (and related) configs	2025-04-25 15:28:04 +02:00
Steve Horsman	213f9ddd30	Merge pull request #11191 from fidencio/topic/release-3.16.0-bump release: Bump version to 3.16.0	2025-04-25 09:04:31 +01:00
Fabiano Fidêncio	fc4e10b08d	release: Bump version to 3.16.0 Bump VERSION and helm-chart versions Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-25 08:17:15 +02:00
Fabiano Fidêncio	b96685bf7a	Merge pull request #11153 from fidencio/topic/build-allow-choosing-which-runtime-will-be-built build: Allow users to build the go, rust, or both runtimes	2025-04-25 08:13:07 +02:00
Fabiano Fidêncio	800c05fffe	Merge pull request #11189 from kata-containers/sprt/fix-create-cluster temp: ci: Fix AKS cluster creation	2025-04-24 23:01:12 +02:00
Aurélien Bombo	1de466fe84	temp: ci: Fix AKS cluster creation The AKS CLI recently introduced a regression that prevents using aks-preview extensions (Azure/azure-cli#31345), and hence create CI clusters. To address this, we temporarily hardcode the last known good version of aks-preview. Note that I removed the comment about this being a Mariner requirement, as aks-preview is also a requirement of AKS App Routing, which will be introduced soon in #11164. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-04-24 15:06:14 -05:00
Dan Mihai	706c2e2d68	Merge pull request #11184 from microsoft/danmihai1/retry-genpolicy ci: retry genpolicy execution	2025-04-24 08:01:22 -07:00
Champ-Goblem	cf4325b535	kernel: Add CONFIG_TASKSTATS (and related) configs Knowing that the upstream project provides a "ready to use" version of the kernel, it's good to include an easy way to users to monitor performance, and that's what we're doing by enabling the TASKSTATS (and related) kernel configs. This has been present as part of older kernels, but I couldn't reasonably find the reason why it's been dropped. Signed-off-by: Champ-Goblem <cameron@northflank.com> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-24 11:51:21 +02:00
Fabiano Fidêncio	7e9e9263d1	build: Allow users to build the go, rust, or both runtimes Let's add a RUNTIME_CHOICE env var that can be passed to the build scripts, which allows the user to select whether they build the go runtime, the rust runtime, or both. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-24 10:36:26 +02:00
Alex Lyn	8b49564c01	Merge pull request #10610 from Xynnn007/faet-initdata-rbd Feat \| Implement initdata for bare-metal/qemu hypervisor	2025-04-24 09:59:14 +08:00
Alex Lyn	e8f19609b9	Merge pull request #11150 from zvonkok/cdi-annotations gpu: Fix CDI annotations	2025-04-24 09:58:16 +08:00
Dan Mihai	517d6201f5	ci: retry genpolicy execution genpolicy is sending more HTTPS requests than other components during CI so it's more likely to be affected by transient network errors similar to: ConnectError( "dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Try again", }, ) Note that genpolicy is not the only component hitting network errors during CI. Recent example from a different component: "Message: failed to create containerd task: failed to create shim task: failed to async pull blob stream HTTP status server error (502 Bad Gateway)" This CI change might help just with the genpolicy errors. Fixes: #11182 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-04-23 21:38:12 +00:00
Zvonko Kaiser	3946435291	gpu: Handle VFIO devices with DevicePlugin and CDI We can provide devices during cold-plug with CDI annotation on a Pod level and add per container device information with the device plugin. Since the sandbox has already attached the VFIO device remove them from consideration and just apply the inner runtime CDI annotation. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	486244b292	gpu: Remove unneeded parsing of CDI devices The addition of CDI devices is now done for single_container and pod_sandbox and pod_container before the devmanager creates the deviceinfos no need for extra parsing. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	6713db8990	gpu: Add CDI parsing for Sandbox as well Extend the CDI parsing for pod_sandbox as well, only single_container was covered properly. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Zvonko Kaiser	97f4bcb456	gpu: Remove CDI annotations for outer runtime After the outer runtime has processed the CDI annotation from the spec we can delete them since they were converted into Linux devices in the OCI spec. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-23 21:02:06 +00:00
Steve Horsman	6102976d2d	Merge pull request #11178 from stevenhorsman/gperf-mirror versions: Switch gperf mirror	2025-04-23 20:21:42 +01:00
stevenhorsman	09052faaa0	versions: Switch gperf mirror Every so often the main gnu site has an outage, so we can't download gperf. GNU provides the generic URL https://ftpmirror.gnu.org to automatically choose a nearby and up-to-date mirror, so switch to this to help avoid this problem Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 15:29:54 +01:00
stevenhorsman	ed56050a99	versions: Bump golangci-lint version v1.60.0+ is needed for go 1.23 support, so bump to the current latest 1.x version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
stevenhorsman	1c9d7ce0eb	ci: cri-containerd: Remove source from install_go.sh If the correct version of go is already installed then install_go.sh runs `exit`. When calling this as source from cri-containerd/gha-run.sh it means all dependencies after are skipped, so remove this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
stevenhorsman	c37840ce80	versions: Bump golang version Bump golang version to the latest minor 1.23.x release now that 1.24 has been released and 1.22.x is no longer stable and receiving security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 12:37:48 +01:00
dependabot[bot]	463fd4eda4	build(deps): bump crossbeam-channel from 0.5.14 to 0.5.15 in /src/agent Bumps [crossbeam-channel](https://github.com/crossbeam-rs/crossbeam) from 0.5.14 to 0.5.15. - [Release notes](https://github.com/crossbeam-rs/crossbeam/releases) - [Changelog](https://github.com/crossbeam-rs/crossbeam/blob/master/CHANGELOG.md) - [Commits](https://github.com/crossbeam-rs/crossbeam/compare/crossbeam-channel-0.5.14...crossbeam-channel-0.5.15) --- updated-dependencies: - dependency-name: crossbeam-channel dependency-version: 0.5.15 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-04-23 11:34:14 +00:00
Steve Horsman	1ffce3ff70	Merge pull request #11173 from stevenhorsman/update-before-install workflows: Add apt update before install	2025-04-23 12:32:54 +01:00
stevenhorsman	ccfdf59607	workflows: Add apt update before install Add apt/apt-get updates before we do apt/apt-get installs to try and help with issues where we fail to fetch packages Co-authored-by: Fabiano Fidêncio <fidencio@northflank.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-23 09:06:08 +01:00
Xynnn007	b1c72c7094	test: add integration test for initdata This test exercises initdata with the following logic: 1. Enable image signature verification via kernel commandline 2. Set Trustee address via initdata 3. Pull an image from a banned registry 4. Check that the pull fails with the log `image security validation failed`, which means the initdata works. Note that if initdata does not work, the pod still fails to launch. But the error information is `[CDH] [ERROR]: Get Resource failed` which internally means that the KBS URL has not been set correctly. This test now only runs on qemu-coco-dev+x86_64 and qemu-tdx Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-23 15:55:04 +08:00
RuoqingHe	ef12dcd7da	Merge pull request #11158 from RuoqingHe/2025-04-15-fix-flag-calc runtime-rs: Use bitwise or assign for bitflags	2025-04-23 15:20:33 +08:00
alex.lyn	9eb3fcb84b	kata-types: Clean up noise caused by unformatted code For a long time, there has been unformatted code in the kata-types codebase, for example: ``` if qemu.memory_info.enable_guest_swap { - return Err(eother!( - "Qemu hypervisor doesn't support enable_guest_swap" - )); + return Err(eother!("Qemu hypervisor doesn't support enable_guest_swap")); } ... - }, device::DRIVER_NVDIMM_TYPE, eother, resolve_path + }, + device::DRIVER_NVDIMM_TYPE, + eother, resolve_path, -use std::collections::HashMap; -use anyhow::{Result, anyhow}; +use anyhow::{anyhow, Result}; use std::collections::hash_map::Entry; +use std::collections::HashMap; -/// DRIVER_VFIO_PCI_GK_TYPE is the device driver for vfio-pci +/// DRIVER_VFIO_PCI_GK_TYPE is the device driver for vfio-pci ``` This has brought unnecessary difficulties in version maintenance and commit difficulties. This commit will address this issue. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:40:07 +08:00
alex.lyn	97a1942f86	kata-types: Optimize memory adjusting by only gathering memory info The Configuration initialization was observed to be significantly slow due to the extensive system information gathering performed by `sysinfo::System::new_all()`. This function collects data on CPU, memory, disks, and network, most of which is unnecessary for Kata's memory adjusting config phase, where only the total system memory is required. This commit optimizes the initialization process by implementing a more targeted approach to retrieve only the total system memory. This avoids the overhead of collecting a large amount of irrelevant data, resulting in a noticeable performance improvement. Fixes #11165 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:40:07 +08:00
alex.lyn	3e77377be0	kata-sys-utils: Add test cases for devices In this, the crate mockall is introduced to help mock get_all_devices. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	f714b6c049	kata-sys-utils: Add test cases for pci manager Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	0cdc05ce0a	kata-sys-utils: Introduce method to help handle proper BAR memory We need more information (BAR memory and other future features...) for PCI devices when vfio devices are passed through. So the method get_bars_max_addressable_memory is introduced for vfio devices to deduce the memory_reserve and pref64_reserve for NVIDIA devices. But it will be extended for other devices. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
alex.lyn	f5eaaa41d5	kata-sys-utils: Introduce pcilibs to help get pci device info It's the basic framework for getting information of pci devices. Currently, we focus on the PCI Max bar memory size, but it'll be extended in the future. Fixes #10556 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-23 09:32:04 +08:00
Ruoqing He	d7f4b6cbef	runtime-rs: Use bitwise or assign for bitflags Use `\|=` instead of `+=` while calculating and iterating through a vector of flags, which makes more sense and prevents situations like duplicated flags in vector, which would cause problems. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-22 23:55:11 +00:00
Jakob Naucke	1c3b1f5adb	ci: Extend basic s390x tests Currently, s390x only tests cri-containerd. Partially converge to the feature set of basic-ci-amd64: - containerd-sandboxapi - containerd-stability - docker with the appropriate hypervisors. Do not run tests currently skipped on amd64, as well as - agent-ctl, which we don't package for s390x - nerdctl, does not package the `full` image for s390x - nydus, does not package for s390x Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-04-22 21:34:02 +02:00
Aurélien Bombo	bf93b5daf1	Merge pull request #11113 from Sumynwa/sumsharma/policy_execprocess_container_id genpolicy: Add container_id & related policy container data to state.	2025-04-22 18:37:58 +01:00
Aurélien Bombo	318c409ed6	Merge pull request #11126 from gkurz/rootfs-systemd-files rootfs: Don't remove files from the rootfs by default	2025-04-22 18:17:14 +01:00
Aurélien Bombo	12594a9f9e	Merge pull request #11157 from wainersm/make_nontee_job_not_required ci: demote CoCo non-TEE to non-required from gatekeeper	2025-04-22 18:15:28 +01:00
Greg Kurz	734e7e8c54	rootfs: Don't remove files from the rootfs by default Recent PR #10732 moved the deletion of systemd files and units that were deemed unnecessary by `02b3b3b977` from `image_builder.sh` to `rootfs.sh`. This unfortunately broke `rootfs.sh centos` and `rootfs.sh -r` as used by some other downstream users like fedora and RHEL, with the following error: Warning FailedCreatePodSandBox 1s (x5 over 63s) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: Establishing a D-Bus connection Caused by: 0: I/O error: Connection reset by peer (os error 104) 1: Connection reset by peer (os error 104) This is because the aforementioned distros use dbus-broker [1] that requires systemd-journald to be present. It is questionable that systemd units or files should be deemed unnecessary for _all_ distros but this has been around since 2019. There's now also a long-standing expectation from CI that `make rootfs && make image` does remove these files. In order to accommodate all the expectations, add a `-d` flag to `rootfs.sh` to delete the systemd files and have `make rootfs` use it. [1] https://github.com/bus1/dbus-broker Reported-by: Niteesh Dubey <niteesh@us.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2025-04-17 16:53:05 +02:00
Zvonko Kaiser	497ab9faaf	Merge pull request #10999 from zvonkok/rootfs-updates gpu: Update creation permissions	2025-04-16 10:15:38 -04:00
Wainer dos Santos Moschetta	90397ca4fe	ci: demote CoCo non-TEE to non-required from gatekeeper The CoCo non-TEE job has failed due to the removal of an add-on from AKS, causing KBS to not get installed (see #11156). The fix should be done in this repo as well as in trustee, which can take some time. We don't want to hold kata-containers PRs from getting merged any longer, so removing the job from required list. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-04-15 19:00:30 -03:00
Wainer Moschetta	ff9fb19f11	Merge pull request #11026 from ldoktor/e2e-resources ci.ocp: Override default runtimeclass CPU resources	2025-04-15 10:33:35 -03:00
Lukáš Doktor	bfdf4e7a6a	ci.ocp: Add peer-pods setup script this script will be used in a new OCP integration pipeline to monitor basic workflows of OCP+peer-pods. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-04-15 12:13:22 +02:00
Xynnn007	91bb6b7c34	runtime: add support for io.katacontainers.config.runtime.cc_init_data io.katacontainers.config.runtime.cc_init_data specifies initdata used by the pod in base64(gzip(initdata toml)) format. The initdata will be encapsulated into an initdata image and mounted as a raw block device to the guest. The initdata image will be aligned with 512 bytes, which is chosen as a usual sector size supported by different hypervisors like qemu, clh and dragonball. Note that this patch only adds support for qemu hypervisor. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-15 16:35:59 +08:00
Sumedh Sharma	2a17628591	genpolicy: Add container_id & related policy container data to state. This commit adds changes to add input container_id and related container data to state after a CreateContainerRequest is allowed. This helps constrain reference container data for evaluating request inputs to one instead of matching against every policy container data, Ex: in ExecProcessRequest inputs. Fixes #11109 Signed-off-by: Sumedh Sharma <sumsharma@microsoft.com>	2025-04-15 14:02:59 +05:30
Zvonko Kaiser	2f28be3ad9	gpu: Update creation permissions We need to make sure the device files are created correctly in the rootfs otherwise kata-agent will apply permission 0o000. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-04-14 21:02:34 +00:00
Fabiano Fidêncio	bfd4b98355	Merge pull request #11142 from fidencio/topic/build-scripts-improvements-for-users build: User-facing improvements for the build scripts	2025-04-14 19:28:12 +02:00
Fabiano Fidêncio	5e363dc277	virtiofsd: Update to v1.13.1 It's been released for some time already ... and although we did have the necessary patches in, we'd better stick to a released version of the project. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 13:23:31 +02:00
Fabiano Fidêncio	2fef594f14	build: Allow users to define AGENT_POLICY This is mostly used for Kata Containers backing Confidential Computing use cases, but it also has benefits for the normal Kata Containers use cases, thus it's left enabled by default. However, let's allow users to specify whether or not they want to have it enabled, as depending on their use-case, it just does not make sense. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:22 +02:00
Fabiano Fidêncio	5d0688079a	build: Allow users to specify EXTRA_PKGS Right now we've had some logic to add EXTRA_PKGS, but those were restricted to the nvidia builds, and would require changing the file manually. Let's make sure a user can add this just by specifying an env var. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:22 +02:00
Fabiano Fidêncio	40a15ac760	build: Allow adding a guest-hook to the rootfs Kata Containers provides, since forever, a way to run OCI guest-hooks from the rootfs, as long as the files are dropped in a specific location defined in the configuration.toml. However, so far, it's been up to the ones using it to hack the generated image in order to add those guest hooks, which is far from handy. Let's add a way for the ones interested in this feature to just drop a tarball file under the same known build directory, specify an env var, and let the guest hooks be installed during the rootfs build. Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-14 10:02:16 +02:00
RuoqingHe	0b4fea9382	Merge pull request #11134 from stevenhorsman/rust-toolchain rust: Add rust-toolchain.toml	2025-04-12 15:03:29 +08:00
Steve Horsman	792180a740	Merge pull request #11105 from stevenhorsman/required-tests-process-update doc: Update required job process	2025-04-11 14:53:27 +01:00
stevenhorsman	93830cbf4d	rust: Add rust-toolchain.toml Add a top-level rust-toolchain.toml with the version that matches version.yaml to ensure that we stay in sync Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-11 09:24:04 +01:00
Steve Horsman	ad68cb9afa	Merge pull request #11106 from stevenhorsman/rust-workspace-settings agent: Inherit rust workspace settings	2025-04-10 09:47:53 +01:00
Xynnn007	17d0db9865	agent: add initdata parse logic Kata-agent now will check if a device /dev/vd* with 'initdata' magic number exists. If it exists, kata-agent will try to read it. Bytes 9~16 are the length of the compressed initdata toml in little endian. Bytes starting from 17 are the compressed initdata. The initdata image device layout looks like 0 8 16 16+length ... EOF 'initdata' length gzip(initdata toml) paddings The initdata will be parsed and put as aa.toml, cdh.toml and policy.rego to /run/confidential-containers/initdata. When AgentPolicy is initialized, the default policy will be overwritten by that. When AA is to be launched, if initdata is once processed, the launch arg will include --initdata parameter. Also, if /run/confidential-containers/initdata/aa.toml exists, the launch args will include -c /run/confidential-containers/initdata/aa.toml. When CDH is to be launched, if initdata is once processed, the launch args will include -c /run/confidential-containers/initdata/cdh.toml Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2025-04-10 13:09:51 +08:00
stevenhorsman	75dc4ce3bf	doc: Update required job process Add information about using required-tests.yaml as a way to track jobs that are required. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 18:13:45 +01:00
Steve Horsman	0dbf4ec39f	Merge pull request #10678 from stevenhorsman/update-gatekeeper-rules-for-md-only-PRs ci: Update gatekeeper tests for md files	2025-04-09 18:10:05 +01:00
stevenhorsman	d1d60cfe89	ci: Update gatekeeper tests for md files Update the required-tests.yaml so that .md files only trigger the static tests, not the build, or CI Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 17:55:27 +01:00
Steve Horsman	9b401cd250	Merge pull request #11090 from stevenhorsman/required-test-updates ci: required-tests fixes/updates	2025-04-09 14:41:57 +01:00
stevenhorsman	576747b060	ci: Skip tests if we only update the required list When making new tests required, or removing existing tests from required, this doesn't impact the CI jobs, so we don't need to run all the tests. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 14:22:47 +01:00
stevenhorsman	9a7c5b914e	ci: required-tests fixes/updates - Remove metrics setup job - Update some truncation typos of job names - Add shellcheck-required - Remove the ok-to-test as a required label on the build test as it isn't needed as a trigger Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 14:22:37 +01:00
Xuewei Niu	5774f131ec	Merge pull request #10938 from Apokleos/fix-iommugrp-symlink runtime-rs: Simplify iommu group base name extraction from symlink	2025-04-09 19:23:48 +08:00
Xuewei Niu	fd9a4548ab	Merge pull request #11129 from RuoqingHe/entend-runtime-rs-workspace runtime-rs: Extend runtime-rs workspace and centralize local dependencies	2025-04-09 19:23:15 +08:00
stevenhorsman	6603cf7872	agent: Update vsock-exporter to use workspace settings To reduce duplication, we could update the vsock-exporter crate to use settings and versions from the agent, where applicable. > [!NOTE] > In order to use the workspace, this has bumped some crate versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 12:02:43 +01:00
stevenhorsman	2cb9fd3c69	agent: Update rustjail to use workspace settings - To reduce duplication, we could update the rustjail crate to use settings and versions from the agent, where applicable. - Also switch to using the derive feature in serde crate rather than the separate serde_derive to avoid keeping both versions in sync > [!NOTE] > In order to use the workspace, this has bumped some crate versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 12:02:43 +01:00
stevenhorsman	655255b50c	agent: Update policy to use workspace settings To reduce duplication, we could update the policy crate to use settings and versions from the agent, where applicable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 11:42:05 +01:00
stevenhorsman	1bec432ffa	agent: Create workspace package and dependencies - Create agent workspace dependencies and package info so that the packages in the workspace can use them - Group the local dependencies together for clarity (like in #11129) Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-09 11:42:00 +01:00
Ruoqing He	28c09ae645	runtime-rs: Put local dependencies into workspace Put local dependencies into workspace to avoid complex path dependencies all over the workspace. This gives an overview of local dependencies this workspace uses, where those crates are located, and simplifies the local dependencies referencing process. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 07:30:29 +00:00
Ruoqing He	3769ad9c0d	runtime-rs: Group local dependencies Judging by the layout of the `Cargo.toml` files, local dependencies are intentionally separated from other dependencies, let's enforce it workspace-wise. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 03:52:16 +00:00
Ruoqing He	abb5fb127b	runtime-rs: Extend workspace to cover all crates Only `shim` and `shim-ctl` are incorporated in `runtime-rs`'s workspace, let's extend it to cover all crates in `runtime-rs/crates`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-09 03:51:48 +00:00
alex.lyn	58bebe332a	runtime-rs: Simplify iommu group base name extraction from symlink Getting the base name from the iommu group symlink is enough, as the validation will be handled in subsequent steps when constructing the full path /sys/kernel/iommu_groups/$iommu_group. This PR removes the duplicated validation of iommu_group. Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-04-09 09:28:00 +08:00
Steve Horsman	8df271358e	Merge pull request #11128 from stevenhorsman/disable-metrics-jobs ci: Remove metric jobs	2025-04-08 18:16:35 +01:00
stevenhorsman	e6cca9da6d	ci: Remove metric jobs The metrics runner is broken, so skip the metrics jobs to stop the CI being stuck waiting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-04-08 17:55:07 +01:00
RuoqingHe	713cbb0c62	Merge pull request #11121 from fidencio/topic/bump-kernel-lts versions: Bump LTS kernel	2025-04-08 17:28:31 +08:00
Xuewei Niu	d3c9cc4e36	Merge pull request #11014 from teawater/mem-agent-doc docs: Add how-to-use-memory-agent.md to howto	2025-04-08 17:20:25 +08:00
Fabiano Fidêncio	a40b919afe	Merge pull request #10724 from likebreath/0109/upgrade_clh_v43.0 versions: Upgrade to Cloud Hypervisor v45.0	2025-04-08 08:11:30 +02:00
Fabiano Fidêncio	bc04c390bd	versions: Bump LTS kernel 6.12.22 was released yesterday, let's bump to it. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-04-07 21:46:29 +02:00
Bo Chen	ee84068aed	versions: Upgrade to Cloud Hypervisor v45.0 Details of this release can be found in our roadmap project as iteration v45.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #10723 Signed-off-by: Bo Chen <bchen@crusoe.ai> Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>	2025-04-07 20:33:34 +02:00
Dan Mihai	8779abd0a1	Merge pull request #11057 from mythi/tdx-qgs-uds runtime: qemu: add support to use TDX QGS via Unix Domain Sockets	2025-04-07 07:27:48 -07:00
Dan Mihai	e606a8deb5	Merge pull request #11103 from Ankita13-code/ankitapareek/policy-input-validation policy: Add missing input validations for ExecProcessRequest	2025-04-07 07:26:24 -07:00
Steve Horsman	ba92639481	Merge pull request #11094 from RuoqingHe/2025-03-28-enable-riscv-assets-build ci: Enable `build-kata-static-tarball-riscv64.yaml`	2025-04-07 11:26:15 +01:00
Fabiano Fidêncio	c75ea2582e	Merge pull request #11114 from fidencio/topic/allow-building-the-agent-without-enabling-guest-pull agent: Allow users to build without guest-pull	2025-04-06 12:17:27 +01:00
Fabiano Fidêncio	e3c98a5ac7	agent: Allow users to build without guest-pull For those not interested in CoCo, let's at least allow them to easily build the agent without the guest-pull feature. This reduces the binary size (already stripped) from 25M to 18M. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-04-04 22:58:43 +01:00
Ankita Pareek	7e450bc1c2	policy: Add missing input validations for ExecProcessRequest This commit introduces missing validations for input fields in ExecProcessRequest to harden the security policy. The changes include: - Update rules.rego to add null/empty field enforcements for String_user, SelinuxLabel and ApparmorProfile - Add unit test cases for ExecProcessRequest for each of the validations Signed-off-by: Ankita Pareek <ankitapareek@microsoft.com>	2025-04-03 12:53:59 +00:00
Hui Zhu	17af28acad	docs: Add how-to-use-memory-agent.md to howto Add how-to-use-memory-agent.md (How to use mem-agent to decrease the memory usage of Kata container) to docs to show how to use mem-agent. Fixes: #11013 Signed-off-by: Hui Zhu <teawater@gmail.com>	2025-04-02 17:45:59 +08:00
Lukáš Doktor	009aa6257b	ci.ocp: Override default runtimeclass CPU resources some of the e2e tests spawn a lot of workers which are mainly idle, but the scheduler fails to schedule them due to cpu resource overcommit. For our testing we are more focused on having actual pods running than the speed of the scheduled pods so let's increase the amount of schedulable pods by decreasing the default cpu requests. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-04-02 10:30:40 +02:00
RuoqingHe	2f134514b0	Merge pull request #11097 from kimullaa/robust-user-input kata-deploy: add INSTALLATION_PREFIX validation	2025-04-02 10:05:03 +08:00
Ruoqing He	96e43fbee5	ci: Enable `build-kata-static-tarball-riscv64.yaml` Previously we introduced `build-kata-static-tarball-riscv64.yaml`, enable that workflow in `ci.yaml`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-04-01 16:35:14 +08:00
RuoqingHe	10ceeb0930	Merge pull request #11104 from fidencio/topic/kata-deploy-create-runtimeclasses-by-default kata-deploy: Create runtimeclasses by default	2025-04-01 10:55:44 +08:00
RuoqingHe	b19a8c7b1c	Merge pull request #11066 from kimullaa/update-command-sample kernel: Update the usage in readme	2025-04-01 09:12:43 +08:00
RuoqingHe	b046f79d06	Merge pull request #11100 from kimullaa/remove-double-slash kata-deploy: remove the double "/"	2025-04-01 08:17:00 +08:00
Shunsuke Kimura	a05f5f1827	kata-deploy: add INSTALLATION_PREFIX validation INSTALLATION_PREFIX must begin with a "/" because it is being concatenated with /host. If it does not, display a message and exit with an error. Fixes: #11096 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-04-01 06:47:30 +09:00
Shunsuke Kimura	a49b6f8634	kata-deploy: Moves the function to the top Move functions that may be used in validation to the top. Fixes: #11097 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-04-01 06:47:30 +09:00
Zvonko Kaiser	d81a1747bd	Merge pull request #11085 from kevinzs2048/fix-virtiomem runtime-go: qemu: Fix sandbox start failing with virtio-mem enable on arm64	2025-03-31 17:09:43 -04:00
Zvonko Kaiser	e5c4cfb8a1	Merge pull request #11081 from BbolroC/unsealed-secret-fix tests: Enable sealed secrets for all TEEs	2025-03-31 11:19:52 -04:00
Shunsuke Kimura	c0af0b43e0	kernel: Update the outdated usage in the readme Since it is difficult to keep the README updated when modifying the options of ./build-kernel.sh, instead of updating the README, we encourage users to run the -h command. Fixes: #11065 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-31 23:29:58 +09:00
Shunsuke Kimura	902cb5f205	kata-deploy: remove the double "/" Currently, ConfigPath in containerd.toml is a double "/" as follows. ``` [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-clh.options] ConfigPath = "/opt/kata/share/defaults/kata-containers//configuration-clh.toml" ... [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-cloud-hypervisor.options] ConfigPath = "/opt/kata/share/defaults/kata-containers//runtime-rs/configuration-cloud-hypervisor.toml" ... ``` So, removed the double "/". Fixes: #11099 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-31 22:31:36 +09:00
Fabiano Fidêncio	28be53ac92	kata-deploy: Create runtimeclasses by default Let's make the life of the users easier and create the runtimeclasses for them by default. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-31 11:29:44 +01:00
Xuewei Niu	abbc9c6b50	Merge pull request #11101 from RuoqingHe/runtime-rs-fix-fmt-check runtime-rs: Remove redundant empty line	2025-03-31 16:28:55 +08:00
Ruoqing He	3c78c42ea5	runtime-rs: Remove redundant empty line While running `cargo fmt -- --check` in the `src/runtime-rs` directory, it errors out, suggesting there is a redundant empty line, which prevents `make check` of the `runtime-rs` component from passing. Remove the redundant empty line to fix this. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-31 00:39:04 +08:00
Steve Horsman	44bab5afc4	Merge pull request #11091 from fidencio/topic/ci-add-kata-deploy-tests-as-required gatekeeper: Add kata-deploy tests as required	2025-03-28 11:05:03 +00:00
Fabiano Fidêncio	5a08d748b9	Merge pull request #11088 from kimullaa/fix-cleanup-failure kata-deploy: Fix kata-cleanup's CrashLoopBackOff	2025-03-27 20:33:52 +01:00
Fabiano Fidêncio	700944c420	gatekeeper: Add kata-deploy tests as required kata-deploy tests have been quite stable, working for more than 10 days without any nightly failure (or any failure reported at all), and I'll be the one maintaining those. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-27 19:47:38 +01:00
Steve Horsman	97bd311a66	Merge pull request #11058 from stevenhorsman/required-static-checks-rename ci: Update static-checks strings	2025-03-27 12:56:28 +00:00
Xuewei Niu	54dcf0d342	Merge pull request #11056 from RuoqingHe/runtime-qemu-riscv runtime: Support and enable build on riscv64	2025-03-27 17:02:21 +08:00
Fabiano Fidêncio	047b7e1fb7	Merge pull request #11063 from lifupan/fix_compile runtime-rs: update the protobuf to 3.7.1	2025-03-27 09:52:20 +01:00
Fabiano Fidêncio	41b536d487	Merge pull request #11059 from microsoft/danmihai1/tests-common tests: k8s: clean-up shellcheck warnings in tests_common.sh	2025-03-27 09:51:49 +01:00
Shunsuke Kimura	9ab6ab9897	kata-deploy: Fix kata-cleanup's CrashLoopBackOff Since kata-deploy.sh references an undefined variable, kata-cleanup.yaml enters a CrashLoopBackOff state. ``` $ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup.yaml daemonset.apps/kubelet-kata-cleanup created $ kubectl get pods -n kube-system kubelet-kata-cleanup-zzbd2 0/1 CrashLoopBackOff 3 (33s ago) 80s $ kubectl logs -n kube-system daemonsets/kubelet-kata-cleanup /opt/kata-artifacts/scripts/kata-deploy.sh: line 19: SHIMS: unbound variable ``` Therefore, set an initial value for the environment variables. Fixes: #11083 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-27 15:00:19 +09:00
Hyounggyu Choi	0432d2fcdf	Merge pull request #11086 from BbolroC/fix-overwrite-containerd-config tests: Make sure /etc/containerd before writing config	2025-03-27 05:57:31 +01:00
Ruoqing He	46caa986bb	ci: Skip tests depend on virtualization on riscv64 `VMContainerCapable` requires a present `kvm` device, which is not yet available in our RISC-V runners. Skip the related tests when running on `riscv-builder`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:47:49 +08:00
Ruoqing He	7f0b1946c5	ci: Enable build-check for runtime on riscv64 `runtime` support for riscv64 is now ready, let's enable building and testing of that component. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:38:30 +08:00
Yuting Nie	1f52f83309	runtime: Enable kata-check test on riscv64 Provide according tests to cover `kata-runtime` package, test `kata-runtime`'s `check` functionality on riscv64 platforms. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:36:55 +08:00
Yuting Nie	b6924ef5e5	runtime: Add getExpectedHostDetails for riscv64 Add `getExpectedHostDetails` with expected value according to template defined in `kata-check_data_riscv64_test.go`. This provides necessary `HostInfo` for tests to cover `kata-check_riscv64.go`. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:34:34 +08:00
Yuting Nie	594c5e36a6	runtime: Add mock data for kata-check Add definition of `testCPUInfoTemplate` which is retrieved from `/proc/cpuinfo` of a QEMU emulated virtual machine on virt board. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:33:42 +08:00
Yuting Nie	0ff5cb1e66	runtime: Enable testSetCPUTypeGeneric for riscv64 `testSetCPUTypeGeneric` will be used for writing `kata-check` in `kata-runtime` on riscv64 platforms, enable building for later testing. Signed-off-by: Yuting Nie <nieyuting@iscas.ac.cn>	2025-03-27 10:32:29 +08:00
Ruoqing He	2329aeec38	runtime: Disable race flag for riscv64 `-race` flag used for `go test` is not yet supported on riscv64 platforms, disable it for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:28:53 +08:00
Ruoqing He	1b4dbebb1b	runtime: Enable runtime to build on riscv64 Convert Rust arch to Go arch in Makefile, and add `riscv64-options.mk` to provide definitions required for runtime to build on riscv64. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:22:55 +08:00
Ruoqing He	805da14634	runtime: Enable runtime check for riscv64 Enable the `kata-runtime check` command to work on riscv64 platforms to make sure required features/devices are present. Co-authored-by: Yuting Nie <nieyuting@iscas.ac.cn> Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:07:09 +08:00
Ruoqing He	96b2d25508	runtime: Define default values for QEMU riscv Provide default values while invoking QEMU as the hypervisor for Go runtime on riscv64 platform. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 10:05:36 +08:00
Ruoqing He	1662595146	runtime: Introduce riscv64 to govmm pkg Define `vmm` for riscv64, set `MaxVCPUs` to 512 as the QEMU RISC-V virt Generic Virtual Platform [1] defines. [1] https://www.qemu.org/docs/master/system/riscv/virt.html Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:57:49 +08:00
Ruoqing He	1e4963a3b2	runtime: Define availableGuestProtection for riscv64 `GuestProtection` feature is not made available yet, return `noneProtection` for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:34:53 +08:00
Ruoqing He	4947938ce8	runtime: Introduce riscv64 template for vm factory Set `templateDeviceStateSize` to 8 as other architectures did. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-27 09:28:32 +08:00
Zvonko Kaiser	b7cf4fd2e6	Merge pull request #11053 from ldoktor/ci ci: shellcheck fixes	2025-03-26 13:22:56 -04:00
Hyounggyu Choi	1e187482d4	tests: Make sure /etc/containerd before writing config We get the following error while writing the containerd config if the base dir `/etc/containerd` does not exist: ``` sudo tee /etc/containerd/config.toml << EOF ... EOF tee: /etc/containerd/config.toml: No such file or directory ``` The commit makes sure the base directory for containerd exists before writing the config and drops the config file deletion because the default behaviour of `tee` is overwriting. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 18:19:45 +01:00
Hyounggyu Choi	0aa76f7206	tests: Enable sealed secrets for TEEs Fixes: #11011 This commit allows all TEEs to run the sealed secret test. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:50:41 +01:00
Hyounggyu Choi	423ad8341d	agent: Call cdh_handler for sealed secrets after add_storage() As reported in #11011, mounted secrets are available after a container image is pulled by add_storage() for IBM SE. But secure mount should be handled before the `add_storage()`. Therefore, this commit divides cdh_handler() into: - cdh_handler_trusted_storage() - cdh_handler_sealed_secrets() and calls cdh_handler_sealed_secrets() after add_storage() while keeping cdh_handler_trusted_storage() unchanged. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:50:41 +01:00
Fabiano Fidêncio	7a0ac55f22	Merge pull request #10984 from fidencio/topic/tests-kata-deploy-ground-work-to-rewrite-the-tests tests: kata-deploy: The rest of the ground work to rewrite the kata-deploy tests	2025-03-26 17:47:48 +01:00
Hyounggyu Choi	8088064b8b	tests: Set default policy before running sealed secrets tests The test `Cannot get CDH resource when deny-all policy is set` completes with a KBS policy set to deny-all. This affects the future TEE test (e.g. k8s-sealed-secrets.bats) which makes a request against KBS. This commit introduces kbs_set_default_policy() and puts it to the setup() in k8s-sealed-secrets.bats. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-26 17:44:38 +01:00
Jakob Naucke	d808cef2fb	agent: AP bind-associate for Secure Execution Kata Containers has support for both the IBM Secure Execution trusted execution environment and the IBM Crypto Express hardware security module (used via the Adjunct Processor bus), but using them together requires specific steps. In Secure Execution, the Acceleration and Enterprise PKCS11 modes of Crypto Express are supported. Both modes require the domain to be _bound_ in the guest, and the latter also requires the domain to be _associated_ with a _guest secret_. Guest secrets must be submitted to the ultravisor from within the guest. Each EP11 domain has a master key verification pattern (MKVP) that can be established at HSM setup time. The guest secret and its ID are to be provided at `/vfio_ap/{mkvp}/secret` and `/vfio_ap/{mkvp}/secret_id` via a key broker service respectively. Bind each domain, and for each EP11 domain, - get the secret and secret ID from the addresses above, - submit the secret to the ultravisor, - find the index of the secret corresponding to the ID, and - associate the domain to the index of this secret. To bind, add the secret, parse the info about the domain, and associate, the s390_pv_core crate is used. The code from this crate also does the AP online check, which can be removed from here. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-03-26 16:37:23 +01:00
Kevin Zhao	211a36559c	runtime-go: qemu: Fix sandbox start failing with virtio-mem enable on arm64 Also add CONFIG_VIRTIO_MEM to arm64 platform Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-03-26 22:31:00 +08:00
Fabiano Fidêncio	404e212102	tests: kata-deploy: Use helm_helper() With this we switch to fully testing with helm, instead of testing with the kustomizations (which will soon be removed). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:30:15 +01:00
Fabiano Fidêncio	f7976a40e4	tests: Create a helm_helper() common function Let's use what we have in the k8s functional tests to create a common function to deploy kata containers using our helm charts. This will help us immensely in the kata-deploy testing side in the near future. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:30:11 +01:00
Fabiano Fidêncio	eb884d33a8	tests: k8s: Export all the default env vars on gha-run.sh This is not strictly needed, but it does help a lot when setting up a cluster manually, while still relying on those scripts. While here, let's also ensure the assignment is between quotes, to make shellchecker happier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-26 13:23:16 +01:00
Saul Paredes	ae5c587efc	Merge pull request #11074 from Sumynwa/sumsharma/genpolicy_test genpolicy: Refactor tests to allow different request types in a testcases json.	2025-03-25 12:38:19 -07:00
Sumedh Sharma	3406df9133	genpolicy: Refactor tests to add different request types in testcases json This commit introduces changes to add test data for multiple request types in a single testcases.json file. This allows for stateful testing, for example: enable testing ExecProcessRequest using policy state set after testing a CreateContainerRequest. Fixes #11073. Signed-off-by: Sumedh Sharma <sumsharma@microsoft.com>	2025-03-25 13:52:17 +05:30
Mikko Ylinen	85f3391bcf	runtime: qemu: add support to use TDX QGS via Unix Domain Sockets TDX Quote Generation Service (QGS) signs TDREPORT sent to it from Qemu (GetQuote hypercall). Qemu needs quote-generation-socket address configured for IPC. Currently, Kata govmm only enables vsock based IPC for QGS but QGS supports Unix Domain Sockets too which works well for host process to process IPC (Qemu <-> QGS). The QGS configuration to enable UDS is to run the service with "-port=0" parameter. The same works well here too: setting "tdx_quote_generation_service_socket_port=0" lets users enable UDS based IPC. The socket path is fixed in QGS and cannot be configured: when "-port=0" is used, the socket appears in /var/run/tdx-qgs/qgs.socket. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-25 10:18:40 +02:00
RuoqingHe	7a704453b6	Merge pull request #11075 from microsoft/danmihai1/genpolicy-debug-build genpolicy: add support for BUILD_TYPE=debug	2025-03-25 14:59:15 +08:00
RuoqingHe	5d68600c06	Merge pull request #11010 from stevenhorsman/metrics-containerd-debugging metrics: Test improvements	2025-03-25 11:38:28 +08:00
Dan Mihai	15c9035254	genpolicy: add support for BUILD_TYPE=debug Use "cargo build --release" when BUILD_TYPE was not specified, or when BUILD_TYPE=release. The default "cargo build" behavior is to build in debug mode. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-24 16:10:20 +00:00
Jakob Naucke	683a482d64	protos: Add CDH GetResourceService Add service to get arbitrary data from Confidential Data Hub. Taken from https://github.com/confidential-containers/guest-components/tree/main/api-server-rest. Marked as `#[allow(dead_code)]` because planned use is architecture-specific at this time. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-03-24 15:46:40 +00:00
RuoqingHe	f6a1c6d0e0	Merge pull request #11069 from kimullaa/exit-if-action-is-invalid kata-deploy: return exit code for invalid argument	2025-03-24 09:40:39 +08:00
Shunsuke Kimura	e5d7414c33	kata-deploy: Return exit code for invalid argument It hangs when invalid arguments are specified. ```bash kata-deploy-6sr2p:/# /opt/kata-artifacts/scripts/kata-deploy.sh xxx Action: * xxx ... Usage: /opt/kata-artifacts/scripts/kata-deploy.sh [install/cleanup/reset] ERROR: invalid arguments ... ^C <- hang ``` I changed it to behave the same as when there are no arguments. ```bash kata-deploy-6sr2p:/# /opt/kata-artifacts/scripts/kata-deploy.sh Usage: /opt/kata-artifacts/scripts/kata-deploy.sh [install/cleanup/reset] ERROR: invalid arguments kata-deploy-6sr2p:/# echo $? 1 ``` Fixes: #11068 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2025-03-22 21:32:38 +09:00
Aurélien Bombo	17baa6199b	Merge pull request #11061 from RuoqingHe/2025-03-21-generalize-non-kvm ci: Generalize `GITHUB_RUNNER_CI_ARM64`	2025-03-21 15:23:51 -05:00
Fupan Li	4b93176225	runtime-rs: update the protobuf to 3.7.1 Since some files generated by protobuf are shared between runtime-rs and the kata agent, and the kata agent's dependency image-rs depends on protobuf@3.7.1, we had better keep the protobuf version aligned between runtime-rs and the agent; otherwise, we cannot compile runtime-rs and the agent at the same time. Fixes: https://github.com/kata-containers/kata-containers/issues/10650 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-21 17:46:12 +08:00
Ruoqing He	5e81f67ceb	ci: Generalize GITHUB_RUNNER_CI_ARM64 `GITHUB_RUNNER_CI_ARM64` is turned on for self-hosted runners without virtualization to skip the tests that depend on virtualization. This may happen to other archs/runners as well, let's generalize it to `GITHUB_RUNNER_CI_NON_VIRT` so we can reuse it on other archs. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-21 09:49:44 +08:00
RuoqingHe	e84f7c2c4b	Merge pull request #11046 from mythi/drop-dcap-libs build: drop libtdx-attest	2025-03-21 09:23:33 +08:00
Dan Mihai	835c6814d7	tests: k8s/tests_common: avoid using regex More straightforward implementation of hard_coded_policy_tests_enabled, that avoids ShellCheck warning: warning: Remove quotes from right-hand side of =~ to match as a regex rather than literally. [SC2076] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 22:23:19 +00:00
Dan Mihai	d83b8349a2	tests: policy: avoid using caller's variable Fix unintended use of caller's variable. Use the corresponding function parameter instead. ShellCheck: warning: policy_settings_dir is referenced but not assigned. [SC2154] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	59a70a2b28	tests: k8s/tests_common: avoid masking return values Avoid masking command return values by declaring and only then assigning. ShellCheck: warning: Declare and assign separately to avoid masking return values. [SC2155] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	b895e3b3e5	tests: k8s/tests_common.sh: add variable assignments Pick up the values exported by other scripts. ShellCheck: warning: AUTO_GENERATE_POLICY is referenced but not assigned. [SC2154] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	0f4de1c94a	tests: tests_common: remove useless assignment ShellCheck: warning: This assignment is only seen by the forked process. [SC2097] warning: This expansion will not see the mentioned assignment. [SC2098] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:29 +00:00
Dan Mihai	9c0d069ac7	tests: tests_common: prevent globbing and word splitting ShellCheck: note: Double quote to prevent globbing and word splitting. [SC2086] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	15961b03f7	tests: k8s/tests_common.sh: -n instead of ! -z ShellCheck: note: Use -n instead of ! -z. [SC2236] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	4589dc96ef	tests: k8s/tests_common.sh: add double quoting ShellCheck: note: Prefer double quoting even when variables don't contain special characters. [SC2248] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	cc5f8d31d2	tests: k8s/tests_common.sh: add braces ShellCheck: add braces around variable references: note: Prefer putting braces around variable references even when not strictly required. [SC2250] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	0d3f9fcee1	tests: tests_common: export variables used externally ShellCheck: export variables used outside of tests_common.sh - e.g., warning: timeout appears unused. Verify use (or export if used externally). [SC2034] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	5df43ffc7c	tests: k8s/tests_common.sh: Prefer [[ ]] over [ ] Replace [ ] with [[ ]] as advised by shellcheck: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-20 19:02:28 +00:00
Dan Mihai	f79fabab24	Merge pull request #11024 from microsoft/danmihai1/empty-exec-output tests: k8s: retry "kubectl exec" on empty output	2025-03-20 11:03:08 -07:00
stevenhorsman	70d32afbb7	ci: Remove metrics tests from required list The metrics tests haven't been stable, or required through github, for many weeks now, so update the required-tests.yaml list to re-sync Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-20 16:03:03 +00:00
stevenhorsman	607b27fd7f	ci: Update static-checks strings With the refactor in #10948 the names of the static checks have changed, so update these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-20 13:45:57 +00:00
Mikko Ylinen	f52a565834	build: drop libtdx-attest With the latest CoCo guest-components, tdx-attester no longer depends on libtdx-attest. Stop installing it to the rootfs. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-20 10:45:30 +02:00
Steve Horsman	c0632f847f	Merge pull request #11043 from stevenhorsman/3.15.0-release release: Bump version to 3.15.0	2025-03-20 07:38:20 +00:00
Greg Kurz	e19b81225c	Merge pull request #11045 from kata-containers/sprt/fix-gha-tag security: ci: Pin third-party actions to commit hashes	2025-03-20 08:14:06 +01:00
Aurélien Bombo	a678046d13	gha: Pin third-party actions to commit hashes A popular third-party action has recently been compromised [1][2] and the attacker managed to point multiple git version tags to a malicious commit containing code to exfiltrate secrets. This PR follows GitHub's recommendation [3] to pin third-party actions to a full-length commit hash, to mitigate such attacks. Hopefully actionlint starts warning about this soon [4]. [1] https://www.cve.org/CVERecord?id=CVE-2025-30066 [2] https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised [3] https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions#using-third-party-actions [4] https://github.com/rhysd/actionlint/pull/436 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-19 13:52:49 -05:00
stevenhorsman	fad248ef09	release: Bump version to 3.15.0 Bump VERSION and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 17:28:06 +00:00
Fabiano Fidêncio	a6e5d28a15	Merge pull request #11055 from stevenhorsman/bump-github.com/containerd/containerd/v1.7.27 runtime: Update github.com/containerd/containerd	2025-03-19 18:19:10 +01:00
stevenhorsman	cb7c599180	runtime: Switch from deprecated tracer `go.opentelemetry.io/otel/trace.NewNoopTracerProvider` is deprecated now, so switch to `go.opentelemetry.io/otel/trace/noop.NewTracerProvider` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 14:22:06 +00:00
stevenhorsman	8f22b07aba	runtime: Update github.com/containerd/containerd Update to 1.7.27 to resolve CVE-2024-40635 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-19 13:48:04 +00:00
Lukáš Doktor	d708866b2a	ci.ocp: shellcheck various fixes various manual fixes. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:28 +01:00
Lukáš Doktor	7e11489daf	ci: shellcheck - collection of fixes manual fixes of various issues. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:23 +01:00
Lukáš Doktor	f62e08998c	ci: shellcheck - remove unused argument the "-a" argument was introduced with this tool but never was actually used. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:19 +01:00
Lukáš Doktor	02deb1d782	ci: shellcheck SC2248 SC2248 (style): Prefer double quoting even when variables don't contain special characters, might result in arguments difference, shouldn't in our cases. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:16 +01:00
Lukáš Doktor	d80e7c7644	ci: shellcheck SC2155 SC2155 (warning): Declare and assign separately to avoid masking return values, should be harmless. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:12 +01:00
Lukáš Doktor	6552ac41e0	ci: shellcheck SC2086 SC2086 Double quote to prevent globbing and word splitting, might break places where we deliberately use word splitting, but we are not using it here. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:08 +01:00
Lukáš Doktor	154a4ddc00	ci: shellcheck SC2292 SC2292 (style): Prefer [[ ]] over [ ] for tests in Bash/Ksh. This might result in different handling of globs and some ops which we don't use. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:26:03 +01:00
Lukáš Doktor	667e26036c	ci: shellcheck SC2250 Treat the SC2250 require-variable-braces in CI. There are no functional changes. Related to: #10951 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-03-19 12:25:44 +01:00
Zvonko Kaiser	d37d9feee9	Merge pull request #11035 from kata-containers/sprt/fix-dependabot security: ci: Remove `replace` directives in go.mod files	2025-03-18 12:43:46 -04:00
Steve Horsman	ba5b0777b5	Merge pull request #11002 from fitzthum/bump-gc-0130 Bump Trustee and Guest Components for coco v0.13.0	2025-03-17 16:31:23 +00:00
RuoqingHe	36d2dee3a4	Merge pull request #11042 from RuoqingHe/runtime-rs-riscv runtime-rs: Support and enable build on riscv64	2025-03-17 21:42:15 +08:00
Ruoqing He	cb7508ffdc	ci: Enable runtime-rs component build-check on riscv64 `runtime-rs` is now buildable and testable on riscv64 platforms, enable `build-check` on `runtime-rs`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:38:59 +08:00
Steve Horsman	f308cbba93	Merge pull request #11015 from AdithyaKrishnan/main CI: Mark SNP as a Required test	2025-03-17 09:27:28 +00:00
Ruoqing He	084fb2d780	runtime-rs: Enable RISC-V build Define `riscv64gc-options.mk` to enable `runtime-rs` to be built on RISC-V platforms. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:22:48 +08:00
Ruoqing He	fd6c16e209	kata-sys-util: Set NoProtection for riscv64 `available_guets_protection` is required for `runtime-rs` to infer while building it on riscv64 platforms. Set it to `NoProtection` as riscv64 does not support guest protection for now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-17 17:22:48 +08:00
Aurélien Bombo	26bd7989b3	csi-kata-directvolume: Remove `replace` in go.mod Running `go mod tidy` and `go mod vendor` after this resulted in no-ops. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	b965fe8239	tests: Run `go mod vendor` `go mod tidy` was a no-op. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	e9f88757ba	tests: Remove `replace` directives in go.mod Same rationale as for runtime. With tests, the blackfriday replacement was actually meaningful, so I refactored some imports. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	35c92aa6ad	runtime: Run `go mod vendor` Regenerating go module files. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	fa0f85e8b0	runtime: Run `go mod tidy` Tidying up go.mod. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Aurélien Bombo	c3a9c70d45	runtime: Remove `replace` directives in go.mod These replace directives aren't understood by dependabot, hence dependabot can claim to upgrade a dependency, while a replace directive still makes the dependency point to an old version. Fixes: #11020 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-03-14 18:00:36 +00:00
Adithya Krishnan Kannan	32dbee8d7e	CI: Mark SNP as a Required test The SNP CI has been consistently passing and we request the @kata-containers/architecture-committee to mark this test as a required test. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2025-03-14 12:48:55 -05:00
Dan Mihai	dab981b0bc	tests: k8s: retry "kubectl exec" on empty output Retry "kubectl exec" a few times if it unexpectedly produced an empty output string. This is an attempt to work around test failures similar to: https://github.com/kata-containers/kata-containers/actions/runs/13840930994/job/38730153687?pr=10983 not ok 1 Environment variables (from function `grep_pod_exec_output' in file tests_common.sh, line 394, in test file k8s-env.bats, line 36) `grep_pod_exec_output "${pod_name}" "HOST_IP=$[0-9]\+\(\.\\|$$\)\{4\}" "${exec_command[@]}"' failed That test obtained correct output from "sh -c printenv" one time, but the second execution of the same command returned an empty output string. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-14 17:03:03 +00:00
Tobin Feldman-Fitzthum	b7786fbcf0	agent: update image-rs for coco v0.13.0 image-rs has gotten a number of significant updates, eliminating corner cases with obscure containers, improving support for local certs, and more. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:10 -05:00
Tobin Feldman-Fitzthum	63ec1609bc	versions: update guest-components for coco v0.13.0 Update to the latest hash of guest-components. This will pick up some nice new features including using ec key for the rcar handshake. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:10 -05:00
Tobin Feldman-Fitzthum	c352905998	versions: bump trustee for coco v0.13.0 Update to new hashes for Trustee. The MSRV for Trustee is now 1.80.0 so bump the rust toolchain as well. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-03-14 10:44:04 -05:00
Steve Horsman	7968a3c09d	Merge pull request #11028 from Amulyam24/hooks gha: use runner hooks instead of pre/post scripts for ppc64le runners	2025-03-14 15:43:27 +00:00
stevenhorsman	1022d8d260	metrics: Update range for clh tests In `ef0e8669fb` we had been seeing some significantly lower minvalues in the jitter.Result test, so I lowered the mid-value rather than having a very high minpercent, but it appears that the variability of this result is very high, so we are still getting the occasional high value, so reset the midval and just have a bigger ranges on both sides, to try and keep the test stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:54:30 +00:00
stevenhorsman	d77008b817	metrics: Further reduce repeats for boot time tests on qemu I've seen failures on the third run, so reduce it further to just run twice on qemu Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
stevenhorsman	97151cce4e	metrics: Improve iperf timeout The kubectl wait has a built in timeout of 30s, so wrapping it in waitForProcess, means we have 180/2 * 30 delay, which is much longer than intended, so just set the timeout directly. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-14 14:53:26 +00:00
Amulyam24	becb760e32	gha: use runner hooks instead of pre/post scripts for ppc64le runners This PR makes changes to remove steps to run scripts for preparing and cleaning the runner and instead use runner hooks env variables to manage them. Fixes: #9934 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2025-03-14 17:12:54 +05:30
RuoqingHe	af4058fa82	Merge pull request #10889 from katexochen/p/config-idblock-qemu runtime: make SNP IDBlock configurable	2025-03-14 16:23:05 +08:00
Paul Meyer	a994f142d0	runtime: make SNP IDBlock configurable For a use case, we want to set the SNP IDBlock, which allows configuring the AMD ASP to enforce parameters like expected launch digest at launch. The struct with the config that should be enforced (IDBlock) is signed. The public key is placed in the auth block and the signature is verified by the ASP before launch. The digest of the public key is also part of the attestation report (ID_KEY_DIGESTS). Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-03-14 07:50:54 +01:00
RuoqingHe	810a6dafad	Merge pull request #10939 from mchtech/fix-unbound-var tools: initialize unbound variables in rootfs.sh	2025-03-14 08:22:05 +08:00
Saul Paredes	b7087eb0ea	Merge pull request #10983 from microsoft/cameronbaird/updateinterfacerequest-hardening-upstream genpolicy: Introduce UpdateInterfaceRequest rules in genpolicy-settings	2025-03-13 16:12:03 -07:00
Dan Mihai	b910daf625	Merge pull request #11012 from microsoft/saulparedes/validate_generated_name_upstr policy: validate pod generated name	2025-03-13 14:09:57 -07:00
Steve Horsman	199b16f053	Merge pull request #11022 from microsoft/danmihai1/polist-test-volume-path tests: k8s-policy-pod: safer host path volume source	2025-03-13 20:26:06 +00:00
Dan Mihai	0e26dd4ce8	tests: k8s-policy-pod: safer host path volume source Test using the host path /tmp/k8s-policy-pod-test instead of /var/lib/kubelet/pods. /var/lib/kubelet/pods might happen to contain files that CopyFileRequest would try to send to the Guest before CreateContainerRequest. Such CopyFileRequest was an unintended side effect of this test. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-13 18:56:57 +00:00
Cameron Baird	bceffd5ff6	genpolicy: Introduce UpdateInterfaceRequest rules in genpolicy-settings Introduce rules for UpdateInterfaceRequest and genpolicy tests for them. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-13 17:30:01 +00:00
Saul Paredes	1c406e9c1d	Merge pull request #11004 from microsoft/cameronbaird/updateroutesrequest-hardening-upstream genpolicy: Introduce UpdateRoutesRequest rules in genpolicy-settings	2025-03-13 10:11:39 -07:00
Saul Paredes	7a5db51c80	policy: validate pod generated name Validate sandbox name using a regex. If the YAML specifies metadata.name, use a regex that exact matches. If the YAML specifies metadata.generateName, use a regex that matches the prefix of the generated name. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2025-03-13 09:49:57 -07:00
Steve Horsman	e6a78e64e6	Merge pull request #10967 from stevenhorsman/coco-tests-required ci: Add coco required tests	2025-03-13 15:10:22 +00:00
mchtech	0e61eb215d	tools: initialize unbound variables in rootfs.sh Initialize unbound variables in rootfs.sh for RHEL series OS. Signed-off-by: mchtech <michu_an@126.com>	2025-03-13 22:57:43 +08:00
Fupan Li	592d58ca52	Merge pull request #11001 from RuoqingHe/enable-riscv-kernel-build kernel: Support and enable riscv kernel build	2025-03-13 19:28:00 +08:00
Ruoqing He	e0fb8f08d8	ci: Add riscv-builder to actionlint.yaml We have three SG2042 connected and labeled as `riscv-builder`, add that entry to `actionlint.yaml` to help linting while setting up workflows. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	a7e953c7a7	ci: Enable static-tarball build for riscv64 Enable `kernel` and `virtiofsd` static-tarball build for riscv64. Since `virtiofsd` was previously supported and `kernel` is supported now. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	3c8a8ca9c2	kernel: Enable riscv kernel build Modify `build-kernel.sh` to enable building of riscv64 kernel. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	e316f633d8	kernel: Bump kata_config_version Bump kata_config_version since riscv kernel build is introduced. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	31446b8be8	kernel: Skip ACPI common fragment for riscv ACPI is not yet ratified and is still frequently evolving, disable acpi.conf for riscv architecture. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	ebd1214b2e	kernel: Introduce riscv mmu fragment conf Memory hotplug and related features are required, enable them in `mmu.conf`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	734f5d07a9	kernel: Introduce riscv pci fragment conf AIA (Advanced Interrupt Architecture) is available and enabled by default after v6.10 kernel, provide pci.conf to make proper use of IMSIC of AIA. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Ruoqing He	19d78ca844	kernel: Introduce riscv base fragment conf Create `riscv` folder for riscv64 architecture to be inferred while constructing kernel configuration, and introduce `base.conf` which builds 64-bit kernel and with KVM built-in to kernel. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-13 13:43:29 +08:00
Cameron Baird	cf129f3744	genpolicy: Introduce UpdateRoutesRequest rules in genpolicy-settings Introduce rule to block routes from source addresses which are the loopback. Block routes added to the lo device. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-12 19:03:57 +00:00
Dan Mihai	71d4ad5fca	Merge pull request #11003 from microsoft/mahuber/grpc-1-58-3 runtime: upgrade grpc vendor dependency	2025-03-12 09:23:07 -07:00
Wainer Moschetta	8c2d1b374c	Merge pull request #10892 from ldoktor/webhook ci: Change the way we modify runtimeclass in webhook	2025-03-12 12:32:45 -03:00
RuoqingHe	386fed342c	Merge pull request #10990 from kata-containers/shell-check-vendor-skip workflows: shellcheck: Expand vendor ignore	2025-03-12 21:34:26 +08:00
Alex Lyn	fdc0d81198	Merge pull request #10994 from teawater/swap7 runtime-rs: Add guest swap support	2025-03-12 17:59:00 +08:00
Hui Zhu	796eab3bef	runtime-rs: Update swap option of configuration file Remove swap configuration from qemu config file because runtime-rs qemu support code doesn't support hotplug block device. Add swap configuration to dragonball and cloud-hypervisor config file. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-12 13:51:35 +08:00
Dan Mihai	4f41989a6a	Merge pull request #11009 from mythi/e2e-skip-flaky-tests tests: k8s: skip trusted storage tests for qemu-tdx	2025-03-11 12:13:35 -07:00
Dan Mihai	e40251d9f8	Merge pull request #11006 from ryansavino/fix-confidential-ssh-dockerfile tests: fix confidential ssh Dockerfile	2025-03-11 11:22:23 -07:00
Aurélien Bombo	33f3a8cf5f	Merge pull request #10973 from microsoft/danmihai1/main ci: temporarily avoid using the Mariner Host image	2025-03-11 10:24:00 -05:00
Steve Horsman	420b282279	Merge pull request #10948 from RuoqingHe/better-matrix ci: Refactor matrix for `build-checks`	2025-03-11 14:13:10 +00:00
Mikko Ylinen	71531a82f4	tests: k8s: skip trusted storage tests for qemu-tdx follow other TEEs to skip trusted storage tests due to #10838. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-03-11 15:14:03 +02:00
Hui Zhu	93cd30862d	libs: Add AddSwapPath to service AgentService AddSwap sends the PCI path to the guest kernel so that it can add a swap device. But some MMIO devices don't have a PCI path. To support them, add AddSwapPath, which sends virt_path to the guest kernel as the swap device. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 16:02:48 +08:00
Hui Zhu	7787340ab6	runtime-rs: Add guest swap support This commit adds guest swap support. When the configuration option enable_guest_swap is enabled, runtime-rs will start a swap task. When the VM starts or updates the guest memory, the swap task will be woken up to create and insert a swap file. Before doing this job, the swap task will sleep for some seconds (set by the configuration option guest_swap_create_threshold_secs) to reduce the impact on guest kernel boot performance and prevent the insertion of multiple swap files due to frequent memory elasticity within a short period. The size of the swap file is set by the configuration option guest_swap_size_percent, the percentage of the total memory to be used as the swap device. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 16:02:31 +08:00
Hui Zhu	4cd9d70c4d	runtime-rs: Add is_direct to struct BlockConfig Add is_direct to struct BlockConfig. This option specifies cache-related options for block devices. It denotes whether use of O_DIRECT (bypassing the host page cache) is enabled. If not set, the configuration option block_device_cache_direct is used. Fixes: #10988 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-11 15:44:40 +08:00
Ryan Savino	1dbe3fb8bc	tests: fix confidential ssh Dockerfile Need to set correct permissions for ssh directories and files Fixes: #11005 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-03-10 18:31:05 -05:00
Dan Mihai	e8405590c1	ci: temporarily avoid using the Mariner Host image Disable the Mariner host during CI, while investigating test failures with new Cloud Hypervisor v43.0. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-10 20:15:09 +00:00
Steve Horsman	730e007abd	Merge pull request #11000 from microsoft/danmihai1/print-exec-output2 tests: k8s: log kubectl exec output	2025-03-10 09:31:41 +00:00
Fupan Li	df9c6ae9d7	Merge pull request #10998 from teawater/ma_config runtime-rs: Add mem-agent config to clh and qemu config file	2025-03-10 16:23:20 +08:00
Dan Mihai	509e6da965	tests: k8s-env.bats: log exec output Log the "kubectl exec" output, just in case it helps investigate sporadic test errors like: https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329321?pr=10973 not ok 1 Environment variables (in test file k8s-env.bats, line 37) `grep "HOST_IP=$[0-9]\+\(\.\\|$$\)\{4\}"' failed It appears that the first exec from this test case produced the expected output: MY_POD_NAME=test-env but the second exec produced something else - that will be logged after this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:37:20 +00:00
Dan Mihai	95d47e4d05	tests: k8s-configmap.bats: log exec output Log the "kubectl exec" output, just in case it helps investigate sporadic test errors like: https://github.com/kata-containers/kata-containers/actions/runs/13724022494/job/38387329268?pr=10973 not ok 1 ConfigMap for a pod (in test file k8s-configmap.bats, line 44) `kubectl exec $pod_name -- "${exec_command[@]}" \| grep "KUBE_CONFIG_2=value-2"' failed It appears that the first exec from this test case produced the expected output: KUBE_CONFIG_1=value-1 but the second exec produced something else - that will be logged after this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:35:45 +00:00
Dan Mihai	caee12c796	tests: k8s: add function to log exec output grep_pod_exec_output invokes "kubectl exec", logs its output, and checks that a grep pattern is present in the output. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-07 19:34:57 +00:00
Steve Horsman	014ff8476a	Merge pull request #10992 from microsoft/danmihai1/git-helper gha: always delete workspace on rebase error	2025-03-07 14:26:00 +00:00
Steve Horsman	cb682ef3c8	Merge pull request #10987 from RuoqingHe/enable-docker-on-riscv kata-deploy: Use docker.io for all architectures	2025-03-07 11:14:19 +00:00
Xuewei Niu	0671252466	Merge pull request #10760 from lifupan/route_flags_suport	2025-03-07 18:18:01 +08:00
Hui Zhu	691430ca95	runtime-rs: Add mem-agent config to clh and qemu config file Add mem-agent config to clh and qemu config file. Fixes: #10996 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-07 15:54:59 +08:00
Fupan Li	9a4c0a5c5c	agent: add the route flags support when adding routes Get the route entry's flags passed from host and set it in the add route request. Fixes: #7934 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	d929bc0224	agent: refactor the code of update routes/interfaces We can use the netlink update method to add a route or an interface address. There is no need to delete it first and then add it. This can save two system commissions. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	aad915a7a1	agent: upgrade the netlink related crates Upgrade rtnetlink and related crates to support route flags. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	0995c6528e	runtime-rs: add the route flags support Get the route entry's flags from the host and pass it into kata-agent to add route entries with flags support. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	cda6d0e36c	runtime-rs: upgrade the netlink related crates Upgrade netlink-packet-route and rtnetlink to support route flags. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Fupan Li	1ade2a874f	runtime: add the flags support to the route setting We should support the flags when adding a route from host to guest. Otherwise, setting some routes would fail. Fixes: #7934 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-07 09:56:08 +08:00
Dan Mihai	7b63f256e5	gha: fix git-helper issues reported by shellcheck ./tests/git-helper.sh:20:5: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] ./tests/git-helper.sh:22:26: note: Double quote to prevent globbing and word splitting. [SC2086] ./tests/git-helper.sh:23:7: note: Prefer [[ ]] over [ ] for tests in Bash/Ksh. [SC2292] Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-06 20:28:41 +00:00
Dan Mihai	04adcdace6	gha: always delete workspace on rebase error The workplace was already being deleted on non-x86_64 platforms, but x86_64 can be affected by the same problem too. That might have been the case with the SNP and TDX test runs from: https://github.com/kata-containers/kata-containers/actions/runs/13687511270/job/38313758751?pr=10973 https://github.com/kata-containers/kata-containers/actions/runs/13687511270/job/38313760086?pr=10973 Rebase worked fine for the same patch/PR on other platforms. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-03-06 20:24:09 +00:00
Ruoqing He	3a8131349e	kata-deploy: Use docker.io for all architectures Switch to `docker.io` provided by Ubuntu sources. It is not necessary for us to install docker through `get-docker.sh`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-07 02:22:31 +08:00
RuoqingHe	8ef8109b2f	Merge pull request #10985 from RuoqingHe/remove-s390x-conditional-compilation runtime-rs: Remove s390x conditional compilation	2025-03-06 23:13:11 +08:00
Pavel Mores	133528a63c	runtime-rs: remove snp_certs_path support SNP certs were apparently obsoleted by AMD. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-03-06 15:53:24 +01:00
stevenhorsman	a40d5d3daa	ci: Add arm64 K8s tests as required This is based on the request from @fidencio, who is one of the maintainers	2025-03-06 14:39:04 +00:00
stevenhorsman	f45b398170	ci: Add coco required tests Add the zvsi and nontee coco tests to the required jobs list Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-06 14:38:52 +00:00
stevenhorsman	ee0f0b7bfe	workflows: shellcheck: Expand vendor ignore - In the previous PR I only skipped the runtime/vendor directory, but errors are showing up in other vendor packages, so try a wildcard skip - Also update the job step so we can distinguish between the required and non-required versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-06 14:35:12 +00:00
Manuel Huber	c05b976ebe	runtime: upgrade grpc vendor dependency - remove hard link to v.1.47.0 in go.mod - run go mod tidy, go mod vendor to actually update to v1.58.3 - addresses CVE-2023-44487 Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2025-03-06 10:00:49 +00:00
Xuewei Niu	644af52968	Merge pull request #10876 from lifupan/fupan_containerd ci: cri-containerd: upgrade the LTS / Active versions for containerd	2025-03-06 17:08:40 +08:00
Hyounggyu Choi	bf41618a84	Merge pull request #10862 from BbolroC/enable-ibm-se-for-qemu-runtime-rs runtime-rs: Enable IBM SE for QEMU	2025-03-06 05:38:13 +01:00
Ruoqing He	ed6f57f8f6	runtime-rs: Restrict cloud-hypervisor feature Cloud-Hypervisor currently only supports `x86_64` and `aarch64`, so this feature should not be available even if other architectures explicitly require it. Restrict the `cloud-hypervisor` feature to only `x86_64` and `aarch64`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-06 11:21:57 +08:00
Ruoqing He	6f894450fe	runtime-rs: Drop s390x target predicates Drop `target_arch = "s390x"` all over `runtime-rs`, it is strange to have such predicates on features and code while we do not support it. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-06 11:20:28 +08:00
Xuewei Niu	a54eed6bab	Merge pull request #10975 from teawater/fix_log_level runtime-rs: Fix log_level's comments in configuration-dragonball.toml.in	2025-03-06 10:05:09 +08:00
Alex Lyn	2619b57411	Merge pull request #10937 from Apokleos/bugfix-useless-annotation kata-types: Fix bugs related to annotations in kata-types	2025-03-06 09:37:29 +08:00
Hyounggyu Choi	c3e3ef7b25	Merge pull request #10981 from BbolroC/remove-sclp-console-s390x runtime: Remove console=ttysclp0 for s390x	2025-03-05 21:43:57 +01:00
Fabiano Fidêncio	80e95bd264	Merge pull request #10966 from kata-containers/topic/tests-bring-back-kata-deploy-tests tests: Bring back kata-deploy tests	2025-03-05 21:11:21 +01:00
Zvonko Kaiser	ae63bbb824	Merge pull request #10982 from zvonkok/fix-zvonkos-fix agent: fix permission according to runc	2025-03-05 15:08:48 -05:00
Fabiano Fidêncio	545780a83a	shellcheck: tests: k8s: Fix gha-run.sh warnings As we'll touch this file during this series, let's already make sure we solve all the needed warnings. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	50f765b19c	shellcheck: tests: Fix gha-run-k8s-common.sh warnings Let's fix all the warnings caught in this file, as we're already touching it. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	219db60071	tests: kata-deploy: microk8s: Re-work installation So we can ensure that the user has enough permissions to access microk8s. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	c337a21a4e	shellcheck: kata-deploy: Fix warnings Here we're fixing the few warnings we found in the files present in the functional tests for kata-deploy. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	fd832d0feb	tests: kata-deploy: Run installation with only one VMM It doesn't make much sense to test different VMMs as that wouldn't trigger a different code path. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Fabiano Fidêncio	14bf653c35	tests: kata-deploy: Re-add tests, now using github runners As GitHub runners now support nested virt, we don't depend on garm for those anymore. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-03-05 19:44:27 +01:00
Zvonko Kaiser	3cea080185	agent: fix permission according to runc The previous PR mistakenly set all perms to 0o666; we should follow what runc does and fetch the permission from the guest aka host if the file_mode == 0. If we do not find the device on the guest aka host, fall back to 0. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-05 17:33:40 +00:00
Fupan Li	7024d3c600	CI: cri-containerd: upgrade the LTS / Active versions for containerd As we're testing against the LTS and the Active versions of containerd, let's upgrade the LTS version from 1.6 to 1.7 and the Active version from 1.7 to 2.0 to cover the sandboxapi tests. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-03-05 23:09:24 +08:00
Hyounggyu Choi	624f7bfe0b	runtime: Remove console=ttysclp0 for s390x After the introduction of the following kernel parameters (see #6163): ``` CONFIG_SCLP_VT220_TTY=y CONFIG_SCLP_VT220_CONSOLE=y ``` the system log for Kata components (e.g., the agent) no longer appeared on the SCLP console (i.e., /dev/ttysclp0). Let's switch to the default fallback console (likely /dev/console) for logging. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 15:06:08 +01:00
Zvonko Kaiser	a5629f9bfa	Merge pull request #10971 from zvonkok/host-guest-mapping agent: Enable VFIO and initContainers	2025-03-05 08:58:45 -05:00
Fabiano Fidêncio	504d9e2b66	Merge pull request #10976 from zvonkok/fix-dev-permissions agent: Fix default linux device permissions	2025-03-05 13:54:06 +01:00
Hyounggyu Choi	4ea7d274c4	runtime-rs: Add new runtimeClass qemu-se-runtime-rs When `KATA_HYPERVISOR` is set to `qemu-se-runtime-rs`, a configuration file is properly referenced and a runtime class should be created via kata-deploy. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 13:50:38 +01:00
Hyounggyu Choi	2c72cf5891	runtime-rs: Add SE configuration A configuration file, `configuration-qemu-se-runtime-rs.toml`, is referenced when the `qemu-se-runtime-rs` runtime is configured. This commit adds a template file and updates the Makefile configuration accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-05 13:50:38 +01:00
Hyounggyu Choi	65021caca6	Merge pull request #10963 from RuoqingHe/remove-arch-predicates-in-runtime-rs runtime-rs: Enable Dragonball only for x86_64 & aarch64	2025-03-05 09:10:33 +01:00
Zvonko Kaiser	c73ff7518e	agent: Fix default linux device permissions We had the default permissions set to 0o000 if the file_mode was not present; for most container devices this is the wrong default. Since those devices are also meant to be accessed by users and others, add a sane default of 0o666 to devices that do not have any permissions set. Otherwise only root can access those and we cannot run containers as a user. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-05 02:22:24 +00:00
Ruoqing He	186c88b1d5	ci: Move musl-tools installation into Setup rust `musl-tools` is only needed when a component needs `rust`, and the `instance` running is of `x86_64` or `aarch64`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-05 09:43:19 +08:00
Zvonko Kaiser	4bb0eb4590	Merge pull request #10954 from kata-containers/topic/metrics-kata-deploy Rework and fix metrics issues	2025-03-04 20:22:53 -05:00
Hui Zhu	c3c3f23b33	runtime-rs: Fix log_level's comments in configuration-dragonball.toml.in Add double quotes to fix log_level's comments in configuration-dragonball.toml.in. Fixes: #10974 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-03-05 09:21:08 +08:00
Dan Mihai	edf6af2a43	Merge pull request #10955 from microsoft/cameronbaird/hyp-loglevel-default-upstream runtime: Properly set default hyp loglevel to 1	2025-03-04 16:44:08 -08:00
Cameron Baird	d48116114e	runtime: Properly set default hyp loglevel to 1 Tweak default HypervisorLoglevel config option for clh to 1. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-03-04 20:36:40 +00:00
Zvonko Kaiser	248d04c20c	agent: Enable VFIO and initContainers We had a static mapping of host-guest PCI addresses, which prevented using VFIO devices in initContainers. We're now tracking the host-guest mapping per container and removing this mapping if a container is removed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-04 19:53:52 +00:00
Fabiano Fidêncio	874129a11f	Merge pull request #10958 from stevenhorsman/shell-check-errors-fix Shell check errors fix	2025-03-04 17:37:36 +01:00
stevenhorsman	02a2f6a9c1	tests: Sanitize `K8S_TEST_ENTRY` Now we've added the double quotes around `${K8S_TEST_UNION[@]}`, so platforms are failing with: ``` Error: Test file "/home/ubuntu/runner/_layout/_work/kata-containers/kata-containers/tests/integration/kubernetes/k8s-nginx-connectivity.bats " does not exist ``` due to the line continuation, so sanitise the value to try and fix this. Co-authored-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	e33ad56cf4	kernel: bump kata_config_version Bump kernel version as the build-kernel script was updated (even if there was no functional change). Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	2df3e5937a	ci/openshift-ci: Fix script error The space was missing before `]`, so fix this and also switch to double square brackets and variable braces Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	9a9e88a38d	test: vfio: Attempt to fix logic This was checking that a literal string was non-zero. I assume it instead wanted to check if the file exists Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	b220cca253	shellcheck: Fix shellcheck SC2066 > Since you double-quoted this, it will not word split, and the loop will only run once. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	b8cfdd06fb	shellcheck: Fix shellcheck SC2071 > > is for string comparisons. Use -gt instead. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	eb90b93e3f	shellcheck: Fix shellcheck SC2104 > In functions, use return instead of break. > rationale: break or continue are used to abort or continue a loop, and are not the right way to exit a function. Use return instead. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:10 +00:00
stevenhorsman	67bfd4793e	shellcheck: Fix shellcheck SC2242 > Can only exit with status 0-255. Other data should be written to stdout/stderr. Switch exit -1 to exit 1 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:39:01 +00:00
stevenhorsman	ed8347c868	shellcheck: Fix shellcheck SC2070 > -n doesn't work with unquoted arguments. Quote or use [[ ]] Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	dbba6b056b	shellcheck: Fix shellcheck SC2148 > Tips depend on target shell and yours is unknown. Add a shebang. Add ``` #!/usr/bin/env bash ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	c5ff513e0b	shellcheck: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	58672068ff	shellcheck: Fix shellcheck SC2145 > Argument mixes string and array. Use * or separate argument. - Swap echos for printfs and improve formatting - Replace $@ with $* - Split arrays into separate arguments Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	bc2d7d9e1e	osbuilder: Skip shellcheck on test_images.sh I'm not sure if we use test_images anywhere, so before we invest the time to fix the 120 shellcheck errors and warnings we should decide if we want to keep it. See #10957 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	fb1d4b571f	workflows: Add required shellcheck workflow Start with a required smaller set of shellchecks to try and prevent regressions whilst we fix the current problems Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
stevenhorsman	b3972df3ca	workflows: Shellcheck - ignore vendor Ignore the vendor directories in our shellcheck workflow as we can't fix them. If there is a way to set this in shellcheckrc that would be better, but it doesn't seem to be implemented yet. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-04 09:35:46 +00:00
Zvonko Kaiser	4df406f03c	Merge pull request #10965 from zvonkok/fix-init gpu: fix init symlinks	2025-03-03 14:46:41 -05:00
Zvonko Kaiser	eb2f75ee61	gpu: fix init symlinks With the recent changes we need to make sure NVRC is symlinked for init and sbin/init Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-03 17:21:59 +00:00
Greg Kurz	545022f295	Merge pull request #10817 from Jakob-Naucke/virtio-net-ccw Fix virtio-net-ccw	2025-03-03 17:37:46 +01:00
Hyounggyu Choi	e8aa5a5ab7	runtime-rs: Enable virtio-net-ccw for s390x When using `virtio-net-pci` for IBM SE, the following error occurs: ``` update interface: Link not found (Address: f2:21:48:25:f4:10) ``` On s390x, it is more appropriate to use the CCW type of virtio network device. This commit ensures that a subchannel is configured accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:34:03 +01:00
Hyounggyu Choi	59c1f0b59b	runtime-rs: Suppress kernel parameters for IBM SE For IBM SE, the following kernel parameters are not required: - Basic parameters (reboot and systemd-related) - Rootfs parameters This commit suppresses these parameters when IBM SE is configured. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:34:03 +01:00
Hyounggyu Choi	4c8e881a84	runtime-rs: Enable IBM SE support for QEMU This commit configures the command line for IBM Secure Execution (SE) and other TEEs. The following changes are made: - Add a new item `Se` to ProtectionDeviceConfig and handle it at sandbox - Introduce `add_se_protection_device()` for SE cmdline config - Bypass rootfs image/initrd validity checks when SE is configured. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-03-03 16:32:18 +01:00
Ruoqing He	2ecb2fe519	runtime-rs: Enable Dragonball for x86_64 & aarch64 `USE_BUILDIN_DB` is turned on by default even for architectures that do not support `Dragonball`, which leads to `s390x` building `runtime-rs` with `--features dragonball` present. Let's restrict `USE_BUILDIN_DB` to be enabled only for architectures supported by `Dragonball` (namely x86_64 and aarch64 as of now). Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-03-03 12:10:58 +08:00
stevenhorsman	c69509be1c	metrics: Reduce repeats for boot time tests on qemu On qemu the run seems to error after ~4-7 runs, so try a cut down version of repetitions to see if this helps us get results in a stable way. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:42:00 +00:00
stevenhorsman	0962cd95bc	metrics: Increase minpercent range for qemu iperf test We have a new metrics machine and environment and the iperf jitter result failed as it finished too quickly, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	ef0e8669fb	metrics: Increase minpercent range for clh tests We have a new metrics machine and environment and the fio write.bw and iperf3 parallel.Results tests failed for clh, as below the minimum range, so increase the minpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-02 08:32:26 +00:00
stevenhorsman	f81c85e73d	metrics: Increase maxpercent range for clh boot times We have a new metrics machine and environment and the boot time test failed for clh, so increase the maxpercent to try and get it stable Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	435ee86fdd	metrics: Update iperf affinity The iperf deployment is quite a lot out of date and uses `master` for its affinity and toleration, so update this to control-plane, so it can run on newer Kubernetes clusters Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	85bbc0e969	metrics: Increase wait time The new metrics runner seems slower, so the iperf3 tests are failing with: ``` pod rejected: RuntimeClass "kata" not found ``` so give more time for it to succeed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	4ce94c2d1b	Revert "metrics: Add init_env function to latency test" This reverts commit `9ac29b8d38`. to remove the duplicate `init_env` call Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	658a5e032b	metrics: Increase containerd start timeout - Move `kill_kata_components` from common.bash into the metrics code base as the only user of it - Increase the timeout on the start of containerd as the last 10 nightlies metric tests have failed with: ``` 223478 Killed sudo timeout -s SIGKILL "${TIMEOUT}" systemctl start containerd ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	3fab7944a3	workflows: Improve metrics jobs - As the metrics tests are largely independent then allow subsequent tests to run even if previous ones failed. The results might not be perfect if clean-up is required, but we can work on that later. - Move the test results check out of the latency test that seems arbitrary and into its own job step - Add timeouts to steps that might fail/hang if there are containerd/K8s issues Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
stevenhorsman	6f918d71f5	workflows: Update metrics jobs Currently the run-metrics job runs a manual install and does this in a separate job before the metrics tests run. This doesn't make sense as if we have multiple CI runs in parallel (like we often do), there is a high chance that the setup for another PR runs between the metrics setup and the runs, meaning it's not testing the correct version of code. We want to remove this from happening, so install (and delete to cleanup) kata as part of the metrics test jobs. Also switch to kata-deploy rather than manual install for simplicity and in order to test what we recommend to users. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-03-01 17:50:05 +00:00
Zvonko Kaiser	3f13023f5f	Merge pull request #10870 from zvonkok/module-signing gpu: add module signing	2025-03-01 09:51:24 -05:00
Zvonko Kaiser	d971e13446	gpu: Update rootfs.sh Only source NV scripts if variant starts with "nvidia-gpu" Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-03-01 02:08:29 +00:00
Fabiano Fidêncio	4018079b55	Merge pull request #10960 from fidencio/topic/kata-deploy-fix-k0s-deployment kata-deploy: k0s: Fix drop-in path	2025-02-28 18:49:46 +01:00
Zvonko Kaiser	94579517d4	shellcheck: Update nvidia_rootfs.sh With the new rules we need more updates. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 16:36:05 +00:00
Zvonko Kaiser	af1d6c2407	shellcheck: Update nvidia_chroot.sh Make shellcheck happy; with the new rules, new updates are needed Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 16:27:51 +00:00
Fabiano Fidêncio	c95f9885ea	kata-deploy: k0s: Fix drop-in path The drop-in path should be /etc/containerd (from the containers' perspective), which mounts to the host path /etc/k0s/containerd.d. With what we had we ended up dropping the file under the /etc/k0s/containerd.d/containerd.d/, which is wrong. This is a regression introduced by: `94b3348d3c` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-28 16:32:00 +01:00
Zvonko Kaiser	c4e4e14b32	kernel: bump kata_config_version Mandatory update to have a unique kernel version name Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 15:18:15 +00:00
Fabiano Fidêncio	d13be49f9b	Merge pull request #10846 from stalb/feature/microk8s-support kata-deploy: Update kata-deploy to support microk8s	2025-02-28 13:57:44 +01:00
Stephane Talbot	f80e7370d5	test: Verify deployment of kata-deploy on microk8s Enable functional test to verify deployment of kata-deploy on a Microk8s cluster Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>	2025-02-28 10:10:29 +01:00
Stéphane Talbot	f2ba224e6c	kata-deploy: Update kata-deploy to support microk8s Change kata-deploy script and Helm chart in order to be able to use kata-deploy on a microk8s cluster deployed with snap. Fixes: #10830 Signed-off-by: Stephane Talbot <Stephane.Talbot@univ-savoie.fr>	2025-02-28 10:10:29 +01:00
Ruoqing He	09030ee96e	ci: Refactor build-checks workflow Refactor the matrix setup and the corresponding dependency installation logic in `build-checks.yaml` and `build-checks-preview-riscv64.yaml` to provide better readability and maintainability. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-28 09:47:25 +08:00
Ruoqing He	eb94700590	ci: Drop install-libseccomp matrix variant `install-libseccomp` is applied only for `agent` component, and we are already combining matrix with `if`s in steps, drop `install-libseccomp` in matrix to reduce complexity. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-28 09:44:53 +08:00
Zvonko Kaiser	4dadd07699	gpu: Update rootfs.sh Pass-through KBUILD_SIGN_PIN to the rootfs build Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	5ab3192c51	gpu: Update nvidia_rootfs.sh We need to handle KBUILD_SIGN_PIN so that the kbuild can decrypt the signing key Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	493ba63c77	gpu: Provide KBUILD_SIGN_PIN to the build.sh At the proper step pass-through the var KBUILD_SIGN_PIN so that the kernel_headers step has the PIN for encrypting the signing key. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	0309b70522	gpu: Pass-through KBUILD_SIGN_PIN In kata-deploy-binaries.sh we need to pass-through the var KBUILD_SIGN_PIN to the other static builder scripts. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	9602ba6ccc	gpu: Add proper KBUILD_SIGN_PIN to entry script Update kata-deploy-binaries-in-docker.sh to read the env variable KBUILD_SIGN_PIN that either can be set via GHA or other means. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	39d3b7fb90	gpu: Update NVIDIA chroot script We need to place the signing key and cert at the right place and hide the KBUILD_SIGN_PIN from echo'ing or xtrace Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	d815fb6f46	gpu: Update kernel-headers Use the kernel-headers as the extra_tarball to move the encrypted key and cert from stage to stage Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	c2cb89532b	gpu: Add the proper handling in build-kernel.sh If KBUILD_SIGN_PIN is provided we can encrypt the signing key for out-of-tree builds and second round jobs in GHA Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:35 +00:00
Zvonko Kaiser	bc8360e8a9	gpu: Add proper config for module signing We want to enable module signing in Kata and Coco Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-28 01:31:34 +00:00
Zvonko Kaiser	f485e52f75	Merge pull request #10953 from zvonkok/shellcheckrc ci: Add shellcheckrc	2025-02-27 13:35:23 -05:00
Fabiano Fidêncio	96ed706d20	Merge pull request #10950 from fidencio/topic/skip-arm-check-tests-that-depend-on-virt ci: arm64: Skip tests that depend on virt on non-virt capable runners	2025-02-27 18:26:32 +01:00
Zvonko Kaiser	abfbc0ab60	ci: Add shellcheckrc Let's have common rules over all shell files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-27 17:11:24 +00:00
Zvonko Kaiser	33460386b9	Merge pull request #10803 from ryansavino/update-confidential-initrd-22.04 versions: update confidential initrd to 22.04	2025-02-27 09:29:36 -05:00
Fabiano Fidêncio	e18e1ec3a8	ci: arm64: Skip tests that depend on virt on non-virt capable runners The GitHub hosted runners for ARM64 do not provide virtualisation support, thus we're just skipping the tests as those would check whether or not the system is "VMContainerCapable". Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-27 14:43:21 +01:00
Wainer Moschetta	5fda6b69e8	Merge pull request #10883 from stevenhorsman/k0s-version-pinning ci: k8s: Pin k0s version to get cri-o tests back working	2025-02-27 10:11:59 -03:00
Steve Horsman	f3c22411fc	Merge pull request #10930 from stevenhorsman/codeql-config workflows: Add codeql config	2025-02-27 12:43:41 +00:00
stevenhorsman	d08787774f	ci: k8s: Use pinned k0s version Update the code to install the version of k0s that we have in our versions.yaml, rather than just installing the latest, to help our CI be more stable and less prone to breaking due to things we don't control. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-27 11:33:23 +00:00
stevenhorsman	3fe35c1594	version: Add k0s version Add external versions support for k0s and initially pin it at v1.31.5 as our cri-o tests started failing when v1.32 became the latest Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-27 11:33:23 +00:00
Fabiano Fidêncio	6e236fd44c	Merge pull request #10652 from burgerdev/sysctls genpolicy: support sysctls from PodSpec and environment defaults	2025-02-27 08:25:14 +01:00
Dan Mihai	cb382e1367	Merge pull request #10925 from katexochen/p/fail-on-layer-pull genpolicy: fail when layer can't be processed	2025-02-26 13:28:38 -08:00
Ryan Savino	ceafa82f2e	tests: skip trusted storage tests for qemu-snp skip tests for trusted storage until #10838 is resolved. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-02-26 14:23:57 -06:00
Ryan Savino	a00a7c500a	build: initrd rootfs init symlink directly to systemd when no AGENT_INIT In some cases, /init is not following two levels of symlinks i.e. /init to /sbin/init to /lib/systemd/systemd Setting /init directly to /lib/systemd/systemd when AGENT_INIT is not mandated Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-02-26 14:23:56 -06:00
Markus Rudy	70709455ef	genpolicy: support sysctl settings Sysctls may be added to a container by the Kubernetes pod definition or by containerd configuration. This commit adds support for the corresponding PodSecurityContext field and an option to specify environment-dependent sysctls in the settings file. The sysctls requested in a CreateContainerRequest are checked against the sysctls in the pod definition, or if not defined there in the defaults in genpolicy-settings.json. There is no check for the presence of expected sysctls, though, because Kubernetes might legitimately omit unsafe sysctls itself and because default sysctls might not apply to all containers. Fixes: #10064 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 18:56:17 +01:00
Steve Horsman	5aa89bc1d7	Merge pull request #10831 from RuoqingHe/ci-riscv64 ci: Enable partial components build-check on riscv	2025-02-26 17:50:47 +00:00
Fabiano Fidêncio	9d8026b4e5	Merge pull request #10654 from burgerdev/cronjob genpolicy: add get_process_fields to CronJob	2025-02-26 15:13:40 +01:00
Fabiano Fidêncio	7b16df64c9	Merge pull request #10935 from burgerdev/error-messages runtime: add cause to CDI errors	2025-02-26 14:01:22 +01:00
Jakob Naucke	c146980bcd	agent: Handle virtio-net-ccw devices separately On s390x, a virtio-net device will use the CCW bus instead of PCI, which impacts how its uevent should be handled. Take the respective path accordingly. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	a084b99324	virtcontainers: Separate PCI/CCW for net devices On s390x, virtio-net devices should use CCW, alongside a different device path. Use accordingly. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	2aa523f08a	virtcontainers: Fix virtio-net-ccw address format Hex device number was formatted as hex twice, thus encoding the string as hex. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:43 +01:00
Jakob Naucke	2a992c4080	virtcontainers: Add CCW device to endpoint To support virtio-net-ccw for s390x, add CCW devices to the Endpoint interface. Add respective fields and functions to implementing structs. Device paths may be empty. PciPath resolves this by being a list that may be empty, but this design does not map to CcwDevice. Use a pointer instead. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
Jakob Naucke	b325069d72	agent: Update QEMU URL Readthedocs URL was outdated. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
Jakob Naucke	9935f9ea7e	proto: Rename Interface.pciPath to devicePath Field is being used for both PCI and CCW devices. Name it devicePath to avoid confusion when the device isn't a PCI device. Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>	2025-02-26 11:36:42 +01:00
alex.lyn	a338af3f18	kata-types: Fix bugs related to annotations in kata-types It will address two issues: (1) expected `,`: --> /root/kata-containers/src/libs/kata-types/tests/test_config.rs:15:9 \| 14 \| KATA_ANNO_CFG_HYPERVISOR_ENABLE_IO_THREADS \| - \| \| \| expected one of `,`, `::`, `as`, or `}` \| help: missing `,` 15 \| KATA_ANNO_CFG_HYPERVISOR_FILE_BACKED_MEM_ROOT_DIR, \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unexpected token (2) remove useless annotation `KATA_ANNO_CFG_HYPERVISOR_CTLPATH`. Fixes #10936 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2025-02-26 17:48:11 +08:00
Fabiano Fidêncio	47a5439a20	Merge pull request #10934 from fidencio/topic/agent-unbreak-non-guest-pull-build agent: Fix non-guest-pull build	2025-02-26 09:45:22 +01:00
Pavel Mores	c5e560e2d1	runtime-rs: handle ProtectionDevice in resource manager and sandbox As part of device preparation in Sandbox we check available protection and create a corresponding ProtectionDeviceConfig if appropriate. The resource-side handling is trivial. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	eb47f15b10	runtime-rs: support ProtectionDevice in qemu-rs As an example, or a test case, we add some implementation of SEV/SEV-SNP. Within the QEMU command line generation, the 'Cpu' object is extended to accommodate the EPYC-v4 CPU type for SEV-SNP. 'Machine' is extended to support the confidential-guest-support parameter which is useful for other TEEs as well. Support for emitting the -bios command line switch is added as that seems to be the preferred way of supplying a path to firmware for SEV/SEV-SNP. Support for emitting '-object sev-guest' and '-object sev-snp-guest' with an appropriate set of parameters is added as well. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	87deb68ab7	runtime-rs: add implementation of ProtectionDevice ProtectionDevice is a new device type whose implementation structure matches the one of other devices in the device module. It is split into an inner "config" part which contains device details (we implement SEV/SEV-SNP for now) and the customary outer "device" part which just adds a device instance ID and the customary Device trait implementation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	a3f973db3b	runtime-rs: extend SEV/SEV-SNP detection by including a details struct This matches the existing TDX handling where additional details are retrieved right away after TDX is detected. Note that the actual details (cbitpos) acquisition is NOT included at this time. This change might seem bigger than it is. The change itself is just in protection.rs, the rest are corresponding adjustments. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Pavel Mores	c549d12da7	runtime-rs: parse SEV-SNP related config file settings The 'sev_snp_guest' default value of 'false' is in compliance with the golang runtime behaviour. Signed-off-by: Pavel Mores <pmores@redhat.com>	2025-02-26 09:11:35 +01:00
Markus Rudy	d58f38dfab	genpolicy: add get_process_fields to CronJob This function was accidentally left unimplemented for CronJob, resulting in runAsUser not being supported there. Fixes: #10653 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 09:00:04 +01:00
Ruoqing He	ec020399b9	ci: Enable partial components build-check on riscv Since we have RISC-V builders available now, let's start with `agent-ctl`, `trace-forwarder` and `genpolicy` components to run build-checks on these `riscv-builder`s, and gradually add the remaining components when they are ready, to catch up with other architectures eventually. This workflow can be manually triggered; `riscv-builder` will be the default instance when that is the case. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 15:38:39 +08:00
Markus Rudy	1f6833bd0d	runtime: add cause to CDI errors Adding devices by CDI annotation can fail for a variety of reasons. If that happens, it's helpful to know the root cause of the issue (CDI spec missing, malformatted, requested device not present, etc.). This commit adds the root cause of the CDI device addition to the errors reported back to the caller. Since this error is bubbled up all the way back to the shimv2 task.Create handler, it will be visible in Kubernetes logs and enable fixing the root cause. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-26 08:36:15 +01:00
Paul Meyer	9981cdd8a8	genpolicy: fail when layer can't be processed Currently, if a layer can't be processed, we log this as a warning and continue execution, finally exiting with a zero exit code. This can lead to the generation of invalid policies. One reason a layer might not be processed is that the pull of that layer fails. We need all layers to be processed successfully to generate a valid policy, as otherwise we will miss the verity hash for that layer or we might miss the USER information from a passwd stored in that layer. This will cause our VM to not get through the agent's policy validation. Returning an error instead of printing a warning will cause genpolicy to fail in such cases. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-02-26 08:30:59 +01:00
Fabiano Fidêncio	b3b570e4c4	agent: Fix non-guest-pull build As the guest-pull is a very Confidential Containers specific feature, let's make sure we, at least, don't break folks who decide to build Kata Containers' agent without having this feature enabled (for instance, for the sake of the agent size). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-25 21:48:41 +01:00
Zvonko Kaiser	04c56a0aaf	Merge pull request #10931 from zvonkok/iommufd-fix gpu: IOMMUFD fix	2025-02-25 12:50:24 -05:00
Ruoqing He	ed50e31625	build: Reorganize target selection Architectures here with `musl` available are a minority, which makes them more suitable for enumeration. With this change, we are implicitly choosing gnu target for `ppc64le`, `riscv64` and `s390x`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Ruoqing He	562911e170	build: Add riscv mapping for common.bash While installing Rust and Golang in our CI workflow, `arch_to_golang` and `arch_to_rust` are needed for inferring the correct arch string for riscv64 architecture. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Ruoqing He	62e2473c32	build: Add riscv64 to utils.mk Since `ARCH` for `riscv64` is `riscv64gc`, we'll need to override it in `utils.mk`, and forcing `gnu` target for `riscv64` because `musl` target is not yet made ready. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-26 00:56:54 +08:00
Zvonko Kaiser	804e5cd332	gpu: IOMMUFD provide proper ID We need a proper ID otherwise QEMU sometimes fails with invalid ID. Use the same pattern as with the old VFIO implementation. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-25 16:24:17 +00:00
stevenhorsman	c97e9e1592	workflows: Add codeql config I noticed that CodeQL using the default config hasn't scanned since May 2024, so figured it would be worth trying an explicit configuration to see if that gets better results. It's mostly the template, but updated to be more relevant: - Only scan PRs and pushes to the `main` branch - Set a pinned runner version rather than latest (with mac support) - Edit the list of languages to be scanned to be more relevant for kata-containers Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-25 15:05:43 +00:00
Fabiano Fidêncio	e09ae2cc0b	Merge pull request #10921 from RuoqingHe/drop-redundant-override build: Drop redundant ARCH override	2025-02-25 14:54:36 +01:00
Fabiano Fidêncio	c01e7f1ed5	Merge pull request #10932 from kata-containers/topic/consolidate-publish-workflow workflows: Refactor publish workflows	2025-02-25 14:50:40 +01:00
stevenhorsman	5000fca664	workflows: Add build-checks to manual CI Currently the ci-on-push workflow that runs on PRs runs two jobs: gatekeeper-skipper.yaml and ci.yaml. In order to test things like for the error ``` too many workflows are referenced, total: 21, limit: 20 ``` on topic branches, we need ci-devel.yaml to have an extra workflow to match ci-on-push, so add the build-checks as this is helpful to run on topic branches anyway. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-25 11:38:49 +00:00
stevenhorsman	23434791f2	workflows: Refactor publish workflows Replace the four different publish workflows with a single one that takes input parameters of the arch and runner, to reduce the amount of duplicated code and try to avoid the ``` too many workflows are referenced, total: 21, limit: 20 ``` error	2025-02-25 10:49:09 +00:00
Fabiano Fidêncio	e3eb9e4f28	Merge pull request #10929 from kata-containers/topic/enable-arm-tests arm: ci: k8s: Enable CI	2025-02-24 19:34:28 +01:00
Fabiano Fidêncio	a6186b6244	ci: k8s: arm: Skip "Check the number vcpus are ..." test See https://github.com/kata-containers/kata-containers/issues/10928 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	1798804c32	ci: k8s: arm: Skip "Pod quota" test See https://github.com/kata-containers/kata-containers/issues/10927 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	053827cacc	ci: k8s: arm: Skip "Running within memory constraints" test See https://github.com/kata-containers/kata-containers/issues/10926 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-24 18:43:24 +01:00
Fabiano Fidêncio	7bd444fa52	ci: Run k8s tests on arm64 Let's take advantage of the current arm64 runners, and make sure we have those tests running there as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Kevin Zhao <kevin.zhao@linaro.org>	2025-02-24 18:43:20 +01:00
Aurélien Bombo	16aa6b9b4b	Merge pull request #10911 from kata-containers/sprt/fix-cgroup-race agent: Fix race condition with cgroup watchers	2025-02-24 10:28:58 -06:00
Ruoqing He	265a751837	build: Drop redundant ARCH override There are many `override ARCH = powerpc64le` after where `utils.mk` is included, which are redundant. Drop those redundant `override`s. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-24 22:04:28 +08:00
Fabiano Fidêncio	aa30f9ab1f	versions: Use jammy for x86_64 confidential initrd Set confidential initrd to use jammy rootfs Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2025-02-22 23:57:16 -06:00
Aurélien Bombo	adca339c3c	ci: Fix GH throttling in run-nerdctl-tests Specify a GH API token to avoid the below throttling error: https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585810679?pr=10911#step:4:96 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	111803e168	runtime: cgroups: Remove commented out code Doesn't seem like we're going to use this and it's confusing when inspecting code. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	1f8c15fa48	Revert "tests: Skip k8s job test on qemu-coco-dev" This reverts commit `a8ccd9a2ac`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	7542dbffb8	Revert "tests: disable k8s-policy-job.bats on coco-dev" This reverts commit `47ce5dad9d`. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:17 -06:00
Aurélien Bombo	a1ed923740	agent: Fix race condition with cgroup watchers In the CI, test containers intermittently fail to start after creation, with an error like below (see #10872 for more details): # State: Terminated # Reason: StartError # Message: failed to start containerd task "afd43e77fae0815afbc7205eac78f94859e247968a6a4e8bcbb987690fcf10a6": No such file or directory (os error 2) I've observed this error to repro with the following containers, which have in common that they're all very short-lived by design (more tests might be affected): * k8s-job.bats * k8s-seccomp.bats * k8s-hostname.bats * k8s-policy-job.bats * k8s-policy-logs.bats Furthermore, appending a `; sleep 1` to the command line for those containers seemed to consistently get rid of the error. Investigating further, I've uncovered a race between the end of the container process and the setting up of the cgroup watchers (to report OOMs). If the process terminates first, the agent will try to watch cgroup paths that don't exist anymore, and it will fail to start the container. The added error context in notifier.rs confirms that the error comes from the missing cgroup: https://github.com/kata-containers/kata-containers/actions/runs/13450787436/job/37585901466#step:17:6536 The fix simply consists in creating the watchers before we start the container but still after we create it -- this is non-blocking, and IIUC the cgroup is guaranteed to already be present then. Fixes: #10872 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-21 17:52:11 -06:00
Fabiano Fidêncio	aaa7008cad	versions: Add a comment about "jammy" being 22.04 I missed that when I added the other comments, so, for the sake of consistency, let's just add it there as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-21 16:02:38 -06:00
Fabiano Fidêncio	a7d33cc0cb	build: Ensure MEASURED_ROOTFS is only used for images We never ever tested MEASURED_ROOTFS with initrd, and I sincerely do not know why we've been setting that to "yes" in the initrd cases. Let's drop it, as it may be causing issues with the jobs that rely on the rootfs-initrd-confidential. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-21 15:32:20 -06:00
Dan Mihai	b90c537f79	Merge pull request #10881 from mythi/build-fixes minor build fixes	2025-02-21 09:54:55 -08:00
Jeremi Piotrowski	304978ad47	Merge pull request #10784 from arvindskumar99/disable_nesting_checks Disabling Nesting Check for SNP upstream	2025-02-21 12:39:18 +01:00
Xuewei Niu	cdb29a4fd1	Merge pull request #10780 from RuoqingHe/setup-dragonball-workspace dragonball: Appease clippy, setup workspace and centralize RustVMM	2025-02-21 14:04:19 +08:00
Hyounggyu Choi	58647bb654	Merge pull request #10743 from zvonkok/iommufd-gpu-fix IOMMUFD GPU enhancement	2025-02-20 23:43:00 +01:00
Zvonko Kaiser	7cca2c4925	gpu: Use a dedicated VFIO group vs iommufd entry We do not want to abuse the sysfsentry lets use a dedicated devfsentry. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-20 18:27:52 +00:00
Zvonko Kaiser	9add633258	qemu: Add command line for IOMMUFD For each IOMMUFD device create an object and assign it to the device, we need additional information that is populated now correctly to decide if we run the old VFIO or new VFIO backend. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-20 18:27:50 +00:00
Fabiano Fidêncio	19a7f27736	Merge pull request #10906 from BbolroC/remove-measured-rootfs-check-for-shimv2-on-s390x shim-v2: Remove MEASURED_ROOTFS assignment for s390x	2025-02-20 15:53:50 +01:00
arvindskumar99	c0a3ecb27b	config: Disabling nesting check for SNP Adding disable_nesting_checks to accommodate SNP on Azure Signed-off-by: arvindskumar99 <arvinkum@amd.com>	2025-02-20 12:24:08 +01:00
Hyounggyu Choi	1a9dabd433	shim-v2: Remove MEASURED_ROOTFS assignment for s390x As a follow-up for #10904, we do not need to set MEASURED_ROOTFS to no on s390x explicitly. The GHA workflow already exports this variable. This commit removes the redundant assignment. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-20 10:43:36 +01:00
Greg Kurz	f51d84b466	Merge pull request #10904 from BbolroC/turn-off-measured-rootfs-s390x-gha-workflows GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x	2025-02-20 10:24:23 +01:00
Aurélien Bombo	601c403603	Merge pull request #10818 from burgerdev/plumbing agent: clear log pipes if denied by policy	2025-02-19 16:28:58 -06:00
Aurélien Bombo	cb3467535c	tests: Add policy test for ReadStreamRequest This test verifies that, when ReadStreamRequest is blocked by the policy, the logs are empty and the container does not deadlock. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-02-19 14:03:41 -06:00
Hyounggyu Choi	ca40462a1c	Merge pull request #10903 from BbolroC/fixes-for-cri-containerd-on-ubuntu24 tests: Support systemd unit files in /usr/lib as well as /lib	2025-02-19 19:45:55 +01:00
Hyounggyu Choi	d973d41efb	GHA: Turn off MEASURED_ROOTFS in build-kata-static-tarball-s390x This is the first attempt to remove the following code: ``` if [ "${ARCH}" == "s390x" ]; then export MEASURED_ROOTFS=no fi ``` from install_shimv2() in kata-deploy-binaries.sh. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-19 18:19:19 +01:00
Zvonko Kaiser	238db32126	Merge pull request #10868 from zvonkok/qemu-tdx-experimental-workflow QEMU TDX experimental workflow	2025-02-19 10:09:27 -05:00
Zvonko Kaiser	f0eef73a89	gpu: Add no_patches.txt for TDX flavour As always, if we do not have any patches, create the no_patches.txt for the specific tag gpu_tdx_... Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-19 14:59:04 +00:00
Zvonko Kaiser	ca4d227562	gpu: Add qemu-tdx-experimental build We need to introduce again the qemu-tdx build for the GPU Depends-on: #10867 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-19 14:48:56 +00:00
Hyounggyu Choi	a8363c28ca	tests: Support systemd unit files in /usr/lib as well as /lib On Ubuntu 24.04, due to the /usr merge, system-provided unit files now reside in `/usr/lib/systemd/system/` instead of `/lib/systemd/system/`. For example, the command below now returns a different path: ``` $ systemctl show containerd.service -p FragmentPath /usr/lib/systemd/system/containerd.service ``` Previously, on Ubuntu 22.04 and earlier, it returned: ``` /lib/systemd/system/containerd.service ``` The current pattern `if [[ $unit_file == /lib* ]]` fails to match the new path. To ensure compatibility across versions, we update the pattern to match both `/lib` and `/usr/lib` like: ``` if [[ $unit_file =~ ^/(usr/)?lib/ ]] ``` Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-19 14:34:59 +01:00
Zvonko Kaiser	0d786577c6	Merge pull request #10867 from zvonkok/qemu-snp-tdx-experimental gpu: QEMU SNP+TDX experimental updates	2025-02-19 08:26:37 -05:00
Ruoqing He	a8a096b20c	dragonball: Centralize RustVMM crates Centralize all RustVMM crates to workspace.dependencies to prevent having multiple versions of each RustVMM crate, which is error-prone and inconsistent. With this setup, updates on RustVMM crates would be much easier. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	b129972e12	dragonball: Setup workspace Setup workspace in dragonball, move `dbs` crates one level up to be managed as members of dragonball workspace. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	a174e2be03	dragonball: Appease clippy introduced by 1.80.0 New clippy warnings show up after the Rust toolchain was bumped from 1.75.0 to 1.80.0; fix accordingly. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Ruoqing He	6bb193bbc0	spell: Update dictionary for dbs crates Add entries for dbs_* crates' README.md to pass `kata-spell-check.sh` spell checking. Changed British terms to American terms in README of `dbs_pci` to pass `hunspell` check. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-19 21:20:30 +08:00
Zvonko Kaiser	73b7a3478c	Merge pull request #10893 from RuoqingHe/fix-static-check ci: Fix spell_check and improve header_check	2025-02-19 08:08:40 -05:00
Mikko Ylinen	926119040c	packaging: make install_oras.sh to run curl without sudo sudo hides the environment variables that are sometimes useful with the builds (for example: proxy settings). While install_oras.sh could run completely without sudo in the container it's COPY'd to, make minimal changes to it to keep it functional outside the container too while still addressing the problem of 'sudo curl' not working with proxy env variables. Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-02-19 09:34:13 +02:00
Mikko Ylinen	0d8242aee4	agent: rename cargo config To mitigate: warning: `.../kata-containers/src/agent/.cargo/config` is deprecated in favor of `config.toml` note: if you need to support cargo 1.38 or earlier, you can symlink `config` to `config.toml` Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>	2025-02-19 09:34:13 +02:00
Fabiano Fidêncio	c8db24468c	Merge pull request #10894 from BbolroC/use-multi-arch-for-qemu-sample example: Use multi-arch image for test-deploy-kata-qemu.yaml	2025-02-18 23:43:52 +01:00
Dan Mihai	672462e6b8	Merge pull request #10895 from katexochen/p/agent-deps agent: make policy feature optional again	2025-02-18 13:27:23 -08:00
Dan Mihai	6b389fdd4f	Merge pull request #10896 from katexochen/p/oci-client-genplicy genpolicy: bump oci-distribution to v0.12.0	2025-02-18 12:42:23 -08:00
Markus Rudy	67fbad5f37	genpolicy: bump oci-distribution to v0.12.0 This picks up a security fix for confidential pulling of unsigned images. The crate moved permanently to oci-client, which required a few import changes. Co-authored-by: Paul Meyer <katexochen0@gmail.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-18 16:32:00 +01:00
Ruoqing He	d23284a0dc	header_check: Check header for changed text files We are running `header_check` for non-text files like binary files, symbolic link files, image files (pictures) and etc., which does not make sense. Filter out non-text files and run `header_check` only for text files changed. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-18 22:39:53 +08:00
Paul Meyer	80af09aae9	agent: make policy feature optional again This was messed up a little when factoring out the policy crate. Removing the dependencies no longer used by the agent and making the import of kata-agent-policy optional again. Signed-off-by: Paul Meyer <katexochen0@gmail.com>	2025-02-18 15:28:06 +01:00
Hyounggyu Choi	4646058c0c	example: Use multi-arch image for test-deploy-kata-qemu.yaml An image `registry.k8s.io/hpa-example` only supports amd64. Let's use a multi-arch image `quay.io/prometheus/prometheus` for the QEMU example instead. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-18 14:23:09 +01:00
Ruoqing He	7e49e83779	spell: Add missing entries for kata-spell-check `kata-dictionary.dic` changes after running `kata-spell-check.sh make-dict`. This is because someone updated `kata-dictionary.dic` directly instead of first updating the entries in data and then running `make-dict`. Add missing entries to data and re-run `make-dict` to generate the correct `kata-dictionary.dic`. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-18 19:06:34 +08:00
Lukáš Doktor	d0ef78d3a4	ci: Change the way we modify runtimeclass in webhook previously we used to deploy the webhook and then modified the cm from our ci/openshift-ci/ script to the desired value, but sometimes it happens that the webhook pod starts before we modify the cm and keeps using the default value. Let's change the approach and modify the deployments in-place. The only cons is it leaves the git dirty, but since this script is only supposed to be used in ci it should be safe. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2025-02-18 11:39:22 +01:00
Anastassios Nanos	1e6cea24c8	Merge pull request #10890 from zvonkok/arm64-fix-release release: Remove artifacts for release	2025-02-17 22:29:23 +02:00
Zvonko Kaiser	1d9915147d	release: Remove artifacts for release We need to make sure the release does not have any residual binaries left for the release payload Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-17 20:16:48 +00:00
Anastassios Nanos	ae1be28ddd	Merge pull request #10880 from nubificus/3.14.0-release release: Bump version to 3.14.0	2025-02-17 20:25:30 +02:00
Zvonko Kaiser	72833cb00b	Merge pull request #10878 from zvonkok/agent_cdi_timeout gpu: agent cdi timeout	2025-02-17 12:49:51 -05:00
Zvonko Kaiser	fda095a4c9	Merge pull request #10786 from zvonkok/gpu-config-update gpu: Update config files	2025-02-17 12:45:54 -05:00
Anastassios Nanos	c7347cb76d	release: Bump version to 3.14.0 Bump VERSION and helm-chart versions Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2025-02-17 16:47:24 +00:00
Fabiano Fidêncio	639bc84329	Merge pull request #10787 from fidencio/topic/bump-kernel-to-6.12.11 version: Bump kernel to 6.12.13	2025-02-17 17:39:14 +01:00
Fabiano Fidêncio	7ae5fa463e	versions: Bump coco-guest-components So attestation-agent and others have a version including the ttrpc bump to v0.8.4, allowing us to use the latest LTS kernel. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 15:16:54 +01:00
Fabiano Fidêncio	1381cab6f0	build: Fix rootfs cache logic We've been appending to the wrong variable for quite some time, it seems, leading to not actually regenerating the rootfs when needed. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 13:55:36 +01:00
Fabiano Fidêncio	7fc7328bbc	versions: Bump kernel to 6.12.13 Let's try to keep up with the LTS patch releases. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-17 13:47:35 +01:00
Simon Kaegi	f5edbfd696	kernel: support loop device in v6.8+ kernels Set CONFIG_BLK_DEV_WRITE_MOUNTED=y to restore previous kernel behaviour. Kernel v6.8+ will by default block buffer writes to block devices mounted by filesystems. This unfortunately is what we need to use mounted loop devices needed by some teams to build OSIs and as an overlay backing store. More info on this config item [here](https://cateee.net/lkddb/web-lkddb/BLK_DEV_WRITE_MOUNTED.html) Fixes: #10808 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2025-02-17 13:47:35 +01:00
Fabiano Fidêncio	d96e8375c4	Merge pull request #10885 from stevenhorsman/bump-agent-crates-to-resolve-CVEs agent: Bump agent crates to resolve CVEs	2025-02-17 12:11:43 +01:00
stevenhorsman	e5a284474d	deps: Update cookie-store & publicsuffix Run: ``` cargo update -p cookie-store cargo update -p publicsuffix ``` to update the version of idna and resolve CVE-2024-12224 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
stevenhorsman	5656fc6139	deps: Bump reqwest Bump reqwest to 0.12.12 to pick up fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
stevenhorsman	3a3849efff	deps: Update quinn-proto Update quinn-proto to fix CVE-2024-45311 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-14 17:30:03 +00:00
Fabiano Fidêncio	64ceb0832a	Merge pull request #10851 from fidencio/topic/bump-image-rs-to-bring-in-ttrpc-0.8.4 agent: Bump image-rs to 514c561d93	2025-02-14 18:21:56 +01:00
Fabiano Fidêncio	d5878437a4	Merge pull request #10845 from DataDog/dind-subcgroup-fix Add process to init subcgroup when we're using dind with cgroups v2	2025-02-14 18:12:24 +01:00
Steve Horsman	469c651fc0	Merge pull request #10879 from nubificus/fix_version packaging(release): Properly handle version tag for the release bundle	2025-02-14 14:40:37 +00:00
Zvonko Kaiser	908aacfa78	gpu: Update the logging around CDI Removed a rogue printf and updated the logging to say that we're waiting for CDI spec(s) to be generated rather than saying there is an error. It is not an error yet: we have a timeout, and only after that is it an error. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:32:00 +00:00
Zvonko Kaiser	4bda16565b	gpu: Update timeouts With the create_container_timeout the dial_timeout is less important. Add the custom timeout for GPUs in create_container_timeout Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Zvonko Kaiser	66ccc25724	tdx: Update GPU config for the latest TDX stack We need extra kernel_params for TDX Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Zvonko Kaiser	d4dd87a974	gpu: Update config files With the recent changes to cgroupsv1 and AGENT_INIT=no we need to update the config files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-14 14:29:18 +00:00
Anastassios Nanos	b13db29aaa	packaging(release): Properly handle version tag for the release bundle The tags created automatically for published Github releases are probably not annotated, so by simply running `git describe` we are not getting the correct tag. Use a `git describe --tags` to allow git to look at all tags, not just annotated ones. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2025-02-14 12:41:08 +00:00
Zvonko Kaiser	2499d013bd	gpu: Update handle_cdi_devices AgentConfig now has the cdi_timeout from the kernel cmdline, update the proper function signature and use it in the for loop. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-13 20:11:48 +00:00
Zvonko Kaiser	d28410ed75	Merge pull request #10877 from AdithyaKrishnan/main CI: Deprecate SEV	2025-02-13 14:55:11 -05:00
Zvonko Kaiser	95aa21f018	gpu: Add CDI timeout via kernel config Some systems like a DGX where we have 8 H100 or 8 H800 GPUs need some extended time to be initialized. We need to make sure we can configure CDI timeout, to enable even systems with 16 GPUs. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-13 19:23:19 +00:00
Adithya Krishnan Kannan	6cc5b79507	CI: Deprecate SEV Phase 1 of Issue #10840 AMD has deprecated SEV support on Kata Containers, and going forward, SNP will be the only AMD feature supported. As a first step in this deprecation process, we are removing the SEV CI workflow from the test suite to unblock the CI. Will be adding future commits to remove redundant SEV code paths. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2025-02-13 12:20:21 -06:00
Steve Horsman	0a39f59a9b	Merge pull request #10874 from stevenhorsman/skip-consistently-failing-block-volume-test tests: Skip block volume test on fc, stratovirt	2025-02-13 15:39:45 +00:00
Zvonko Kaiser	a0766986e7	Merge pull request #10832 from RuoqingHe/update-yq ci: Update yq to v4.44.5 to support riscv64	2025-02-13 08:33:02 -05:00
stevenhorsman	56fb2a9482	tests: Skip block volume test on fc, stratovirt The block volume test has failed on 10/10 nightlies and all the PRs I've seen, so skip it until it can be assessed. See #10873 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:50:35 +00:00
stevenhorsman	2d266df846	test: Update expected error in signed image tests We are seeing a different error in the new version of image-rs, so update our tests to match. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
stevenhorsman	d28a512d29	agent: Wait for network before init_image_service Based on the guidance from @Xynnn007 in #10851 > The new version of image-rs will do attestation once ClientBuilder.build().await() is called, while the old version will do so lazily the first image pull request comes. Looks like it's called in rpc::start() in kata-agent, when I'm afraid the network hasn't been initialized yet. > I am not sure if the guest network is prepared after the DNS is configured (in create_sandbox), if so we can move (the init_image_service) right after that. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
Tobin Feldman-Fitzthum	a13d5a3f04	agent: Bump image-rs to 514c561d93 As this brings in the commit bumping ttrpc to 0.8.4, which fixes connection issues with kernel 6.12.9+. As image-rs has a new builder pattern and several of the values in the image client config have been renamed, let's change the agent to account for this. Signed-off-by: Tobin Feldman-Fitzthum <tobin@linux.ibm.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-13 11:44:51 +00:00
Steve Horsman	8614e5efc4	Merge pull request #10869 from stevenhorsman/bump-kcli-ubuntu-version ci: k8s: Bump kcli image version	2025-02-13 09:59:20 +00:00
Antoine Gaillard	4b5b788918	agent: Use init subcgroup for process attachment in DinD cgroups v2 enforces stricter delegation rules, preventing operations on cgroups outside our ownership boundary. When running Docker-in-Docker (DinD), processes must be attached to an "init" subcgroup within the systemd unit. This fix detects and uses the init subcgroup when proxying process attachment. Fixes #10733 Signed-off-by: Antoine Gaillard <antoine.gaillard@datadoghq.com>	2025-02-13 10:44:51 +01:00
Dan Mihai	958cd8dd9f	Merge pull request #10613 from 3u13r/feat/policy/refactor-out-policy-crate-and-network-namespace policy: add policy crate and add network namespace check to policy	2025-02-12 18:28:09 -08:00
Alex Lyn	e1b780492f	Merge pull request #10839 from RuoqingHe/appease-clippy dragonball: Appease clippy	2025-02-13 09:12:15 +08:00
Zvonko Kaiser	acd2a933da	Merge pull request #10864 from fidencio/topic/packaging-move-to-ubuntu-22-04 packaging: Move builds to Ubuntu 22.04	2025-02-12 14:29:41 -05:00
Wainer Moschetta	62e239ceaa	Merge pull request #10810 from arvindskumar99/nydus_perm_install Skipping SNP and SEV from deploying and deleting Snapshotter	2025-02-12 14:38:56 -03:00
stevenhorsman	fd7bcd88d0	ci: k8s: Bump kcli image version When trying to deploy nydus on kcli locally we get the following failure: ``` root@sh-kata-ci1:~# kubectl get pods -n nydus-system NAMESPACE NAME READY STATUS RESTARTS AGE nydus-system nydus-snapshotter-5kdqs 0/1 CrashLoopBackOff 4 (84s ago) 7m29s ``` Digging into this I found that the nydus-snapshotter service is failing with: ``` ubuntu@kata-k8s-worker-0:~$ journalctl -u nydus-snapshotter.service -- Logs begin at Wed 2025-02-12 15:06:08 UTC, end at Wed 2025-02-12 15:20:27 UTC. -- Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: Started nydus snapshotter. Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required b> Feb 12 15:10:39 kata-k8s-worker-0 containerd-nydus-grpc[6349]: /usr/local/bin/containerd-nydus-grpc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required b> Feb 12 15:10:39 kata-k8s-worker-0 systemd[1]: nydus-snapshotter.service: Main process exited, code=exited, status=1/FAILURE ``` I think this is because 20.04 has version: ``` ubuntu@kata-k8s-worker-0:~$ ldd --version ldd (Ubuntu GLIBC 2.31-0ubuntu9.16) 2.31 ``` so it's too old for the nydus snapshotter. Also 20.04 is EoL soon, so bumping is better. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-12 15:38:18 +00:00
Zvonko Kaiser	fbc8454d3d	Merge pull request #10866 from zvonkok/enable-cc-gpu-build gpu: enable confidential initrd build	2025-02-12 09:26:08 -05:00
Ruoqing He	897e2e2b6e	dragonball: Appease clippy Some problems hidden in `dbs` crates are revealed after making these crates workspace components; fix them according to the `cargo clippy` suggestions. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-12 19:44:34 +08:00
Leonard Cohnen	ec0af6fbda	policy: check the linux network namespace Peer pods have a linux namespace of type network. We want to make sure that all container in the same pod use the same namespace. Therefore, we add the first namespace path to the state and check all other requests against that. This commit also adds the corresponding integration test in the policy crate showcasing the benefit of having rust integration tests for the policy. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	7aca7a6671	policy: use agent policy crate in genpolicy test The generated rego policies for `CreateContainerRequest` are stateful and that state is handled in the policy crate. We use this policy crate in the genpolicy integration test to be able to test if those state changes are handled correctly without spinning up an agent or even a cluster. This also allows to easily test on a e.g., CreateContainerRequest level instead of relying on changing the yaml that is applied to a cluster. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	d03738a757	genpolicy: expose create as library This commit allows to programmatically invoke genpolicy. This allows for other rust tools that don't want to consume genpolicy as binary to generate policies. One such use-case is the policy integration test implemented in the following commits. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Leonard Cohnen	cf54a1b0e1	agent: move policy module into separate crate The policy module augments the policy generated with genpolicy by keeping and providing state to each invocation. Therefore, it is not sufficient anymore to test the passing of requests in the genpolicy crate. Since in Rust, integration tests cannot call functions that are not exposed publicly, this commit factors out the policy module of the agent into its own crate and exposes the necessary functions to be consumed by the agent and an integration tests. The integration test itself is implemented in the following commits. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2025-02-12 10:41:15 +01:00
Fupan Li	ec7b2aa441	Merge pull request #10850 from teawater/direct Clean the config block_device_cache_direct of runtime-rs	2025-02-12 09:45:37 +08:00
Zvonko Kaiser	5431841a80	Merge pull request #10814 from kata-containers/shellcheck-gha gha: Add shellcheck	2025-02-11 18:30:41 -05:00
Zvonko Kaiser	2d8531cd20	gpu: Add TDX experimental target for GPUs We have custom branches on coco/qemu to support GPUs in TDX and SNP add experimental target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	7ded74c068	gpu: Add version for QEMU+TDX+SNP SNP and TDX patches for GPU are not compatible hence we need an own build for TDX. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	e4679055c6	gpu: qemu-snp-experimental no patches The branch has all the needed cherry-picks Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	7a219b3f03	gpu: Add GPU+SNP QEMU build Since the CPU SNP is upstreamed and available via our default QEMU target we're repurposing the SNP-experimental for the GPU+SNP enablement. First step is to update the version we're basing it off. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 17:32:31 +00:00
Zvonko Kaiser	b231a795d7	gha: Add shellcheck We need to start to fix our scripts. Lets run shellcheck and see what needs to be reworked. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 16:00:34 +00:00
Zvonko Kaiser	befb2a7c33	gpu: Confidential Initrd Start building the confidential initrd Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-11 15:41:36 +00:00
Fupan Li	5b809ca440	CI: a workaround for containerd v2.x e2e test The latest containerd has an issue in its e2e test, so we apply the following fix to work around it. For more info about this issue, please see: https://github.com/containerd/containerd/pull/11240 Once this PR is merged and a new version is released, we can remove this workaround. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	a3fd3d90bc	ci: Add the sandbox api testcases A test case is added based on the integrated cri-containerd case. The difference between the cri-containerd integrated testcase and the sandbox api testcase is the "sandboxer" setting in the sandbox runtime handler. If the "sandboxer" is set to "" or "podsandbox", then containerd will use the legacy shimv2 api, and if the "sandboxer" is set to "shim", then it will use the sandbox api to launch the pod. In addition, add a containerd v2.0.0 version, because containerd officially supports the sandbox api from version 2.0.0. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	36bf080c1e	runtime-rs: register the sandbox api service Add and register the sandbox api service, so that runtime-rs can deal with the sandbox api rpc calls from containerd. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	8332f427d2	runtime-rs: add the wait and status method for sandbox api Add the sandbox wait and sandbox status method for sandbox api. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	2d6b1e6b13	runtime-rs: add the sandbox api support For Kata-Containers, we add SandboxService for these new calls alongside the existing TaskService, including processing requests and replies, and properly calling VirtSandbox's interfaces. By splitting the start logic of the sandbox, virt_container is compatible with calls from the SandboxService and TaskService. In addition, we modify the processing of resource configuration to solve the problem that SandboxService does not have a spec file when creating a pod. Sandbox api can be supported from containerd 1.7. But there's a difference from containerd 2.0. To enable it from 2.0, you can support the sandbox api for a specific runtime by adding: sandboxer = "shim", take kata runtime as an example: [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata] runtime_type = "io.containerd.kata.v2" sandboxer = "shim" privileged_without_host_devices = true pod_annotations = ["io.katacontainers.*"] For containerd version 1.7, you can enable it by: 1: add env ENABLE_CRI_SANDBOXES=true 2: add sandbox_mode = "shim" to runtime config. Acknowledgement This work was based on @wllenyj's POC code: (`f5b62a2d7c`) Signed-off-by: Fupan Li <fupan.lfp@antgroup.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>	2025-02-11 15:21:53 +01:00
Fupan Li	65e908a584	runtime-rs: add the sandbox init for sandbox api For the processing of init sandbox, the init of task api has some more special processing procedures than the init of sandbox api, so these two types of init are separated here. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	be40646d04	runtime-rs: move the sandbox start from sandbox init function Split the sandbox start from the sandbox init process, and call them separately. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	438f81b108	runtime-rs: only get the containerd id when start container When start the sandbox, the sandbox id would be passed from the shim command line, and it only need to get the containerd id from oci spec when starting the pod container instead of the pod sandbox. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	9492c45d06	runtime-rs: load the cgroup path correctly When the sandbox api was enabled, the pause container would be removed and sandbox start api only pass an empty bundle directory, which means there's no oci spec file under it, thus the cgroup config couldn't get the cgroup path from pause container's oci spec. So we should set a default cgroup path for sandbox api case. In the future, we can promote containerd to pass the cgroup path during the sandbox start phase. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	78b96a6e2e	runtime-rs: fix the issue of missing create sandbox dir We need to make sure the sandbox storage path exists before returning it. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	97785b1f3f	runtime-rs: rustfmt against lib.rs It seems rustfmt was not run on some files. This commit runs rustfmt on them. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Fupan Li	33555037c0	protocols: Add the cri api protos Add the cri api protos to support the sandbox api. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-02-11 15:21:53 +01:00
Hui Zhu	27cff15015	runtime-rs: Remove block_device_cache_direct from config of fc Remove block_device_cache_direct from config of fc in runtime-rs because fc doesn't support this config. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	70d9afbd1f	runtime-rs: Add block_device_cache_direct to config of ch and dragonball Add block_device_cache_direct to config of ch and dragonball in runtime-rs because they support this config. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	db04c7ec93	runtime-rs: Add block_device_cache_direct config to ch and qemu Add block_device_cache_direct config to ch and qemu in runtime-rs. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Hui Zhu	e4cbc6abce	runtime-rs: CloudHypervisorInner: Change config type This commit changes config in CloudHypervisorInner to a normal HypervisorConfig to decrease the change of its type. Fixes: #10849 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-02-11 14:04:11 +08:00
Fabiano Fidêncio	75ac09baba	packaging: Move builds to Ubuntu 22.04 As Ubuntu 20.04 will reach its EOL in April. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 21:25:43 +01:00
Fabiano Fidêncio	c9f5966f56	Merge pull request #10860 from kata-containers/topic/debug-ci workflows: build: Do not store unnecessary content on the tarball	2025-02-10 20:01:37 +01:00
Fabiano Fidêncio	ec290853e9	workflows: build: Do not store unnecessary content on the tarball Otherwise we may end up simply unpacking kata-containers specific binaries into the same location that system ones are needed, leading to a broken system (most likely what happened with the metrics CI, and also what's happening with the GHA runners). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 18:57:29 +01:00
Steve Horsman	fb341f8ebb	Merge pull request #10857 from fidencio/topic/ci-tdx-only-use-one-machine-for-testing ci: Only use the Ubuntu TDX machine in the CI	2025-02-10 15:25:06 +00:00
Fabiano Fidêncio	23cb5bb6c2	ci: Only use the Ubuntu TDX machine in the CI We've been hitting issues with the CentOS 9 Stream machine, which Intel doesn't have cycles to debug. After raising this up in the Confidential Containers community meeting we got the green light from Red Hat (Ariel Adam) to just disable the CI based on CentOS 9 Stream for now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-02-10 12:50:16 +01:00
Zvonko Kaiser	eb1cf792de	Merge pull request #10791 from kata-containers/gpu_ci_cd gpu: Add first target and fix extratarballs	2025-02-06 15:47:27 -05:00
Zvonko Kaiser	62a975603e	Merge pull request #10806 from stevenhorsman/rust-1.80.0-bump Rust 1.80.0 bump	2025-02-06 14:49:23 -05:00
Dan Mihai	fdf3088be0	Merge pull request #10842 from microsoft/danmihai1/disable-job-policy-test tests: disable k8s-policy-job.bats on coco-dev	2025-02-06 09:09:49 -08:00
Hyounggyu Choi	48c5b1fb55	Merge pull request #10841 from BbolroC/make-measured-rootfs-configurable local-build: Do not build measured rootfs on s390x	2025-02-06 16:07:15 +01:00
Hyounggyu Choi	1bdb34e880	tests: Skip trusted storage tests for IBM SE Let's skip all tests for trusted storage until #10838 is resolved. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-06 12:09:14 +01:00
Hyounggyu Choi	27ce3eef12	local-build: Do not use measured rootfs on s390x IBM SE ensures that the initrd is measured by genprotimg and verified by the ultravisor. Let's not build the measured rootfs on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-02-06 10:12:55 +01:00
stevenhorsman	fce49d4206	dragonball: Skip unsafe tests Skip tests that make unsafe use of file descriptors, which causes ``` fatal runtime error: IO Safety violation: owned file descriptor already closed ``` See #10821 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:54:17 +00:00
Fabiano Fidêncio	2ceb7a35fc	versions: Bump rust to 1.80.0 (matching coco-guest-components) This is needed in order to avoid agent build issues, such as: ``` error[E0658]: use of unstable library feature 'lazy_cell' --> /home/ansible/.cargo/git/checkouts/guest-components-1e54b222ad8d9630/514c561/ocicrypt-rs/src/lib.rs:10:5 \| 10 \| use std::sync::LazyLock; \| ^^^^^^^^^^^^^^^^^^^ \| = note: see issue #109736 <https://github.com/rust-lang/rust/issues/109736> for more information ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
Fabiano Fidêncio	76df852f33	packaging: agent: Add rust version to the builder image name As we want to make sure a new builder image is generated if the rust version is bumped. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
stevenhorsman	d3e0ecc394	kata-ctl: Allow empty const Due to the way that multi-arch support is done, on various platforms we will get a clippy error: ``` error: this expression always evaluates to false ``` which might not be true on those other platforms, so allow this code pattern to suppress the clippy error Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-06 08:53:51 +00:00
Fabiano Fidêncio	6de8e59109	Merge pull request #10824 from stevenhorsman/updates-in-prep-of-rust-1.80-bump Updates in prep of rust 1.80 bump	2025-02-06 09:05:23 +01:00
Dan Mihai	47ce5dad9d	tests: disable k8s-policy-job.bats on coco-dev k8s-policy-job is modeled after the older k8s-job, and it appears that both of them fail occasionally on coco-dev. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-02-05 23:06:16 +00:00
Arvind Kumar	47534c1c3e	nydus: Skipping SNP and SEV from deploying and deleting Snapshotter Preparing to install nydus permanently on the AMD node, so disabling deploy and delete command for SNP and SEV. Signed-off-by: Arvind Kumar <arvinkum@amd.com>	2025-02-05 12:26:53 -06:00
Zvonko Kaiser	45bd451fa0	ci: add arm64 attestation Do the very same thing that we do on amd64 and add attestation Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	9a7dff9c40	gpu: Add arm64 targets We want to make sure we deliver arm64 GPU targets as well Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	968318180d	ci: Add extratarballs steps We introduced extratarballs with a make target. The CI currently only uploads tarballs that are listed in the matrix. The NV kernel builds a headers package which needs to be uploaded as well. The get-artifacts has a glob to download all artifacts hence we should be good. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
Zvonko Kaiser	b04bdf54a5	gpu: Add rootfs target amd64/arm64 Adding the initrd build first to get the rootfs on amd64. With that we can start to add tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-05 16:30:20 +00:00
stevenhorsman	7831caf1e7	libs/safe-path: Fix doc formatting Clippy fails with ``` error: doc list item missing indentation ``` so indent further to avoid this.	2025-02-05 15:16:47 +00:00
stevenhorsman	17b1e94f1a	cargo: Update time crate So it avoids us hitting ``` error[E0282]: type annotations needed for `Box<_>` --> /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/time-0.3.31/src/format_description/parse/mod.rs:83:9 \| 83 \| let items = format_items \| ^^^^^ ... 86 \| Ok(items.into()) \| ---- type must be known at this point \| help: consider giving `items` an explicit type, where the placeholders `_` are specified \| 83 \| let items: Box<_> = format_items \| ++++++++ ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	e9393827e8	agent: Workaround ppc formatting On powerpc64le platform the ip neigh command has a trailing space after the state, so the test is failing e.g. ``` assertion `left == right` failed left: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT \n" right: "169.254.1.1 lladdr 6a:92:3a:59:70:aa PERMANENT\n" ``` Trim the whitespace to make the test pass on all platforms Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	1ac0e67245	kata-ctl: Add stub of missing method for ppc `host_is_vmcontainer_capable` is required, but wasn't implemented for powerpc64, so copy the aarch64 approach @Amulyam24 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	bd3c93713f	kata-sys-util: Complete code move In #7236 the guest protection code was moved to kata-sys-utils, but some of it was left behind, and the adjustment to the new location wasn't completed, so the powerpc64 code doesn't build now we've fixed the cfg to test it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 15:16:47 +00:00
stevenhorsman	9f865f5bad	kata-ctl: Allow dead_code Some of the Kernel structs have `#[allow(dead_code)]` but not all and this results in the clippy error: ``` error: fields `name` and `value` are never read ``` so complete the job started before to remove the error. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	61a252094e	dragonball: Fix feature typo Replace `legacy_irq` with `legacy-irq` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	add785f677	dragonball: Remove unused fields `metrics` is never used, so remove this code Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	dde34bb7b8	runtime-rs: Remove un-used code The `r#type` method is never used, so neither are the log type constants Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	71fffb8736	runtime-rs: Allow dead code Clippy errors with: ``` error: field `driver` is never read --> crates/resource/src/network/utils/link/driver_info.rs:77:9 \| 76 \| pub struct DriverInfo { \| ---------- field in this struct 77 \| pub driver: String, \| ^^^^^^ ``` We set this, but never read it, so clippy is correct, but I'm not sure if it's useful for logging, or other purposes, so I'll allow it for now. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	d75a0ccbd1	dragonball: Allow test-mock feature Clippy fails with: ``` warning: unexpected `cfg` condition value: `test-mock` --> /root/go/src/github.com/kata-containers/kata-containers/src/dragonball/src/dbs_pci/src/vfio.rs:1929:17 \| 1929 \| #[cfg(all(test, feature = "test-mock"))] \| ^^^^^^^^^^^^^^^^^^^^^ help: remove the condition \| = note: no expected values for `feature` = help: consider adding `test-mock` as a feature in `Cargo.toml` ``` So add it as an expected cfg in the linter to skip this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	bddaea6df1	runtime-rs: Allow enable-vendor feature Clippy fails with: ``` error: unexpected `cfg` condition value: `enable-vendor` --> crates/hypervisor/src/device/driver/vfio.rs:180:11 \| 180 \| #[cfg(feature = "enable-vendor")] \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball` = help: consider adding `enable-vendor` as a feature in `Cargo.toml` ``` So add it as an expected cfg in the linter to skip this Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	bed128164a	runtime-rs: Allow unexpected config Clippy fails with: ``` error: unexpected `cfg` condition value: `enable-vendor` --> crates/hypervisor/src/device/driver/vfio.rs:180:11 \| 180 \| #[cfg(feature = "enable-vendor")] \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: expected values for `feature` are: `ch-config`, `cloud-hypervisor`, `default`, and `dragonball` = help: consider adding `enable-vendor` as a feature in `Cargo.toml` ``` allow this until we can check this behaviour with @Apokleos Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	53bcb0b108	runtime-rs: Fix for-loops-over-fallibles Clippy complains about: ``` error: for loop over a `&Result`. This is more readably written as an `if let` statement --> crates/hypervisor/src/firecracker/fc_api.rs:99:22 \| 99 \| for param in &kernel_params.to_string() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	c332a91ef8	runtime-rs: Fix doc list item missing indentation Add the extra space to format the list correctly Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	fe98d49a29	runtime-rs: Remove direct implementation of ToString Fix clippy error: ``` direct implementation of `ToString` ``` by switching to implement Display instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:02 +00:00
stevenhorsman	730c56af2a	runtime-rs: Fix clippy::unnecessary-get-then-check Clippy errors with: ``` error: unnecessary use of `get(&id).is_none()` --> crates/hypervisor/src/device/device_manager.rs:494:29 \| 494 \| if self.devices.get(&id).is_none() { \| -------------^^^^^^^^^^^^^^^^^^ \| \| \| help: replace it with: `!self.devices.contains_key(&id)` ``` so fix this as suggested Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a9358b59b7	runtime-rs: Allow unused enum field Clippy errors with: ``` error: field `0` is never read --> crates/hypervisor/src/qemu/cmdline_generator.rs:375:25 \| 375 \| DeviceAlreadyExists(String), // Error when trying to add an existing device \| ------------------- ^^^^^^ ``` but this is used when creating the error later, so add an allow to ignore this warning Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	1d9efeb92b	runtime-rs: Remove use of legacy constants Fix clippy error ``` error: usage of a legacy numeric constant ``` by swapping `std::u8::MAX` for `u8::MAX` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	225c7fc026	kata-ctl: Allow unused enum field Clippy errors with: ``` error: field `0` is never read ``` but the field is required for the `map_err`, so ignore this error for now to avoid too much disruption Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	f1d3450d1f	runtime-rs: Remove unused config `gdb` is only activated by a feature `guest_debug` that doesn't exist, so remove this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	1e90fc38de	dragonball: Fix incorrect reference There were references to `config_manager::DeviceInfoGroup` which doesn't exist, so I guess it means `DeviceConfigInfo` instead, so update them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	f389b05f20	dragonball: Fix doc formatting issue Clippy errors with: ``` error: doc list item missing indentation ``` which I think is because the Return is between two list items, so add a blank line to separate this into a separate paragraph Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	8bea57326a	dragonball: Fix thread_local initializer error Clippy errors with: ``` error: initializer for `thread_local` value can be made `const` ``` so update as suggested Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	7257ee0397	agent: Remove implementation of ToString Fix clippy error: ``` direct implementation of `ToString` ``` by switching to implement Display instead Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	ca87aca1a6	agent: Remove use of legacy constants Fix clippy error ``` error: usage of a legacy numeric constant ``` by swapping `std::i32::<MIN/MAX>` for `i32::<MIN/MAX>` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	6008fd56a1	agent: Fix clippy error ``` error: file opened with `create`, but `truncate` behavior not defined ``` `truncate(true)` ensures the file is entirely overwritten with new data which I believe is the behaviour we want Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a640bb86ec	agent: cdh: Remove unnecessary borrows Fix clippy error: ``` error: the borrowed expression implements the required traits ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	a131eec5c1	agent: config: Remove supports_seccomp supports_seccomp is never used, so throws a clippy error Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	0bd36a63d9	agent: Fix clippy error ``` error: bound is defined in more than one place ``` Move Sized into the later definition of `R` & `W` rather than defining them in two places Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	7709198c3b	rustjail: Fix clippy error ``` error: file opened with `create`, but `truncate` behavior not defined ``` `truncate(true)` ensures the file is entirely overwritten with new data which I believe is the behaviour we want Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
Fabiano Fidêncio	b4de302cb2	genpolicy: Adjust to build with rust 1.80.0 ``` error: field `image` is never read --> src/registry.rs:35:9 \| 34 \| pub struct Container { \| --------- field in this struct 35 \| pub image: String, \| ^^^^^ \| = note: `Container` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis = note: `-D dead-code` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(dead_code)]` error: field `use_cache` is never read --> src/utils.rs:106:9 \| 105 \| pub struct Config { \| ------ field in this struct 106 \| pub use_cache: bool, \| ^^^^^^^^^ \| = note: `Config` has derived impls for the traits `Debug` and `Clone`, but these are intentionally ignored during dead code analysis error: could not compile `genpolicy` (bin "genpolicy") due to 2 previous errors ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	099b241702	powerpc64: Add target_endian = "little" Based on comments from @Amulyam24 we need to use the `target_endian = "little"` as well as target_arch = "powerpc64" to ensure we are working on powerpc64le. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:45:01 +00:00
stevenhorsman	4c006c707a	build: Fix powerpc64le target_arch Starting with version 1.80, the Rust linter does not accept an invalid value for `target_arch` in configuration checks: ``` Compiling kata-sys-util v0.1.0 (/home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util) error: unexpected `cfg` condition value: `powerpc64le` --> /home/ddd/Work/kata/kata-containers/src/libs/kata-sys-util/src/protection.rs:17:34 \| 17 \| #[cfg(any(target_arch = "s390x", target_arch = "powerpc64le"))] \| ^^^^^^^^^^^^^^------------- \| \| \| help: there is a expected value with a similar name: `"powerpc64"` \| = note: expected values for `target_arch` are: `aarch64`, `arm`, `arm64ec`, `avr`, `bpf`, `csky`, `hexagon`, `loongarch64`, `m68k`, `mips`, `mips32r6`, `mips64`, `mips64r6`, `msp430`, `nvptx64`, `powerpc`, `powerpc64`, `riscv32`, `riscv64`, `s390x`, `sparc`, `sparc64`, `wasm32`, `wasm64`, `x86`, and `x86_64` = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration = note: `-D unexpected-cfgs` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unexpected_cfgs)]` ``` According [to GitHub user @Urgau][explain], this is a new warning introduced in Rust 1.80, but the problem exists before. The correct architecture name should be `powerpc64`, and the differentiation between `powerpc64le` and `powerpc64` should use the `target_endian = "little"` check. [explain]: #10072 (comment) Fixes: #10067 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> [emlima: fix some more occurences and typos] Signed-off-by: Emanuel Lima <emlima@redhat.com> [stevenhorsman: fix some more occurences and typos] Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-05 14:20:47 +00:00
Zvonko Kaiser	429b2654f4	Merge pull request #10812 from zvonkok/fix-arch-build-gpu gpu: Fix arm64 build	2025-02-04 17:03:37 -05:00
Dan Mihai	3fc170788d	Merge pull request #10811 from microsoft/cameronbaird/hyp-loglevel-upstream CLH: config: add hypervisor_loglevel	2025-02-04 11:59:21 -08:00
Zvonko Kaiser	eeacd8fd74	gpu: Adapt rootfs build for multi-arch Add aarch64 and x86_64 handling. Especially build the Rust dependency with the correct rust musl target. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-02-04 16:44:21 +00:00
Steve Horsman	9060904c4f	Merge pull request #10826 from kata-containers/topic/crio-test-timeouts workflows: Add delete kata-deploy timeouts for crio tests	2025-02-04 13:09:49 +00:00
Markus Rudy	937fd90779	agent: clear log pipes if denied by policy Container logs are forwarded to the agent through a unix pipe. These pipes have limited capacity and block the writer when full. If reading logs is blocked by policy, a common setup for confidential containers, the pipes fill up and eventually block the container. This commit changes the implementation of ReadStream such that it returns empty log messages instead of a policy failure (in case reading log messages is forbidden by policy). As long as the runtime does not encounter a failure, it keeps pulling logs periodically. In turn, this triggers the agent to flush the pipes. Fixes: #10680 Co-Authored-By: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Markus Rudy <mr@edgeless.systems>	2025-02-04 13:17:29 +01:00
Ruoqing He	8e073a6715	ci: Update yq to v4.44.5 to support riscv64 In v4.44.5 of `yq`, artifacts for riscv64 are released. Update the version used for `yq` and enable `install_yq.sh` to work on riscv64. Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-02-04 19:36:34 +08:00
Zvonko Kaiser	95c63f4982	Merge pull request #10827 from stevenhorsman/bump-golang-1.22.11 versions: Bump golang version	2025-02-03 16:06:56 -05:00
Zvonko Kaiser	7dc8060051	Merge pull request #10828 from stevenhorsman/fix-versions-comments versions: Fix formatting	2025-02-03 16:06:37 -05:00
stevenhorsman	546e3ae9ea	versions: Fix formatting The static_checks_versions test uses yamllint which fails with: ``` [comments] too few spaces before comment ``` many times and so makes code reviews more annoying with all these extra messages. Although it's probably not the worst issue, I checked the [yaml spec](https://yaml.org/spec/1.2.2/#66-comments) and it does say > Comments must be separated from other tokens by white space characters so it's easiest to fix it and move on. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 17:08:25 +00:00
Zvonko Kaiser	122ad95da6	Merge pull request #10751 from ryansavino/snp-upstream-host-kernel-support snp: update kata to use latest upstream packages for snp	2025-02-03 11:20:59 -05:00
stevenhorsman	d9eb1b0e06	versions: Bump golang version Bump golang versions so we are more up-to-date and have the extra security fixes Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 15:28:53 +00:00
stevenhorsman	5203158195	workflows: Add delete kata-deploy timeouts for crio tests I've also seen cases (the qemu, crio, k0s tests) where Delete kata-deploy is still running for this test after 2 hours, and had to be manually cancelled, so let's try adding a 5m timeout to the kata-deploy delete to stop CI jobs hanging. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-02-03 11:45:43 +00:00
Greg Kurz	a806d74ce3	Merge pull request #10807 from kata-containers/dependabot/go_modules/src/tools/csi-kata-directvolume/go_modules-8d4d0c168c build(deps): bump github.com/golang/glog from 1.2.0 to 1.2.4 in /src/tools/csi-kata-directvolume in the go_modules group across 1 directory	2025-02-01 08:29:44 +01:00
Cameron Baird	b6b0addd5e	config: add hypervisor_loglevel Implement HypervisorLoglevel config option for clh. Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2025-01-31 18:37:03 +00:00
Steve Horsman	41f23f1d2a	Merge pull request #10823 from stevenhorsman/fix-virtiofsd-build-error packaging: virtiofsd: Allow building a specific commit	2025-01-31 16:18:02 +00:00
stevenhorsman	1cf1a332a5	packaging: virtiofsd: Allow building a specific commit #10714 added support for building a specific commit, but due to the clone only having `--depth=1`, we can only reset to a commit if it's the latest on the `main` branch, otherwise we will get: ``` + git clone --depth 1 --branch main https://gitlab.com/virtio-fs/virtiofsd virtiofsd Cloning into 'virtiofsd'... warning: redirecting to https://gitlab.com/virtio-fs/virtiofsd.git/ + pushd virtiofsd + git reset --hard cecc61bca981ab42aae6ec490dfd59965e79025e ... fatal: Could not parse object 'cecc61bca981ab42aae6ec490dfd59965e79025e'. ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-31 11:24:23 +00:00
Greg Kurz	0215d958da	Merge pull request #10805 from balintTobik/egrep_removal egrep/fgrep removal	2025-01-30 18:26:59 +01:00
Hyounggyu Choi	530fedd188	Merge pull request #10767 from BbolroC/enable-coldplug-vfio-ap-s390x Enable VFIO-AP coldplug for s390x	2025-01-30 12:11:00 +01:00
Balint Tobik	1943a1c96d	tests: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:26:27 +01:00
Balint Tobik	47140357c4	docs: replace egrep/fgrep with grep -E/-F to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-29 11:25:54 +01:00
Ryan Savino	90e2b7d1bc	docs: updated build and host setup instructions for SNP Referenced AMD developer page for latest SEV firmware. Instructions to point to upstream 6.11 kernel or later. Referenced sev-utils and AMDESE fork for kernel setup. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	c1ca49a66c	snp: set snp to use upstream qemu in config use upstream qemu in snp and nvidia snp configs. load ovmf with bios flag on qemu cmdline instead of file. Fixes: #10750 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	af235fc576	Revert "builds: ovmf: Workaround Zeex repo becoming private" This reverts commit `aff3d98ddd`.	2025-01-28 18:09:40 -06:00
Ryan Savino	bb7ca954c7	ovmf: upgrade standard and sev ovmf ovmf upgraded to latest tag for standard and sev. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Ryan Savino	e87231edc7	snp: remove snp certs on qemu cmdline snp standard attestation with the upstream kernel and qemu does not support extended attestation with certs. Fixes: #10750 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2025-01-28 18:09:40 -06:00
Zvonko Kaiser	f9bbe4e439	Merge pull request #10785 from zvonkok/agent-cgv2-activate agent: Add proper activation param handling to activate cgroupV2	2025-01-28 14:21:15 -05:00
dependabot[bot]	df5eafd2a1	build(deps): bump github.com/golang/glog Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [github.com/golang/glog](https://github.com/golang/glog). Updates `github.com/golang/glog` from 1.2.0 to 1.2.4 - [Release notes](https://github.com/golang/glog/releases) - [Commits](https://github.com/golang/glog/compare/v1.2.0...v1.2.4) --- updated-dependencies: - dependency-name: github.com/golang/glog dependency-type: direct:production dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2025-01-28 17:38:14 +00:00
Fabiano Fidêncio	5e00a24145	Merge pull request #10749 from zvonkok/pass-through-stack gpu: Add driver version selection	2025-01-28 16:24:16 +01:00
Hyounggyu Choi	dde627cef4	test: Run full set of zcrypttest for VFIO-AP coldplug Previously, the test for VFIO-AP coldplug only checked whether a passthrough device was attached to the VM guest. This commit expands the test to include a full set of zcrypttest to verify that the device functions properly within a container. Additionally, since containerd has been upgraded to v1.7.25 on the test machine, it is no longer necessary to run the test via crictl. The commit removes all related codes/files. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	47db9b3773	agent: Run check_ap_device() for VFIO-AP coldplug This commit updates the device handler to call check_ap_device() instead of wait_for_ap_device() for VFIO-AP coldplug. The handler now returns a SpecUpdate for passthrough devices if the device is online (e.g., `/sys/devices/ap/card05/05.001f/online` is set to 1). Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	200cbfd0b0	kata-types: Introduce new type `vfio-ap-cold` for VFIO-AP coldplug This newly introduced type will be used by the VFIO-AP device handler on the agent. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	4a6ba534f1	runtime: Introduce new gRPC device type for VFIO-AP coldplug This commit introduces a new gRPC device type, `vfio-ap-cold`, to support VFIO-AP coldplug. This enables the VM guest to handle passthrough devices differently from VFIO-AP hotplug. With this new type, the guest no longer needs to wait for events (e.g., device addition) because the device already exists at the time the device type is checked. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Hyounggyu Choi	419b5ed715	runtime: Add DeviceInfo to Container for VFIO coldplug configuration Even though ociSpec.Linux.Devices is preserved when vfio_mode is VFIO, it has not been updated correctly for coldplug scenarios. This happens because the device info passed to the agent via CreateContainerRequest is dropped by the Kata runtime. This commit ensures that the device info is added to the sandbox's device manager when vfio_mode is VFIO and coldPlugVFIO is true (e.g., vfio-ap-cold), allowing ociSpec.Linux.Devices to be properly updated with the device information before the container is created on the guest. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-28 10:53:00 +01:00
Balint Tobik	233d15452b	runtime: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-28 10:46:44 +01:00
Balint Tobik	e657f58cf9	ci: replace egrep with grep -E to avoid deprecation warning https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-28 10:46:44 +01:00
Zvonko Kaiser	9f2799ba4f	Merge pull request #10790 from JakubLedworowski/add-xattr-to-confidential-kernel kernel: Add CONFIG_TMPFS_XATTR to tdx.conf	2025-01-27 13:47:08 -05:00
Zvonko Kaiser	d2528ef84f	gpu: Initialize unbound variables rootfs.sh Since we're importing some build scripts for nvidia and we're setting set -u, we have some unbound variables in rootfs.sh. Add initialization for those. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 18:37:21 +00:00
Zvonko Kaiser	9162103f85	agent: Update macro for e.g. String type Stack-only types are handled properly by the parse_cmdline_param macro, but advanced types like String couldn't be guarded by a guard function since it passed the variable by value rather than by reference. Now we can have guard functions for the String type: parse_cmdline_param!( param, CGROUP_NO_V1, config.cgroup_no_v1, get_string_value, \| no_v1 \| no_v1 == "all" ); Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	aab9d36e47	agent: Add tests for cgroup_no_v1 The only valid value is "all"; ignore all others. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	e1596f7abf	agent: Add option to parse cgroup_no_v1 For AGENT_INIT=yes we do not run systemd and hence systemd.unified_... does not mean anything to other init systems. Providing cgroup_no_v1=all is enough to signal other init systems to use cgroupV2. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:43 +00:00
Zvonko Kaiser	cd7001612a	gpu: rootfs adjust for AGENT_INIT=no Since we're defaulting to AGENT_INIT=no for all the initrd/images adapt the NV build to properly get kata-agent installed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	10974b7bec	gpu: AGENT_INIT=no We're setting globally for each initrd and image AGENT_INIT=no Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	98e0dc1676	gpu: Add set -u to scripts Make the scripts more robust by failing on unset variables Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Zvonko Kaiser	f153229865	gpu: Add driver version selection Besides latest and lts options add an option to specify the exact driver version. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-27 17:56:21 +00:00
Steve Horsman	311c3638c6	Merge pull request #10794 from fidencio/topic/bump-ubuntu-version-for-the-confidential-rootfs-and-initrd versions: Bump Ubuntu base image & initrd	2025-01-27 15:55:16 +00:00
Fabiano Fidêncio	84b0ca1b18	versions: Bump Ubuntu rootfs / initrd versions While I wish we could be bumping to the very same version everywhere, it's not possible and it's been quite a ride to get a combination of things that work. Let me try to describe my approach here: * Do NOT stay on 20.04 * This version will be EOL'ed by April * This version has a very old version of systemd that causes a bug when trying to online the cpusets for guests using systemd as init, causing then a breakage on the qemu-coco-non-tee and TDX non-attestation set of tests * Bump to 22.04 when possible * This was possible for the majority of the cases, but for the confidential initrd & confidential images for x86_64, the reason being failures on AMD SEV CI (which I didn't debug), and a kernel panic on the CentOS 9 Stream TDX machine * 22.04 is being used instead of 24.04 as multistrap is simply broken on Ubuntu 24.04, and I'd prefer to stay on an LTS release whenever it's possible * Bump to 24.10 for x86_64 image confidential * This was done as we got everything working with 24.10 in the CI. * This requires using libtdx-attest from noble (Ubuntu 24.04), as Intel only releases their sgx stuff for LTS releases. * Stick to 20.04 for x86_64 initrd confidential * 24.10 caused a panic on their CI * This is only being used by AMD so far, so they can decide when to bump, after doing the proper testing & debug that the bump will work as expected for them Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Carlos Segarra	b6e0effc06	tdx: bump version of libtdx-attest in rootfs-builder Bump libtdx-attest to its 1.22 release. Signed-off-by: Carlos Segarra <carlos@carlossegarra.com>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	2b5dbfacb8	osbuilder: ubuntu: Try to install pyinstaller using --break-system-packages We first try without passing the `--break-system-packages` argument, as that's not supported on Ubuntu 22.04 or older, but that's required on Ubuntu 24.04 or newer. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	c54f78bc6b	local-build: cache: Consider os name & version for image/initrd Otherwise a bump in the os name and / or os version would lead to the CI using a cached artefact. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:20 +01:00
Fabiano Fidêncio	4a66acc6f5	osbuilder: ubuntu: Abort if multistrap fails (but not on 20.04) We have gotten Ubuntu 20.04 working pretty much "by luck", as multistrap fails the deployment, and then a hacky function was introduced to add the proper dbus links. However, this does not scale at all, and we should: * Fail if multistrap fails * I won't do this for Ubuntu 20.04 as it's working for now and soon enough it'll be EOL * Add better logging to ensure someone can know when multistrap fails Below you can find the failure that we're hitting on Ubuntu 20.04: ```sh Errors were encountered while processing: dbus ERR: dpkg configure reported an error. Native mode configuration reported an error! I: Tidying up apt cache and list data. Multistrap system reported 1 error in /rootfs/. I: Tidying up apt cache and list data. ``` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 15:08:16 +01:00
Fabiano Fidêncio	585f82f730	osbuilder: ubuntu: Ensure OS_VERSION is passed & used Right now we're hitting an interesting situation with osbuilder, where regardless of what's being passed Ubuntu 20.04 (focal) is being used when building the rootfs-image, as shown in the snippets of the logs below: ``` ffidenci@tatu:~/src/upstream/kata-containers/kata-containers$ make rootfs-image-confidential-tarball /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-copy-libseccomp-installer.sh "agent" make agent-tarball-build ... make pause-image-tarball-build ... make coco-guest-components-tarball-build ... make kernel-confidential-tarball-build ... make rootfs-image-confidential-tarball-build make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh --build=rootfs-image-confidential sha256:f16c57890b0e85f6e1bbe1957926822495063bc6082a83e6ab7f7f13cabeeb93 Build kata version 3.13.0: rootfs-image-confidential INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/destdir INFO: Create image build image ~/src/upstream/kata-containers/kata-containers/tools/osbuilder ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir INFO: Build image INFO: image os: ubuntu INFO: image os version: latest Creating rootfs for ubuntu /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/rootfs.sh -o 3.13.0-13f0807e9f5687d8e5e9a0f4a0a8bb57ca50d00c-dirty -r /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/rootfs-image-confidential/builddir/rootfs-image/ubuntu_rootfs ubuntu INFO: rootfs_lib.sh file found. 
Loading content ~/src/upstream/kata-containers/kata-containers/tools/osbuilder/rootfs-builder/ubuntu ~/src/upstream/kata-containers/kata-containers/tools/osbuilder ~/src/upstream/kata-containers/kata-containers/tools/osbuilder INFO: rootfs_lib.sh file found. Loading content INFO: build directly WARNING: apt does not have a stable CLI interface. Use with caution in scripts. Get:1 http://security.ubuntu.com/ubuntu focal-security InRelease [128 kB] Get:2 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB] Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [128 kB] Get:4 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [4276 kB] Get:5 http://archive.ubuntu.com/ubuntu focal-backports InRelease [128 kB] Get:6 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB] Get:7 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [1297 kB] Get:8 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [30.9 kB] Get:9 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [4187 kB] Get:10 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB] Get:11 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB] Get:12 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB] Get:13 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [4663 kB] Get:14 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1589 kB] Get:15 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [34.6 kB] Get:16 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [4463 kB] Get:17 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [55.2 kB] Get:18 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [28.6 kB] Fetched 34.1 MB in 5s (6284 kB/s) ... ``` The reason this is happening is due to a few issues in different places: 1. 
IMG_OS_VERSION, passed to osbuilder, is not used anywhere and OS_VERSION should be used instead. And we should break if OS_VERSION is not properly passed down 2. Using UBUNTU_CODENAME is simply wrong, as it'll use whatever comes as the base container from kata-deploy's local-build scripts, and it has just been working by luck Note that at the same time this commit fixes the wrong behaviour, it would break the rootfses build as they are, thus we need to set the versions.yaml to use 20.04 where it was already using 20.04 even without us knowing. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:42 +01:00
Fabiano Fidêncio	02a18c1359	versions: Clarify which release matches a codename It'll make the life of the developers not so familiar with Ubuntu easier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:42 +01:00
Fabiano Fidêncio	ca96a6ac76	versions: Use Ubuntu codename instead of versions As this is required as part of the osbuilder tool to be able to properly set the repositories used when building the rootfs. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:39 +01:00
Fabiano Fidêncio	353ceb948e	versions: Don't use the yaml variable definitions While having variables is nice, they are more extensive to write down, and actually confusing for tired developer eyes to read, plus we're mixing the use of the yaml variables here and there together with not using them for some architectures. With the best "all or nothing" spirit, let's just make it easier for our developers to read the versions.yaml and easily understand what's being used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-27 14:19:08 +01:00
Jakub Ledworowski	42531cf6c4	kernel: Add CONFIG_TMPFS_XATTR to confidential kernel During pull inside the guest, overlayfs expects xattrs. Fixes: [guest-components#876](https://github.com/confidential-containers/guest-components/issues/876) Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2025-01-27 07:07:54 +01:00
Zvonko Kaiser	b4c710576e	Merge pull request #10782 from stevenhorsman/clh-metrics-write-update metrics: Increase minval range for blogbench test	2025-01-24 10:21:20 -05:00
Steve Horsman	54e7e1fdc3	Merge pull request #10768 from kata-containers/dependabot/go_modules/src/runtime/go_modules-28d0d344dd build(deps): bump the go_modules group across 3 directories with 1 update	2025-01-24 12:04:56 +00:00
Greg Kurz	17f3eb0579	Merge pull request #10766 from balintTobik/remove_shebang Remove shebang in non-executable completion script	2025-01-24 12:29:03 +01:00
Alex Lyn	ee635293c6	Merge pull request #10740 from RuoqingHe/virtiofsd-riscv64 virtiofsd: Enable build for RISC-V	2025-01-24 15:43:56 +08:00
Zvonko Kaiser	f5c509d58e	Merge pull request #10779 from kata-containers/topic/arm64-static-build-runner workflows: Move arm static checks runner	2025-01-23 22:29:16 -05:00
Fabiano Fidêncio	4bc978416c	Merge pull request #10720 from fidencio/topic/test-cgroupsv2-on-guest kernel: Ensure no cgroupsv1 is used	2025-01-23 21:26:49 +01:00
Aurélien Bombo	66d292bdb4	Merge pull request #10732 from microsoft/danmihai/minor-systemd-cleanup rootfs: minor systemd file deletion cleanup	2025-01-23 11:29:25 -06:00
Fabiano Fidêncio	b47cc6fffe	cri-containerd: Skip TestDeviceCgroup till it's adapted to cgroupsv2 As the devices controller works in a different way in cgroupsv2, the "/sys/fs/cgroup/devices/devices.list" file simply doesn't exist. For now, let's skip the test till the test maintainer decides to re-enable it for cgroupsv2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	0626d7182a	tests: k8s-cpu-ns: Adapt to cgroupsv2 The changes done are: * cpu/cpu.shares was replaced by cpu.weight * The weight, according to our reference[0], is calculated by: weight = (1 + ((request - 2) * 9999) / 262142) * cpu/cpu.cfs_quota_us & cpu/cpu.cfs_period_us were replaced by cpu.max, where quota and period are written together (in this order) [0]: https://github.com/containers/crun/blob/main/crun.1.md#cgroup-v2 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
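The cgroup v1 to v2 CPU mapping described in the entry above can be sketched as follows. This is only an illustration: the function names `shares_to_weight` and `format_cpu_max` are hypothetical and not taken from the Kata test scripts, and integer division is an assumption about how the quoted formula is evaluated.

```python
def shares_to_weight(shares: int) -> int:
    """Map a cgroup v1 cpu.shares value to a cgroup v2 cpu.weight value.

    Uses the formula quoted in the commit message:
    weight = 1 + ((shares - 2) * 9999) / 262142
    Integer (floor) division is assumed here.
    """
    return 1 + ((shares - 2) * 9999) // 262142


def format_cpu_max(quota_us: int, period_us: int) -> str:
    """Build the content of cpu.max: quota first, then period."""
    return f"{quota_us} {period_us}"


if __name__ == "__main__":
    # The lowest and highest v1 share values map to the ends of the v2 range.
    print(shares_to_weight(2), shares_to_weight(262144))
    print(format_cpu_max(50000, 100000))
```

A test that previously compared `cpu/cpu.shares` against the container request would, under this sketch, compare `cpu.weight` against `shares_to_weight(request)` instead.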
Fabiano Fidêncio	4307f0c998	Revert "ci: mariner: Ensure kernel_params can be set" This reverts commit `091ad2a1b2`, in order to ensure tests would be running with cgroupsv2 on the guest. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
Fabiano Fidêncio	c653719270	kernel: Ensure no cgroupsv1 is used Let's ensure that we're fully running the guest on cgroupsv2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 17:25:56 +01:00
stevenhorsman	d031e479ab	metrics: Increase minval range for blogbench test In the last couple of days I've seen the blogbench metrics write latency test on clh fail a few times because the latency was too low, so adjust the minimum range to tolerate quicker finishes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 15:58:31 +00:00
Fabiano Fidêncio	66d881a5da	Merge pull request #10755 from fidencio/topic/ensure-systemd-is-used-as-init-for-coco-cases rootfs-confidential: Ensure systemd is used as init	2025-01-23 15:25:24 +01:00
stevenhorsman	3acce82c91	ci: Update gatekeeper tests for static workflow The static-checks targets are `pull_request`, so they can run the PR workflow version, so we want to update the required-tests.yaml so that static-check workflow changes do trigger static checks in order to test them properly. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 14:23:09 +00:00
stevenhorsman	d625f20d18	workflows: Move arm static checks runner Now we have the build-assets running on the gh-hosted runners, try the same approach for the static-checks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-23 14:23:09 +00:00
Zvonko Kaiser	a23d6a1241	Merge pull request #10777 from zvonkok/arm64-nvidia-gpu-kernel gpu: Fix arm64 kernel build	2025-01-23 07:14:30 -05:00
Christophe de Dinechin	9a92a4bacf	cli: Remove shebang in non-executable completion script Raised during package review [1] by rpmlint [1] https://bugzilla.redhat.com/show_bug.cgi?id=1590425#c8 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Signed-off-by: Balint Tobik <btobik@redhat.com>	2025-01-23 13:11:25 +01:00
Fabiano Fidêncio	734ef71cf7	tests: k8s: confidential: Cleanup $HOME/.ssh/known_hosts I've noticed the following error when running the tests with SEV: ``` 2025-01-21T17:10:28.7999896Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8000614Z # @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @ 2025-01-21T17:10:28.8001217Z # @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2025-01-21T17:10:28.8001857Z # IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! 2025-01-21T17:10:28.8003009Z # Someone could be eavesdropping on you right now (man-in-the-middle attack)! 2025-01-21T17:10:28.8003348Z # It is also possible that a host key has just been changed. 2025-01-21T17:10:28.8004422Z # The fingerprint for the ED25519 key sent by the remote host is 2025-01-21T17:10:28.8005019Z # SHA256:x7wF8zI+LLyiwphzmUhqY12lrGY4gs5qNCD81f1Cn1E. 2025-01-21T17:10:28.8005459Z # Please contact your system administrator. 2025-01-21T17:10:28.8006734Z # Add correct host key in /home/kata/.ssh/known_hosts to get rid of this message. 2025-01-21T17:10:28.8007031Z # Offending ED25519 key in /home/kata/.ssh/known_hosts:178 2025-01-21T17:10:28.8007254Z # remove with: 2025-01-21T17:10:28.8008172Z # ssh-keygen -f "/home/kata/.ssh/known_hosts" -R "10.244.0.71" ``` And this was causing a failure to ssh into the confidential pod. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Fabiano Fidêncio	18137b1583	tests: k8s: confidential: Increase log_buf_len to 4M Relying on dmesg is really not ideal, as we may lose important info, mainly those which happen very early in the boot, depending on the size of kernel ring buffer. So, for this specific test, let's increase the kernel ring buffer, by default, to 4M. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
Fabiano Fidêncio	d5f907dcf1	rootfs-confidential: Ensure systemd is used as init Let's make sure that we don't use Kata Containers' agent as init for the Confidential related rootfses, as we don't want to increase the agent's complexity for no reason ... mainly when we can rely on a proper init system. : - images already used systemd as init - initrds are now using systemd as init Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-23 12:04:13 +01:00
dependabot[bot]	d2cb14cdbc	build(deps): bump the go_modules group across 3 directories with 1 update Bumps the go_modules group with 1 update in the /src/runtime directory: [golang.org/x/net](https://github.com/golang/net). Bumps the go_modules group with 1 update in the /src/tools/csi-kata-directvolume directory: [golang.org/x/net](https://github.com/golang/net). Bumps the go_modules group with 1 update in the /tools/testing/kata-webhook directory: [golang.org/x/net](https://github.com/golang/net). Updates `golang.org/x/net` from 0.25.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) Updates `golang.org/x/net` from 0.23.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) Updates `golang.org/x/net` from 0.23.0 to 0.33.0 - [Commits](https://github.com/golang/net/compare/v0.25.0...v0.33.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: direct:production dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2025-01-23 10:18:22 +00:00
Fupan Li	944eb2cf3f	Merge pull request #10762 from teawater/remove_enable_swap libs/kata-types: Remove config enable_swap	2025-01-23 14:03:42 +08:00
Fupan Li	ebd8ec227b	Merge pull request #10778 from zvonkok/kata-agent-cgroupsV2 agent: Ensure proper cgroupsV2 handling with init_mode=true	2025-01-23 14:00:13 +08:00
Zvonko Kaiser	afd286f6d6	agent: Ensure proper cgroupsV2 with init_mode=yes When the agent is run as the init process, cgroupfs is being set up. In the case of cgroupsV1 we needed to enable the memory hierarchy; this is now enabled by default in cgroupsV2. Additionally, the file /sys/fs/cgroup/memory/memory.use_hierarchy isn't even available with V2. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-23 03:54:51 +00:00
Fabiano Fidêncio	3f8abb4da7	Merge pull request #10776 from kata-containers/topic/arm64-runners workflows: Switch to github-hosted arm runners	2025-01-22 23:14:28 +01:00
Zvonko Kaiser	91c6d524f8	gpu: Fix arm64 kernel build CONFIG_IOASID is not configurable in newer kernels. Removing it. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-22 18:15:57 +00:00
Fabiano Fidêncio	6baa60d77d	Merge pull request #10775 from fidencio/topic/update-ttrpc-crate agent: Update ttrpc to include the fix for connectivity issues	2025-01-22 17:45:38 +01:00
stevenhorsman	ab27e11d31	workflows: Switch to github-hosted arm runner Now that GitHub has hosted arm runners https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/ we should try and switch our arm64 builder jobs to run on these. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 16:27:17 +00:00
Greg Kurz	90b6d5725b	Merge pull request #10773 from RuoqingHe/retry-on-aks-throttle ci: Retry on failure of Create AKS cluster	2025-01-22 15:30:57 +01:00
Ruoqing He	373a388844	ci: Retry on failure of Create AKS cluster The `Create AKS cluster` step in `run-k8s-tests-on-aks.yaml` is likely to fail since we are trying to issue `PUT` to `aks` at a relatively high frequency, while the `aks` end has its limit on `bucket-size` and `refill-rate`, documented here [1]. Use `nick-fields/retry@v3` to retry in 10 seconds after a request fails, based on observations that AKS was requesting 7 or 8 second delays before retry as part of its 429 response [1] https://learn.microsoft.com/en-us/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis Fixes: #10772 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-22 13:24:51 +00:00
Fabiano Fidêncio	a8678a7794	deps: Update ttrpc to v0.8.4 Update the ttrpc crate to include the fix from Moritz Sanft, which solves the connectivity issues with 6.12.x kernels* *: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.12.9&id=3257813a3ae7462ac5cde04e120806f0c0776850 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-22 13:05:43 +01:00
Fabiano Fidêncio	e71bc1f068	Merge pull request #10770 from zvonkok/gpu_kernel_dep gpu: Add kernel dep for the non coco use-case	2025-01-22 12:53:39 +01:00
Greg Kurz	17d053f4bb	Merge pull request #10711 from teawater/balloon Add reclaim_guest_freed_memory config to qemu and cloud-hypervisor	2025-01-22 10:57:13 +01:00
Hui Zhu	c148b70da7	libs/kata-types: Remove config enable_swap Remove config enable_swap because no code uses it. Fixes: #10761 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-22 11:08:45 +08:00
Aurélien Bombo	4e9d1363b3	Merge pull request #10754 from sprt/sprt/ci-gh-pr-number-coco ci: Unify on `$GH_PR_NUMBER` environment variable	2025-01-21 15:07:24 -06:00
Zvonko Kaiser	4621f53e4a	gpu: Add kernel dep for the non coco use-case Add the kernel dependency to the non coco use-case so that a rootfs build can be executed via GHA. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-21 16:18:14 +00:00
Zvonko Kaiser	61c282c725	Merge pull request #10769 from kata-containers/revert-10764-gpu_ci_cd Revert "gpu: Add rootfs target amd64/arm64"	2025-01-21 11:09:52 -05:00
Zvonko Kaiser	9fd430e46b	Revert "gpu: Add rootfs target amd64/arm64" Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-21 16:08:30 +00:00
Zvonko Kaiser	ef1639b6bf	Merge pull request #10764 from zvonkok/gpu_ci_cd gpu: Add rootfs target amd64/arm64	2025-01-21 09:51:20 -05:00
Ruoqing He	7e76ef587a	virtiofsd: Enable build for RISC-V With this change, `virtiofsd` (gnu target) could be built and then to be used with other components. Depends: #10741 Fixes: #10739 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-01-21 18:05:37 +08:00
Hui Zhu	185b94b7fa	runtime-rs: Add reclaim_guest_freed_memory cloud-hypervisor support Add reclaim_guest_freed_memory config to cloud-hypervisor in runtime-rs. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:21 +08:00
Hui Zhu	487171d992	runtime-rs: Add reclaim_guest_freed_memory qemu support Add reclaim_guest_freed_memory config to qemu in runtime-rs. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:18 +08:00
Hui Zhu	8f550de88a	runtime-rs: db: Change config enable_balloon_f_reporting Change config enable_balloon_f_reporting of db to reclaim_guest_freed_memory. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:08 +08:00
Hui Zhu	42f5ef9ff1	kernel: config: Add CONFIG_VIRTIO_BALLOON to virtio.conf Add CONFIG_VIRTIO_BALLOON to virtio.conf to open virtio-balloon. Fixes: #10710 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2025-01-21 10:34:04 +08:00
Zvonko Kaiser	8b097244e7	gpu: Add rootfs initrd build for arm64 We need the arm64 builds as well for GH and GB systems. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-20 19:03:52 +00:00
Zvonko Kaiser	f525631522	gpu: Add rootfs target amd64 Adding the initrd build first to get the rootfs on amd64. With that we can start to add tests. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-20 19:01:42 +00:00
Zvonko Kaiser	d7059e9024	Merge pull request #10736 from zvonkok/gpu-rootfs-fix gpu: Fix rootfs build	2025-01-17 14:44:41 -05:00
Aurélien Bombo	0d70dc31c1	ci: Unify on $GH_PR_NUMBER environment variable While working on #10559, I realized that some parts of the codebase use $GH_PR_NUMBER, while other parts use $PR_NUMBER. Notably, in that PR, since I used $GH_PR_NUMBER for CoCo non-TEE tests without realizing that TEE tests use $PR_NUMBER, the tests on that PR fail on TEEs: https://github.com/kata-containers/kata-containers/actions/runs/12818127344/job/35744760351?pr=10559#step:10:45 ... 44 error: error parsing STDIN: error converting YAML to JSON: yaml: line 90: mapping values are not allowed in this context ... 135 image: ghcr.io/kata-containers/csi-kata-directvolume: ... So let's unify on $GH_PR_NUMBER so that this issue doesn't repro in the future: I replaced all instances of PR_NUMBER with GH_PR_NUMBER. Note that since some test scripts also refer to that variable, the CI for this PR will fail (would have also happened with the converse substitution), hence I'm not adding the ok-to-test label and we should force-merge this after review. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2025-01-17 10:53:08 -06:00
Fabiano Fidêncio	c018a1cc61	Merge pull request #10741 from RuoqingHe/update-virtiofsd-build-image virtiofsd: Update ubuntu to 22.04 for gnu target	2025-01-16 20:51:10 +01:00
Zvonko Kaiser	2777b13db7	Merge pull request #10742 from zvonkok/3.13.0-release release: Bump version to 3.13.0	2025-01-16 10:05:48 -05:00
Ruoqing He	c70195d629	virtiofsd: Update ubuntu to 22.04 for gnu target With ubuntu 20.04 image, virtiofsd gnu target couldn't be built due to "unsupported ISA subset z" reported by "cc". Updating to ubuntu 22.04 image addresses this problem. Relates: #10739 Signed-off-by: Ruoqing He <heruoqing@iscas.ac.cn>	2025-01-16 17:27:38 +08:00
Zvonko Kaiser	e82fdee20f	runtime: Add proper IOMMUFD parsing With newer kernels we have a new backend for VFIO called IOMMUFD. This is a departure from VFIO IOMMU Groups, since it has only one device associated with an IOMMUFD entry. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 23:39:33 +00:00
Zvonko Kaiser	f0bd83b073	gpu: Fix rootfs build The pyinstaller is located by default under /usr/local/bin; some prior versions were installing it to ${HOME}. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 20:37:51 +00:00
Aurélien Bombo	0d93f59f5b	Merge pull request #10738 from microsoft/danmihai1/empty-pty-lines runtime: skip empty Guest console output lines	2025-01-15 10:33:24 -06:00
Zvonko Kaiser	0b04f43ac6	release: Bump version to 3.13.0 Bump VERSION and helm-chart versions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2025-01-15 16:13:22 +00:00
Zvonko Kaiser	365def9b4a	Merge pull request #10735 from BbolroC/kubectl-create-retry-trusted-storage tests: Introduce retry_kubectl_apply() for trusted storage	2025-01-14 21:59:45 -05:00
Dan Mihai	2e21f51375	runtime: skip empty Guest console output lines Skip logging empty lines of text from the Guest console output, if there are any such lines. Without this change, the Guest console log from CLH + /dev/pts/0 has twice as many lines of text. Half of these lines are empty. Fixes: #10737 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-15 00:28:26 +00:00
Hyounggyu Choi	f7816e9206	tests: Introduce retry_kubectl_apply() for trusted storage On s390x, some tests for trusted storage occasionally failed due to: ```bash etcdserver: request timed out ``` or ```bash Internal error occurred: resource quota evaluation timed out ``` These timeouts were not observed previously on k3s but occur sporadically on kubeadm. Importantly, they appear to be temporary and transient, which means they can be ignored in most cases. To address this, we introduced a new wrapper function, `retry_kubectl_apply()`, for `kubectl create`. This function retries applying a given manifest up to 5 times if it fails due to a timeout. However, it will still catch and handle any other errors during pod creation. Fixes: #10651 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-14 21:15:44 +01:00
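The retry behaviour described in the entry above can be sketched as follows. This is not the actual `retry_kubectl_apply()` shell helper: the name `retry_on_timeout`, the `(return code, output)` callable interface, and the reading of "up to 5 times" as five attempts in total are all assumptions made for illustration. The two error strings are the ones quoted in the commit message.

```python
import time
from typing import Callable, Tuple

# Substrings of the transient errors quoted in the commit message.
TRANSIENT_ERRORS = (
    "request timed out",
    "resource quota evaluation timed out",
)


def retry_on_timeout(
    apply: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    delay_s: float = 0.0,
) -> Tuple[int, str]:
    """Call `apply` until it succeeds, retrying only on timeout errors.

    `apply` is a hypothetical callable returning (return_code, output),
    e.g. a thin wrapper around a kubectl invocation.
    """
    rc, output = 1, ""
    for attempt in range(1, max_attempts + 1):
        rc, output = apply()
        if rc == 0:
            return rc, output
        if not any(msg in output for msg in TRANSIENT_ERRORS):
            # Any other error is surfaced immediately, not retried.
            return rc, output
        if attempt < max_attempts:
            time.sleep(delay_s)
    return rc, output
```

Retrying only on the known timeout messages keeps genuine pod-creation failures visible instead of masking them behind repeated attempts.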
Fabiano Fidêncio	121ac0c5c0	Merge pull request #10727 from microsoft/danmihai1/mariner3-guest image: bump mariner guest version to 3.0	2025-01-14 19:06:28 +01:00
Fabiano Fidêncio	3658ea2320	Merge pull request #10731 from microsoft/danmihai1/quiet-rootfs-build rootfs: reduced console output by default	2025-01-14 19:02:42 +01:00
Chengyu Zhu	7d34ca4420	Merge pull request #10674 from bpradipt/fix-10398 agent: alternative implementation for sealed_secret as volume	2025-01-14 18:55:45 +08:00
Fabiano Fidêncio	4578969c5d	Merge pull request #10730 from BbolroC/bump-coco-trustee versions: Bump trustee to latest	2025-01-14 08:56:11 +01:00
Dan Mihai	c4da296326	rootfs: delete links to deleted files Delete symbolic links to files being deleted. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:44 +00:00
Dan Mihai	5b8471ffce	rootfs: print the path to files being deleted Show the list of files being deleted. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:34 +00:00
Dan Mihai	a49d0fb343	rootfs: delete systemd units/files from rootfs.sh Move the deletion of unnecessary systemd units and files from image_builder.sh into rootfs.sh. The files being deleted can be applicable to other image file formats too, not just to the rootfs-image format created by image_builder.sh. Also, image_builder.sh was deleting these files after it calculated the size of the rootfs files, thus missing out on the opportunity to possibly create a smaller image file. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 21:28:23 +00:00
Dan Mihai	0f522c09d9	rootfs: reduced console output by default Use "set -x" only when the user specified DEBUG=1. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-13 19:34:05 +00:00
Pradipta Banerjee	36580bb642	tests: Update sealed secret CI value to base64url The existing encoding was base64 and it fails due to `874948638a` Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-01-13 09:37:05 -05:00
Hyounggyu Choi	2cdb549a75	versions: Bump trustee to latest This update addresses an issue with token verification for SE and SNP introduced in the last update by #10541. Bumping the project to the latest commit resolves the issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-13 15:07:33 +01:00
Pradipta Banerjee	5218345e34	agent: alternative implementation for sealed_secret as volume The earlier implementation relied on using a specific mount-path prefix - `/sealed` to determine that the referenced secret is a sealed secret. However that was restrictive for certain use cases as it forced the user to always use a specific mountpath naming convention. This commit introduces an alternative implementation to relax the restriction. A sealed secret can be mounted in any mount-path. However it comes with a potential performance penalty. The implementation loops through all volume mounts and reads the file to determine if it's a sealed secret or not. Fixes: #10398 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2025-01-11 12:36:44 -05:00
Dan Mihai	4707883b40	image: bump mariner guest version to 3.0 Use Mariner 3.0 (a.k.a., Azure Linux 3.0) as the Guest CI image. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2025-01-11 17:36:19 +00:00
Fabiano Fidêncio	2d9baf899a	Merge pull request #10719 from msanft/msanft/runtime/fix-boolean-opts runtime: use actual booleans for QMP `device_add` boolean options	2025-01-11 16:38:06 +01:00
Zvonko Kaiser	f08a9eac11	Merge pull request #10721 from stevenhorsman/more-metrics-latency-minimum-range-fixes metrics: Increase latency test range	2025-01-10 21:59:39 -05:00
Moritz Sanft	e5735b221c	runtime: use actual booleans for QMP `device_add` boolean options Since `be93fd5372`, which is included in QEMU since version 9.2.0, the options for the `device_add` QMP command need to be typed correctly. This makes it so that instead of `"on"`, the value is set to `true`, matching QEMU's expectations. This has been tested on QEMU 9.2.0 and QEMU 9.1.2, so before and after the change. The compatibility with incorrectly typed options for the `device_add` command is deprecated since version 6.2.0 [^1]. [^1]: https://qemu-project.gitlab.io/qemu/about/deprecated.html#incorrectly-typed-device-add-arguments-since-6-2 Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2025-01-10 11:53:56 +01:00
Wainer Moschetta	5fae2a9f91	Merge pull request #9871 from wainersm/fix-print_cluster_name tests/gha-run-k8s-common: shorten AKS cluster name	2025-01-09 14:35:02 -03:00
stevenhorsman	aaae5b6d0f	metrics: clh: Increase network-iperf3 range We hit a failure with: ``` time="2025-01-09T09:55:58Z" level=warning msg="Failed Minval (0.017600 > 0.015000) for [network-iperf3]" ``` The range is very big, but in the last 3 test runs I reviewed we have got a minimum value of 0.015s and a max value of 0.052, so there is a ~350% difference possible so I think we need to have a wide range to make this stable. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:57 +00:00
stevenhorsman	e946d9d5d3	metrics: qemu: Increase latency test range After the kernel version bump, in the latest nightly run https://github.com/kata-containers/kata-containers/actions/runs/12681309963/job/35345228400 The sequential read throughput result was 79.7% of the expected (so failed) and the sequential write was 84% of the expected, so was fairly close, so increase their minimum ranges to make them more robust. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-09 11:25:50 +00:00
Wainer dos Santos Moschetta	badc208e9a	tests/gha-run-k8s-common: shorten AKS cluster name Because az client restricts the name to be less than 64 characters. In some cases (e.g. KATA_HYPERVISOR=qemu-runtime-rs) the generated name will exceed the limit. This changed the function to shorten the name: * SHA1 is computed from metadata then compound the cluster's name * metadata as plain-text are passed as --tags Fixes: #9850 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2025-01-08 16:39:07 -03:00
Fabiano Fidêncio	8f8988fcd1	Merge pull request #10714 from fidencio/topic/update-virtiofsd virtiofsd: Update to its v1.13.0 ( + one patch) release :-)	2025-01-08 17:59:29 +01:00
Fabiano Fidêncio	7e5e109255	Merge pull request #10541 from fitzthum/bump-trustee-010 Update Trustee and Guest Components	2025-01-08 17:44:13 +01:00
Fabiano Fidêncio	eb3fe0d27c	Merge pull request #10717 from fidencio/topic/re-enable-oom-test-for-mariner tests: Re-enable oom tests for mariner	2025-01-08 17:43:56 +01:00
Fabiano Fidêncio	65e267294b	Merge pull request #10718 from stevenhorsman/metrics-blogbench-latency-minimal-range-increase metrics: Increase latency minimum range	2025-01-08 17:09:36 +01:00
stevenhorsman	dc069d83b5	metrics: Increase latency test range The bump to kernel 6.12 seems to have reduced the latency in the metrics test, so increase the ranges for the minimal value, to account for this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2025-01-08 15:11:49 +00:00
Fabiano Fidêncio	967d5afb42	Revert "tests: k8s: Skip one of the empty-dir tests" This reverts commit `9aea7456fb`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Fabiano Fidêncio	7ae2ca4c31	virtiofsd: Update to its v1.13.0 + one patch release Together with the bump, let's also bump the rust version needed to build the package, with the caveat that virtiofsd doesn't actually use a pinned version as part of their CI, so we're bumping to whatever is the version on `alpine:rust` (which is used in their CI). It's important to note that we're using a version which brings in one extra patch apart from the release, as the next virtiofsd release will happen at the end of February, 2025. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Fabiano Fidêncio	0af3536328	packaging: virtiofsd: Allow building a specific commit Right now we've been only building releases from virtiofsd, but we'll need to pin a specific commit till v1.14.0 is out, thus let's add the needed machinery to do so. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-08 14:07:34 +01:00
Tobin Feldman-Fitzthum	41c7f076fa	packaging: updating guest components build script The guest-components directory has been re-arranged slightly. Adjust the installation path of the LUKS helper script to account for this. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-01-07 16:59:10 -06:00
Tobin Feldman-Fitzthum	cafc7d6819	versions: update trustee and guest components Trustee has some new features including a plugin backend, support for PKCS11 resources, improvements to token verification, and adjustments to logging, and more. Also update guest-components to pickup improvements and keep the KBS protocol in sync. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2025-01-07 16:59:10 -06:00
Fabiano Fidêncio	53ac0f00c5	tests: Re-enable oom tests for mariner Since we bumped to the 6.12.x LTS kernel, we've also adjusted the aggressivity of the OOM test, which may be enough to allow us to re-enable it for mariner. Fixes: #8821 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-07 18:33:17 +01:00
Fabiano Fidêncio	f4a39e8c40	Merge pull request #10468 from fidencio/topic/early-tests-on-next-lts-kernel versions: Move kernel to the latest 6.12 release (the current LTS)	2025-01-07 18:02:04 +01:00
Fupan Li	bd56891f84	Merge pull request #10702 from lifupan/fix_containerdname CI: change the containerd tarball name from cri-containerd-cni to containerd	2025-01-07 18:56:15 +08:00
Fupan Li	b19db40343	CI: change the containerd tarball name to containerd Since https://github.com/containerd/containerd/pull/9096, containerd has removed the cri-containerd-*.tar.gz release bundles, so we'd better change the tarball name to "containerd". BTW, the containerd tarball contains the following files: bin/ bin/containerd-shim bin/ctr bin/containerd-shim-runc-v1 bin/containerd-stress bin/containerd bin/containerd-shim-runc-v2 thus we should untar containerd into the /usr/local directory instead of "/" to stay aligned with cri-containerd. In addition, there's no containerd.service file, runc binary, or cni-plugin included, thus we should add a specific containerd.service file and install the runc binary and cni-plugin specifically. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2025-01-07 17:39:05 +08:00
Fabiano Fidêncio	9aea7456fb	tests: k8s: Skip one of the empty-dir tests An issue has been created for this, and we should fix the issue before the next release. However, for now, let's unblock the kernel bump and have the test skipped. Reference: https://github.com/kata-containers/kata-containers/issues/10706 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fabiano Fidêncio	44ff602c64	tests: k8s: Be more aggressive to get OOM Let's increase the amount of bytes allocated per VM worker, so we can hit the OOM sooner. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2025-01-06 21:48:20 +01:00
Fabiano Fidêncio	f563f0d3fc	versions: Update kernel to v6.12.8 There are lots of configs removed from latest kernel. Update them here for convenience of next kernel upgrade. Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1] Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2] Remove CONFIG_NET_SCH_CBQ [3] Remove CONFIG_AUTOFS4_FS [4] Remove CONFIG_EMBEDDED [5] Remove CONFIG_ARCH_RANDOM & CONFIG_RANDOM_TRUST_CPU [6] [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a [6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2&id=b9b01a5625b5a9e9d96d14d4a813a54e8a124f4b Apart from the removals, CONFIG_CPU_MITIGATIONS is now a dependency for CONFIG_RETPOLINE (which has been renamed to CONFIG_MITIGATION_RETPOLINE) and CONFIG_PAGE_TABLE_ISOLATION (which has been renamed to CONFIG_MITIGATION_PAGE_TABLE_ISOLATION). I've added that to the whitelist because we still build older versions of the kernel that do not have that dependency. Fixes: #8408 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2025-01-06 21:48:20 +01:00
Xuewei Niu	71b14d40f2	Merge pull request #10696 from teawater/kt kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH	2025-01-02 14:04:37 +08:00
Hui Zhu	d15a7baedd	kata-ctl: direct-volume: Auto create KATA_DIRECT_VOLUME_ROOT_PATH Got the following issue: kata-ctl direct-volume add /kubelet/kata-direct-vol-002/directvol002 "{\"device\": \"/home/t4/teawater/coco/t.img\", \"volume-type\": \"directvol\", \"fstype\": \"\", \"metadata\":"{}", \"options\": []}" subsystem: kata-ctl_main Dec 30 09:43:41.150 ERRO Os { code: 2, kind: NotFound, message: "No such file or directory", } The reason is that KATA_DIRECT_VOLUME_ROOT_PATH does not exist. This commit calls create_dir_all on KATA_DIRECT_VOLUME_ROOT_PATH before join_path to handle this issue. Fixes: #10695 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-30 17:55:49 +08:00
Xuewei Niu	6400295940	Merge pull request #10683 from justxuewei/nxw/remove-mut	2024-12-29 00:49:38 +08:00
Fupan Li	2068801b80	Merge pull request #10626 from teawater/ma Add mem-agent to kata	2024-12-24 14:11:36 +08:00
Steve Horsman	2322f6df94	Merge pull request #10686 from stevenhorsman/ppc64le-all-prepare-steps-timeout workflows: Add more ppc64le timeouts	2024-12-20 19:08:48 +00:00
stevenhorsman	9b6fce9e96	workflows: Add more ppc64le timeouts Unsurprisingly, now that we've got past the containerd test hangs on ppc64le, we are hitting others in the "Prepare the self-hosted runner" stage, so add timeouts to all of them to avoid CI blockages. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 17:31:24 +00:00
Steve Horsman	162e2af4f5	Merge pull request #10685 from stevenhorsman/ppc64le-containerd-test-timeout workflows: Add timeout to some ppc64le steps	2024-12-20 16:55:40 +00:00
stevenhorsman	d9d8d53bea	workflows: Add timeout to some ppc64le steps In some runs e.g. https://github.com/kata-containers/kata-containers/actions/runs/12426384186/job/34697095588 and https://github.com/kata-containers/kata-containers/actions/runs/12422958889/job/34697016842 we've seen the Prepare the self-hosted runner and Install dependencies steps get stuck for 5+ hours. If they are working then it should take a few minutes, so let's add timeouts and not hold up the whole CI if they are stuck. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 16:37:36 +00:00
Steve Horsman	99f239bc44	Merge pull request #10380 from stevenhorsman/required-tests-guidance doc: Add required jobs info	2024-12-20 16:24:42 +00:00
stevenhorsman	d1d4bc43a4	static-checks: Add words to dictionary devmapper and snapshotters are being marked as spelling errors, so add them to the kata dictionary Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 14:16:52 +00:00
stevenhorsman	7612839640	doc: Add required jobs info Add information about what required jobs are and our initial guidelines for how jobs are eligible for being made required, or non-required Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-20 14:12:13 +00:00
Xuewei Niu	ecf98e4db8	runtime-rs: Remove unneeded `mut` from `new_hypervisor()` `set_hypervisor_config()` and `set_passfd_listener_port()` acquire inner lock, so that `mut` for `hypervisor` is unneeded. Fixes: #10682 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-20 17:08:10 +08:00
Steve Horsman	2c6126d3ab	Merge pull request #10676 from stevenhorsman/fix-qemu-coco-dev-skip tests: Fix qemu-coc-dev skip	2024-12-20 08:56:54 +00:00
Xuewei Niu	ea60613be9	Merge pull request #9387 from deagon/fix-broken-usage packaging: fix the broken usage help	2024-12-20 15:20:37 +08:00
Guoqiang Ding	75baf75726	packaging: fix the broken usage help Using the plain usage text instead of the bad variable reference. Fixes: #9386 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-12-20 13:58:40 +08:00
stevenhorsman	dd02b6699e	tests: Fix qemu-coc-dev skip Fix the logic to make the test skipped on qemu-coco-dev, rather than the opposite, and update the syntax to make it clearer, as it was incorrectly written and reviewed by three different people in its prior form. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-19 19:50:46 +00:00
Steve Horsman	79495379e2	Merge pull request #10668 from stevenhorsman/update-release-process-post-3.12 doc: Update the release process	2024-12-19 14:16:30 +00:00
Steve Horsman	99b9ef4e5a	Merge pull request #10675 from stevenhorsman/release-repeat-abort release: Abort if release version exists	2024-12-19 11:55:44 +00:00
stevenhorsman	c3f13265e4	doc: Update the release process Add a step to wait for the payload publish to complete before running the release action. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-19 09:52:39 +00:00
Zvonko Kaiser	f2d72874a1	Merge pull request #10620 from kata-containers/topic/fix-remove-artifact-ordering workflows: Remove potential timing issues with artifacts	2024-12-18 13:22:12 -05:00
Zvonko Kaiser	fc2c77f3b6	Merge pull request #10669 from zvonkok/qemu-aarch64-fix qemu: Fix aarch64 build	2024-12-18 08:26:55 -05:00
stevenhorsman	e2669d4acc	release: Abort if release version exists In order to check that we don't accidentally overwrite release artifacts, we should add a check if the release name already exists and bail if it does. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-18 11:04:19 +00:00
Zvonko Kaiser	07d2b00863	qemu: Fix aarch64 build Building static binaries for aarch64 requires disabling PIE. We get a GOT overflow, and the OS libraries are only built with fpic and not with fPIC, which enables unlimited sized GOT tables. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-18 03:26:14 +00:00
Zvonko Kaiser	39bf10875b	Merge pull request #10663 from zvonkok/3.12.0-relase release: Bump version to 3.12.0	2024-12-17 10:00:42 -05:00
Zvonko Kaiser	28b57627bd	release: Bump version to 3.12.0 Bump VERSION and helm-chart versions Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-16 18:41:51 +00:00
Xuewei Niu	02b5fa15ac	Merge pull request #10655 from liubogithub/patch-1 kata-ctl: fix outdated comments	2024-12-16 13:11:25 +08:00
Hyounggyu Choi	cfbc425041	Merge pull request #10660 from BbolroC/fix-leading-zero-issue-for-vfio-ap vfio-ap: Assign default string "0" for empty APID and APQI	2024-12-13 17:40:29 +01:00
Hyounggyu Choi	341e5ca58e	vfio-ap: Assign default string "0" for empty APID and APQI The current script logic assigns an empty string to APID and APQI when APQN consists entirely of zeros (e.g., "00.0000"). However, this behavior is incorrect, as "00" and "0000" are valid values and should be represented as "0". This commit ensures that the script assigns the default string “0” to APID and APQI if their computed values are empty. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-12-13 14:39:03 +01:00
Liu Bo	95fc585103	kata-ctl: fix outdated comments MgmnClient can also tolerate short sandbox id. Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2024-12-12 21:59:54 -08:00
stevenhorsman	cf8b82794a	workflows: Only remove artifacts in release builds The agent-api tests require the agent to be deployed in the CI by the tarball, so in the short term let's only do this on the release stage, so that both kata-manager works with the release and the agent-api tests work with the other CI builds. In the longer term we need to re-evaluate what is in our tarballs (issue #10619), but we want to unblock the tests in the short term. Fixes: #10630 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 17:38:27 +00:00
stevenhorsman	e1f6aca9de	workflows: Remove potential timing issues with artifacts With the code I originally did I think there is potentially a case where we can get a failure due to timing of steps. Before this change the `build-asset-shim-v2` job could start the `get-artifacts` step and concurrently `remove-rootfs-binary-artifacts` could run and delete the artifact during the download and result in the error. In this commit, I try to resolve this by making sure that the shim build waits for the artifact deletes to complete before starting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-12 16:52:54 +00:00
Fabiano Fidêncio	7b0c1d0a8c	Merge pull request #10492 from zvonkok/upgrade-qemu-9.1.0 qemu: Upgrade qemu 9.1.2	2024-12-12 08:15:39 +01:00
Fupan Li	07fe7325c2	Merge pull request #10643 from justxuewei/fix-bind-vol runtime-rs & agent: Fix the issues with bind volumes	2024-12-12 11:34:52 +08:00
Fupan Li	372346baed	Merge pull request #10641 from justxuewei/fix-build-type runtime-rs: Ignore BUILD_TYPE if it is not release	2024-12-12 11:32:49 +08:00
Xuewei Niu	5f1b1d8932	Merge pull request #10638 from justxuewei/fix-stderr-fifo runtime-rs: Fix the issues with stderr fifo	2024-12-12 10:03:46 +08:00
Fabiano Fidêncio	a5c863a907	Merge pull request #10581 from ryansavino/snp-enable-skipped Revert "ci: Skip the failing tests in SNP"	2024-12-11 18:22:17 +01:00
Zvonko Kaiser	cc9ecedaea	qemu: Bump version, new options, add no_patches We want to have the latest QEMU version available which is as of this writing v9.1.2 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> qemu: Add new options for 9.1.2 We need to fence specific options depending on the version and disable ones that are not needed anymore Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> qemu: Add no_patches.txt Since we do not have any patches for this version let's create the appropriate files. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:39 +00:00
Zvonko Kaiser	69ed4bc3b7	qemu: Add depedency The new QEMU build needs python-tomli, now that we bumped Ubuntu we can include the needed tomli package Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:20 +00:00
Zvonko Kaiser	c82db45eaa	qemu: Disable pmem We're disabling pmem support, as it is heavily broken with Ubuntu's static build of QEMU and not needed. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:32:19 +00:00
Zvonko Kaiser	a88174e977	qemu: Replace from source build with package In jammy we have the liburing package available, hence remove the source build and include the package. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	c15f77737a	qemu: Bump Ubuntu version in Dockerfile We need jammy for a new package that is not available in focal Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	eef2795226	qemu: Use proper QEMU builder Do not use hardcoded abs path. Use the deduced rel path. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	e604e51b3d	qemu: Build as user We moved all other artifacts to be built as a user, and QEMU should not be the exception. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Zvonko Kaiser	1d56fd0308	qemu: Remove abs path We want to stick with the other build scripts and only use relative paths. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-11 16:22:54 +00:00
Ryan Savino	7d45382f54	Revert "ci: Skip the failing tests in SNP" This reverts commit `2242aee099`.	2024-12-10 16:20:31 -06:00
Xuewei Niu	3fb91dd631	agent: Fix the issues with bind volumes The mount type should be considered as empty if the value is `Some("none")`. Fixes: #10642 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:51:32 +08:00
Xuewei Niu	59ed19e8b2	runtime-rs: Fix the issues with bind volumes This patch fixes the logic of getting the type of volume: when the type of OCI mount is Some("none") and the options have "bind" or "rbind", the type will be considered as "bind". Fixes: #10642 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:50:36 +08:00
Xuewei Niu	2424c1a562	runtime-rs: Ignore BUILD_TYPE if it is not release This patch fixes that by adding `--release` only if `BUILD_TYPE=release`. Fixes: #10640 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-11 00:27:28 +08:00
Xuewei Niu	b4695f6303	runtime-rs: Fix the issues with stderr fifo When tty is enabled, stderr fifo should never be opened. Fixes: #10637 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-12-10 21:48:52 +08:00
Aurélien Bombo	037281d699	Merge pull request #10593 from microsoft/saulparedes/improve_namespace_validation policy: improve pod namespace validation	2024-12-09 11:55:09 -06:00
Steve Horsman	9b7fb31ce6	Merge pull request #10631 from stevenhorsman/action-lint-workflow Action lint workflow	2024-12-09 09:33:07 +00:00
Fabiano Fidêncio	bec1de7bd7	Merge pull request #10548 from Sumynwa/sumsharma/clh_tweak_vm_configs runtime: Set memory config shared=false when shared_fs=None in CLH.	2024-12-06 23:15:29 +01:00
Sumedh Alok Sharma	ac4f986e3e	runtime: Set memory config shared=false when shared_fs=None in CLH. This commit sets memory config `shared` to false in cloud hypervisor when creating a VM with shared_fs=None && hugePages = false. Currently in runtime/virtcontainers/clh.go, the memory config shared is by default set to true. As per the CLH memory document, (a) shared=true is needed in cases such as when using virtio_fs, since the virtiofs daemon runs as a separate process from clh. (b) for shared_fs=none + hugepages=false, shared=false can be set to use private anonymous memory for guest (with no file backing). (c) Another memory config thp (use transparent huge pages) is always enabled by default. As per documentation, (b) + (c) can be used in combination. However, with the current CLH implementation, the above combination cannot be used since shared=true is always set. Fixes #10547 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-12-06 21:22:51 +05:30
stevenhorsman	b4b3471bcb	workflows: linting: Fix shellcheck SC1001 > This \/ will be a regular '/' in this context Remove ignored escape Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	491210ed22	workflows: linting: Fix shellcheck SC2006 > Use $(...) notation instead of legacy backticks `...` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	5d7c5bdfa4	workflows: linting: Fix shellcheck SC2015 > A && B \|\| C is not if-then-else. C may run when A is true Refactor the echo so that we can't get into a situation where the retry of workspace delete happens if the original one was successful Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	c2ba15c111	workflows: linting: Fix shellcheck SC2206 > Quote to prevent word splitting/globbing Double quote variables expanded in an array Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	007514154c	workflows: linting: Fix shellcheck SC2068 > Double quote array expansions to avoid re-splitting elements Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	4ef05c6176	workflows: linting: Fix shellcheck SC2116 > Useless echo? Instead of 'cmd $(echo foo)', just use 'cmd foo' Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	f02d540799	workflows: Bump outdated action versions Bump some actions that are significantly out-of-date and out of sync with the versions used in other workflows Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	935327b5aa	workflows: linting: Fix shellcheck SC2046 > Quote this to prevent word splitting. Quote around subshell Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	e93ed6c20e	workflows: linting: Add tdx labels The tdx runners got split into two different runners, so we need to update the known self-hosted runner labels Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	d4bd314d52	workflows: linting: Fix incorrect properties These properties are currently invalid, so either fix, or remove them Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	9113606d45	workflows: linting: Fix shellcheck SC2086 > Double quote to prevent globbing and word splitting. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 13:50:12 +00:00
stevenhorsman	42cd2ce6e4	workflows: Add actionlint workflows On PRs that update anything in the workflows directory, add an actionlint run to validate our workflow files for errors and hopefully catch issues earlier. Fixes: #9646 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-06 11:36:08 +00:00
Fabiano Fidêncio	a93ff57c7d	Merge pull request #10627 from kata-containers/topic/release-helm-charm-tarball release: helm: Add the chart as part of the release	2024-12-06 11:22:43 +01:00
Fabiano Fidêncio	300a827d03	release: helm: Add the chart as part of the release So users can simply download the chart and use it accordingly without the need to download the full repo. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-06 11:19:34 +01:00
Fabiano Fidêncio	652662ae09	Merge pull request #10551 from fidencio/topic/kata-deploy-allow-multi-deployment kata-deploy: Add support to multi-installation	2024-12-06 11:16:20 +01:00
Hui Zhu	d3a6bcdaa5	runtime-rs: configuration-dragonball.toml.in: Add config for mem-agent Add config for mem-agent to configuration-dragonball.toml.in. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:28 +08:00
Hui Zhu	2b6caf26e0	agent-ctl: Add mem-agent API support Add sub commands MemAgentMemcgSet and MemAgentCompactSet to agent-ctl to configure the mem-agent inside the running kata-containers. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:24 +08:00
Hui Zhu	cb86d700a6	config: Add config of mem-agent Add config of mem-agent to configure the mem-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:20 +08:00
Hui Zhu	692ded8f96	agent: add support for MemAgentMemcgSet and MemAgentCompactSet Add MemAgentMemcgSet and MemAgentCompactSet to agent API to set the config of mem-agent memcg and compact. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:16 +08:00
Hui Zhu	f84ad54d97	agent: Start mem-agent in start_sandbox mem-agent will run with kata-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:13 +08:00
Hui Zhu	74a17f96f4	protocols/protos/agent.proto: Add mem-agent support Add MemAgentMemcgConfig and MemAgentCompactConfig to AgentService. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:09 +08:00
Hui Zhu	ffc8390a60	agent: Add mem-agent to Cargo.toml Add mem-agent to Cargo.toml of agent. mem-agent will be integrated into kata-agent. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:05 +08:00
Hui Zhu	4407f6e098	mem-agent: Add to src mem-agent is a component designed for managing memory in Linux environments. Sub-feature memcg: Utilizes the MgLRU feature to monitor each cgroup's memory usage and periodically reclaim cold memory. Sub-feature compact: Periodically compacts memory to facilitate the kernel's free page reporting feature, enabling the release of more idle memory from guests. During memory reclamation and compaction, mem-agent monitors system pressure using Pressure Stall Information (PSI). If the system pressure becomes too high, memory reclamation or compaction will automatically stop. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 10:00:02 +08:00
Hui Zhu	f9c63d20a4	kernel/configs: Add mglru, debugfs and psi to dragonball-experimental Add mglru, debugfs and psi to dragonball-experimental/mem_agent.conf to support mem_agent function. Fixes: #10625 Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-12-06 09:59:59 +08:00
Fabiano Fidêncio	111082db07	kata-deploy: Add support to multi-installation This is super useful for development / debugging scenarios, mainly when dealing with limited hardware availability, as this change allows multiple people to develop into one single machine, while still using kata-deploy. Fixes: #10546 Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Fabiano Fidêncio	0033a0c23a	kata-deploy: Adjust paths for qemu-coco-dev as well I missed that when working on the INSTALL_PREFIX feature, so adding it now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Fabiano Fidêncio	62b3a07e2f	kata-deploy: helm: Add overlooked INSTALLATION_PREFIX env var At the same time that INSTALLATION_PREFIX was added, I was working on the helm changes to properly do the cleanup / deletion when it's removed. However, I missed adding the INSTALLATION_PREFIX env var there, which I'm doing now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-12-05 17:42:53 +01:00
Steve Horsman	5d96734831	Merge pull request #10572 from ldoktor/gk-stalled-results ci.gatekeeper: Update existing results	2024-12-04 19:02:14 +00:00
Wainer Moschetta	a94982d8b8	Merge pull request #10617 from stevenhorsman/skip-k8s-job-test-on-non-tee tests: Skip k8s job test on qemu-coco-dev	2024-12-04 15:47:33 -03:00
Saul Paredes	84a411dac4	policy: improve pod namespace validation - Remove default_namespace from settings - Ensure container namespaces in a pod match each other in case no namespace is specified in the YAML Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-04 10:17:54 -08:00
Steve Horsman	c86f76d324	Merge pull request #10588 from stevenhorsman/metrics-clh-min-range-relaxation metrics: Increase minval range for failing tests	2024-12-04 16:10:26 +00:00
stevenhorsman	a8ccd9a2ac	tests: Skip k8s job test on qemu-coco-dev The test is unstable on this platform, so skip it for now to prevent the regular known failures covering up other issues. See #10616 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 16:00:05 +00:00
Steve Horsman	9e609dd34f	Merge pull request #10615 from kata-containers/topic/update-remove-artifact-filter workflows: Fix remove artifact name filter	2024-12-04 15:02:35 +00:00
Fabiano Fidêncio	531a29137e	Merge pull request #10607 from microsoft/danmihai1/less-logging runtime: skip logging some of the dial errors	2024-12-04 15:01:45 +01:00
stevenhorsman	14a3adf4d6	workflows: Fix remove artifact name filter - Fix copy-paste errors in artifact filters for arm64 and ppc64le - Remove the trailing wildcard filter that falsely ends up removing agent-ctl and replace with the tarball-suffix, which should exactly match the artifacts Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-12-04 13:34:42 +00:00
Alex Lyn	5f9cc86b5a	Merge pull request #10604 from 3u13r/euler/fix/genpolicy-rego-state-getter genpolicy: align state path getter and setter	2024-12-04 13:57:34 +08:00
Alex Lyn	c7064027f4	Merge pull request #10574 from BbolroC/add-ccw-subchannel-qemu-runtime-rs Add subchannel support to qemu-runtime-rs for s390x	2024-12-04 09:17:45 +08:00
Aurélien Bombo	57d893b5dc	Merge pull request #10563 from sprt/csi-deploy coco: ci: Fully implement compilation of CSI driver and require it for CoCo tests [2/x]	2024-12-03 18:58:14 -06:00
Aurélien Bombo	4aa7d4e358	ci: Require CSI driver for CoCo tests With the building/publishing step for the CSI driver validated, we can set that as a requirement for the CoCo tests. Depends on: #10561 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	fe55b29ef0	csi-kata-directvolume: Remove go version check The driver build recipe has a script to check the current Go version against the go.mod version. However, the script is broken ($expected is unbound) and I don't believe we do this for other components. On top of this, Go should be backward-compatible. Let's keep things simple for now and we can evaluate restoring this script in the future if need be. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	fb87bf221f	ci: Implement build step for CSI driver This fully implements the compilation step for csi-kata-directvolume. This component can now be built by the CI running: $ cd tools/packaging/kata-deploy/local-build $ make csi-kata-directvolume-tarball A couple notes: * When installing the binary, we rename it from directvolplugin to csi-kata-directvolume on the fly to make it more readable. * We add go to the tools builder Dockerfile to support building this tool. * I've noticed the file install_libseccomp.sh gets created by the build process so I've added it to a .gitignore. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 14:43:36 -06:00
Aurélien Bombo	0f6113a743	Merge pull request #10612 from kata-containers/sprt/fix-csi-publish2 ci: Fix Docker publishing for CSI driver, 2nd try	2024-12-03 14:43:28 -06:00
Aurélien Bombo	a23ceac913	ci: Fix Docker publishing for CSI driver, 2nd try Follow-up to #10609 as it seems GHA doesn't allow hard links: https://github.com/kata-containers/kata-containers/actions/runs/12144941404/job/33868901896?pr=10563#step:6:8 Note that I also updated the `needs` directive as we don't need the Kata payload container, just the tarball artifact. Part of: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-03 13:04:46 -06:00
Dan Mihai	2a67038836	Merge pull request #10608 from microsoft/saulparedes/policy_metadatata_uid policy: ignore optional metadata uid field	2024-12-03 10:19:12 -08:00
Dan Mihai	25e6f4b2a5	Merge pull request #10592 from microsoft/saulparedes/add_constants_to_rules policy: add constants to rules.rego	2024-12-03 10:17:10 -08:00
Aurélien Bombo	5e1fc5a63f	Merge pull request #10609 from kata-containers/sprt/fix-publish-csi ci: Fix Docker publishing for CSI driver	2024-12-03 11:21:55 -06:00
Hyounggyu Choi	8b998e5f0c	runtime-rs: Introduce get_devno_ccw() for deduplication The devno assignment logic is repeated in 5 different places during device addition. To improve code maintainability and readability, this commit introduces a standalone function, `get_devno_ccw()`, to handle the deduplication. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-12-03 15:35:03 +01:00
Leonard Cohnen	9b614a4615	genpolicy: align state path getter and setter Before this patch there was a mismatch between the JSON path under which the state of the rule evaluation is set in comparison to under which it is retrieved. This resulted in the behavior that each time the policy was evaluated, it thought it was the _first_ time the policy was evaluated. This also means that the consistency check for the `sandbox_name` was ineffective. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-12-03 13:25:24 +01:00
Aurélien Bombo	85d3bcd713	ci: Fix Docker publishing for CSI driver The compilation succeeds, however Docker can't find the binary because we specify an absolute path. In Docker world, an absolute path is absolute to the Docker build context (here: src/tools/csi-kata-directvolume). To fix this, we link the binary into the build context, where the Dockerfile expects it. Failure mode: https://github.com/kata-containers/kata-containers/actions/runs/12068202642/job/33693101962?pr=10563#step:8:213 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-12-02 15:50:01 -06:00
Saul Paredes	711d12e5db	policy: support optional metadata uid field This prevents a deserialization error when uid is specified Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-02 11:24:58 -08:00
Dan Mihai	efd492d562	runtime: skip logging some of the dial errors With full debug logging enabled there might be around 1,500 redials so log just ~15 of these redials to avoid flooding the log. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-12-02 19:11:32 +00:00
Hyounggyu Choi	9c19d7674a	Merge pull request #10590 from zvonkok/fix-ci ci: Fix variant for confidential targets	2024-12-02 18:39:52 +01:00
Saul Paredes	9105c1fa0c	policy: add constants to rules.rego Reuse constants where applicable Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-12-02 08:28:58 -08:00
Hyounggyu Choi	6f4f94a9f0	Merge pull request #10595 from BbolroC/add-zvsi-devmapper-to-gatekeeper-required-jobs gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs	2024-12-02 15:28:14 +01:00
Zvonko Kaiser	20442c0eae	ci: Fix variant for confidential targets The default initrd confidential target will have a variant=confidential; we need to accommodate this and make sure we also accommodate aaa-xxx-confidential targets. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-12-02 14:21:03 +00:00
stevenhorsman	b87b4b6756	metrics: Increase ranges range for qemu failing tests We've also seen the qemu metrics tests are failing due to the results being slightly outside the max range for network-iperf3 parallel and minimum for network-iperf3 jitter tests on PRs that have no code changes, so we've increased the bounds to not see false negatives. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:52:16 +00:00
stevenhorsman	4011071526	metrics: Increase minval range for failing tests We've seen a couple of instances recently where the metrics tests are failing due to the results being below the minimum value by ~2%. For tests like latency I'm not sure why values being too low would be an issue, but I've updated the minpercent range of the failing tests to try and get them passing. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-29 10:50:02 +00:00
Hyounggyu Choi	de3452f8e1	gatekeeper: add run-k8s-tests-on-zvsi(devmapper) to required jobs As the following CI job has been marked as required: - kata-containers-ci-on-push / run-k8s-tests-on-zvsi / run-k8s-tests (devmapper, qemu, kubeadm) we need to add it to the gatekeeper's required job list. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-28 12:46:47 +01:00
Fabiano Fidêncio	bdf10e651a	Merge pull request #10597 from kata-containers/topic/unbreak-ci-3rd-time-s-a-charm Unbreak the CI, 3rd attempt	2024-11-28 12:36:09 +01:00
Fabiano Fidêncio	92b8091f62	Revert "ci: unbreak: Reallow no-op builds" This reverts commit `559018554b`. As we've noticed that this is causing issues with initrd builds in the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-28 12:02:40 +01:00
Fabiano Fidêncio	ca2098f828	build: Allow dummy builds (for when adding a new target) This will help us to simply allow a new dummy build whenever a new component is added. As long as the format `$(call DUMMY,$@)` is followed, we should be good to go without taking the risk of breaking the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-28 11:13:24 +01:00
Fabiano Fidêncio	f9930971a2	Merge pull request #10594 from sprt/sprt/unbreak-ci-noop-build ci: unbreak: Reallow no-op builds	2024-11-28 07:38:25 +01:00
Aurélien Bombo	559018554b	ci: unbreak: Reallow no-op builds #9838 previously modified the static build so as not to repeatedly copy the same assets on each matrix iteration: https://github.com/kata-containers/kata-containers/pull/9838#issuecomment-2169299202 However, that implementation breaks specifying no-op/WIP build targets such as done in `e43c59a`. Such no-op builds have been a historical requirement of the project because of a GHA limitation. The breakage is due to no-op builds not generating a tar file corresponding to the asset: https://github.com/kata-containers/kata-containers/actions/runs/12059743390/job/33628926474?pr=10592 To address this breakage, we revert to the `cp -r` implementation and add the `--no-clobber` flag to still preserve the current behavior. Note that `-r` will also create the destination directory if it doesn't exist. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 18:40:29 -06:00
Fabiano Fidêncio	9699c7ed06	Merge pull request #10589 from kata-containers/sprt/fix-csi-publish gha: Unbreak CI and work around workflow limit	2024-11-27 23:52:55 +01:00
Aurélien Bombo	eac197d3b7	Merge pull request #10564 from microsoft/danmihai1/clh-endpoint-type runtime: clh: addNet() logging clean-up	2024-11-27 14:44:14 -06:00
Aurélien Bombo	7f659f3d63	gha: Unbreak CI and work around workflow limit #10561 inadvertently broke the CI by going over the limit of 20 reusable workflows: https://github.com/kata-containers/kata-containers/actions/runs/12054648658/workflow This commit fixes that by inlining the job. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-27 12:23:15 -06:00
Aurélien Bombo	16a91fccbe	Merge pull request #10561 from sprt/csi-driver-ci coco: ci: Lay groundwork for compiling and publishing CSI driver image [1/x]	2024-11-27 10:26:45 -06:00
Fabiano Fidêncio	175fe8bc66	Merge pull request #10585 from fidencio/topic/kata-deploy-use-drop-in-containerd-config-whenever-it-is-possible kata-deploy: Use drop-in files whenever it's possible	2024-11-27 16:36:18 +01:00
Steve Horsman	6bb00d9a1d	Merge pull request #10583 from squarti/agent-startup-cdh-client agent: fix startup when guest_components_procs is set to none	2024-11-27 11:43:07 +00:00
Fabiano Fidêncio	500508a592	kata-deploy: Use drop-in files whenever it's possible This will make our lives considerably easier when it comes to cleaning up content added, while it's also a groundwork needed for having multiple installations running in parallel. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-27 12:27:08 +01:00
Steve Horsman	3240f8a4b8	Merge pull request #10586 from stevenhorsman/delete-rootfs-binary-assets-after-rootfs-build workflows: Remove rootfs binary artifacts	2024-11-27 10:03:20 +00:00
Fabiano Fidêncio	c472fe1924	Merge pull request #10584 from fidencio/topic/kata-deploy-prepare-for-containerd-config-version-3 kata-deploy: Support containerd configuration version 3	2024-11-26 18:44:56 +01:00
stevenhorsman	3e5d360185	workflows: Remove rootfs binary artifacts We need to publish certain artefacts for the rootfs, like the agent, guest-components, pause bundle etc., as they are consumed in the `build-asset-rootfs` step. However, after this point they aren't needed and probably shouldn't be included in the overall kata tarball, so delete them once they aren't needed any more to avoid them being included. Fixes: #10575 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-26 15:24:20 +00:00
Fabiano Fidêncio	6f70ab9169	kata-deploy: Adapt how the containerd version is checked for k0s Let's actually mount the whole /etc/k0s as /etc/containerd, so we can easily access the containerd configuration file which has the version in it, allowing us to parse it instead of just making a guess based on kubernetes distro being used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-26 16:15:11 +01:00
Silenio Quarti	1230bc77f2	agent: fix startup when guest_components_procs is set to none This PR ensures that OCICRYPT_CONFIG_PATH file is initialized only when CDH socket exists. This prevents startup error if attestation binaries are not installed in PodVM. Fixes: https://github.com/kata-containers/kata-containers/issues/10568 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-26 09:57:04 -05:00
Fabiano Fidêncio	f5a9aaa100	kata-deploy: Support containerd config version 3 On Ubuntu 24.04, with the distro default containerd, we're already getting: ``` $ containerd config default | grep "version = " version = 3 ``` With that in mind, let's make sure that we're ready to support this from the next release. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-26 14:01:50 +01:00
Fupan Li	28166c8a32	Merge pull request #10577 from Apokleos/fix-vfiodev-name runtime-rs: fix vfio device name combination issue	2024-11-26 09:35:45 +08:00
Dan Mihai	d93900c128	Merge pull request #10543 from microsoft/danmihai1/regorus-warning genpolicy: avoid regorus warning	2024-11-25 16:47:33 -08:00
Zvonko Kaiser	1b10e82559	Merge pull request #10516 from zvonkok/kata-agent-cdi ci: Fix error on self-hosted machines	2024-11-25 18:49:37 -05:00
Ryan Savino	e46d24184a	Merge pull request #10386 from kimullaa/fix-build-error-when-using-sev-snp docs: Fix several build failures when I tried the procedures in "Kata Containers with AMD SEV-SNP VMs"	2024-11-25 16:58:52 -06:00
Dan Mihai	f340b31c41	genpolicy: avoid regorus warning Avoid adding to the Guest console warnings about "agent_policy:10:8". "import input" is unnecessary. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-11-25 21:19:01 +00:00
Zvonko Kaiser	c3d1b3c5e3	Merge pull request #10464 from zvonkok/nvidia-gpu-rootfs gpu: NVIDIA GPU initrd/image build	2024-11-25 16:16:42 -05:00
Fabiano Fidêncio	8763a9bc90	Merge pull request #10520 from fidencio/topic/drop-clear-linux-rootfs osbuilder: Drop Clear Linux	2024-11-25 21:16:03 +01:00
Dan Mihai	78cbf33f1d	runtime: clh: addNet() logging clean-up Avoid logging the same endpoint fields twice from addNet(). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-11-25 19:58:54 +00:00
alex.lyn	5dba680afb	runtime-rs: fix vfio device name combination issue Fixes #10576 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-11-25 14:01:43 +08:00
Hyounggyu Choi	48e2df53f7	runtime-rs: Add devno to DeviceVirtioScsi A new attribute named `devno` is added to DeviceVirtioScsi. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	2cc48f7822	runtime-rs: Add devno to DeviceVhostUserFs A new attribute named `devno` is added to DeviceVhostUserFs. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	920484918c	runtime-rs: Add devno to VhostVsock A new attribute named `devno` is added to VhostVsock. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	9486790089	runtime-rs: Add devno to DeviceVirtioSerial A new attribute named `devno` is added to DeviceVirtioSerial. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	516daecc50	runtime-rs: Add devno to DeviceVirtioBlk A new attribute named `devno` is added to DeviceVirtioBlk. It will be used to specify a device number for a CCW bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Hyounggyu Choi	30a64092a7	runtime-rs: Add CcwSubChannel to provide devno for CCW devices To explicitly specify a device number on the QEMU command line for the following devices using the CCW transport on s390x: - SerialDevice - BlockDevice - VhostUserDevice - SCSIController - VSOCKDevice this commit introduces a new structure CcwSubChannel and implements the following methods: - add_device() - remove_device() - address_format_ccw() - set_addr() You can see the detailed explanation for each method in the comment. This resolves the 1st part of #10573. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-11-23 13:45:36 +01:00
Steve Horsman	322073bea1	Merge pull request #10447 from ldoktor/required-jobs ci: Required jobs	2024-11-22 09:15:11 +00:00
Lukáš Doktor	e69635b376	ci.gatekeeper: Remove unused variable this is a left-over from previous way of iterating over jobs. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:11 +01:00
Lukáš Doktor	fa7bca4179	ci.gatekeeper: Print the older job id let's also print the existing result's id when printing the information about ignoring older result id to simplify debugging. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:11 +01:00
Lukáš Doktor	6c19a067a0	ci.gatekeeper: Update existing results the matching run_id means we're dealing with the same job but with updated results and not with an older job. Update the results in such case. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-22 09:27:09 +01:00
Aurélien Bombo	5e4990bcf5	coco: ci: Add no-op steps to deploy CSI driver This adds no-op steps that'll be used to deploy and clean up the CSI driver used for testing. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:08:06 -06:00
Aurélien Bombo	893f6a4ca0	ci: Introduce job to publish CSI driver image This adds a new job to build and publish the CSI driver Docker image. Of course this job will fail after we merge this PR because the CSI driver compilation job hasn't been implemented yet. However that will be implemented directly after in #10561. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:07:59 -06:00
Aurélien Bombo	e43c59a2c6	ci: Add no-op step to compile CSI driver This adds a no-op build step to compile the CSI driver. The actual compilation will be implemented in an ulterior PR, so as to ensure we don't break the CI. Addresses: #10560 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-21 16:06:55 -06:00
Zvonko Kaiser	0debf77770	gpu: NVIDIA gpu initrd/image build With each release make sure we ship a GPU enabled rootfs/initrd Fixes: #6554 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-21 18:57:23 +00:00
Steve Horsman	b4da4b5e3b	Merge pull request #10377 from coolljt0725/fix_build osbuilder: Fix build dependency of ubuntu rootfs with Docker	2024-11-21 08:45:59 +00:00
Jitang Lei	ed4c727c12	osbuilder: Fix build dependency of ubuntu rootfs with Docker Build ubuntu rootfs with Docker failed with error: `Unable to find libclang` Fix this error by adding libclang-dev to the dependency. Signed-off-by: Jitang Lei <leijitang@outlook.com>	2024-11-21 10:49:27 +08:00
Zvonko Kaiser	e9f36f8187	ci: Fixing simple typo change evn to env Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-20 18:40:14 +00:00
Zvonko Kaiser	a5733877a4	ci: Fix error on self-hosted machines We need to clean-up any created files/dirs otherwise we cause problems on self-hosted runners. Using tempdir which will be removed automatically. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-20 18:40:13 +00:00
Lukáš Doktor	62e8815a5a	ci: Add documentation to cover mapping format to help people with adding new entries. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-20 17:25:59 +01:00
Lukáš Doktor	64306dc888	ci: Set required-tests according to GH required tests this should record the current list of required tests from GH. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-20 17:25:57 +01:00
Steve Horsman	358ebf5134	Merge pull request #10558 from AdithyaKrishnan/main ci: Re-enable SNP CI	2024-11-20 10:27:41 +00:00
Steve Horsman	30bad4ee43	Merge pull request #10562 from stevenhorsman/remove-release-artifactor-skips workflows: Remove skipping of artifact uploads	2024-11-20 08:45:37 +00:00
Adithya Krishnan Kannan	2242aee099	ci: Skip the failing tests in SNP Per [Issue#10549](https://github.com/kata-containers/kata-containers/issues/10549), the following tests are failing on SNP. 1. k8s-guest-pull-image-encrypted.bats 2. k8s-guest-pull-image-authenticated.bats 3. k8s-guest-pull-image-signature.bats 4. k8s-confidential-attestation.bats Per @fidencio 's comment on [PR#10558](https://github.com/kata-containers/kata-containers/pull/10558), I am skipping the same. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-19 10:41:43 -06:00
stevenhorsman	da5f6b77c7	workflows: Remove skipping of artifact uploads Now we are downloading artifacts to create the rootfs we need to ensure they are uploaded always, even on releases Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-19 13:28:02 +00:00
Steve Horsman	817438d1f6	Merge pull request #10552 from stevenhorsman/3.11.0-release release: Bump version to 3.11.0	2024-11-19 09:44:35 +00:00
Saul Paredes	eab48c9884	Merge pull request #10545 from microsoft/cameronbaird/sync-clh-logging runtime: fix comment to accurately reflect clh behavior	2024-11-18 11:25:58 -08:00
Adithya Krishnan Kannan	ef367d81f2	ci: Re-enable SNP CI We've debugged the SNP Node and we wish to test the fixes on GHA. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-18 11:11:27 -06:00
stevenhorsman	7a8ba14959	release: Bump version to 3.11.0 Bump `VERSION` and helm-chart versions Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-18 11:13:15 +00:00
Steve Horsman	0ce3f5fc6f	Merge pull request #10514 from squarti/pause_command agent: overwrite OCI process spec when overwriting pause image	2024-11-15 18:03:58 +00:00
Fabiano Fidêncio	92f7526550	Merge pull request #10542 from Crypt0s/topic/enable-CONFIG_KEYS kernel: add CONFIG_KEYS=y to enable kernel keyring	2024-11-15 12:15:25 +01:00
Crypt0s	563a6887e2	kernel: add CONFIG_KEYS=y to enable kernel keyring KinD checks for the presence of this (and other) kernel configuration via scripts like https://blog.hypriot.com/post/verify-kernel-container-compatibility/ or attempts to directly use /proc/sys/kernel/keys/ without checking to see if it exists, causing an exit when it does not see it. Docker/its consumers apparently expect to be able to use the kernel keyring and its associated syscalls from/for containers. There aren't any known downsides to enabling this except that it would by definition enable additional syscalls defined in https://man7.org/linux/man-pages/man7/keyrings.7.html which are reachable from userspace. This minimally increases the attack surface of the Kata Kernel, but this attack surface is minimal (especially since the kernel is most likely being executed by some kind of hypervisor) and highly restricted compared to the utility of enabling this feature to get further containerization compatibility. Signed-off-by: Crypt0s <BryanHalf@gmail.com>	2024-11-15 09:30:06 +01:00
Shunsuke Kimura	706e8bce89	docs: change from OVMF.fd to AmdSev.fd change the build method to generate OVMF for AmdSev. This commit adds `ovmf_build=sev` env parameter. <`638c2c4164`> Fixes #10378 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2024-11-15 11:24:45 +09:00
Shunsuke Kimura	d7f6fabe65	docs: fix build-kernel.sh option `build-kernel.sh` no longer takes an argument for the -x option. <`6c3338271b`> Fixes #10378 Signed-off-by: Shunsuke Kimura <pbrehpuum@gmail.com>	2024-11-15 11:24:45 +09:00
Cameron Baird	65881ceb8a	runtime: fix comment to accurately reflect clh behavior Fix the CLH log levels description Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>	2024-11-14 23:16:11 +00:00
Silenio Quarti	42b6203493	agent: overwrite OCI process spec when overwriting pause image The PR replaces the OCI process spec of the pause container with the spec of the guest provided pause bundle. Fixes: https://github.com/kata-containers/kata-containers/issues/10537 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-14 13:05:16 -05:00
Fabiano Fidêncio	6a9266124b	Merge pull request #10501 from kata-containers/topic/ci-split-tests ci: tdx: Split jobs to run in 2 different machines	2024-11-14 17:24:50 +01:00
Fabiano Fidêncio	9b3fe0c747	ci: tdx: Adjust workflows to use different machines This will be helpful in order to increase the OS coverage (we'll be using both Ubuntu 24.04 and CentOS 9 Stream), while also reducing the amount spent on the tests (as one machine will only run attestation related tests, and the other the tests that do not require attestation). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-14 15:52:00 +01:00
Fabiano Fidêncio	9b1a5f2ac2	tests: Add a way to run only tests which rely on attestation We're doing this as, at Intel, we have two different kinds of machines we can plug into our CI. Without going much into details, only one of those two kinds of machines will work for the attestation tests we perform with ITA, thus in order to speed up the CI and improve test coverage (OS wise), we're going to run different tests in different machines. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-14 15:51:57 +01:00
Steve Horsman	915695f5ef	Merge pull request #9407 from mrIncompetent/root-fs-clang rootfs: Install missing clang in Ubuntu docker image	2024-11-14 10:35:06 +00:00
Henrik Schmidt	57a4dbedeb	rootfs: Install missing libclang-dev in Ubuntu docker image Fixes #9444 Signed-off-by: Henrik Schmidt <mrIncompetent@users.noreply.github.com>	2024-11-14 08:48:24 +00:00
Hyounggyu Choi	5869046d04	Merge pull request #9195 from UiPath/fix/vcpus-for-static-mgmt runtime: Set maxvcpus equal to vcpus for the static resources case	2024-11-14 09:38:20 +01:00
Dan Mihai	d9977b3e75	Merge pull request #10431 from microsoft/saulparedes/add-policy-state genpolicy: add state to policy	2024-11-13 11:48:46 -08:00
Aurélien Bombo	7bc2fe90f9	Merge pull request #10521 from ncppd/osbuilder-cleanup osbuilder: remove redundant env variable	2024-11-13 12:17:09 -06:00
Steve Horsman	a947d2bc40	Merge pull request #10539 from AdithyaKrishnan/main ci: Temporarily skip SNP CI	2024-11-13 17:58:32 +00:00
Adithya Krishnan Kannan	439a1336b5	ci: Temporarily skip SNP CI As discussed in the CI working group, we are temporarily skipping the SNP CI to unblock the remaining workflow. Will revert after fixing the SNP runner. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-11-13 11:44:16 -06:00
Fabiano Fidêncio	02d4c3efbf	Merge pull request #10519 from fidencio/topic/relax-restriction-for-qemu-tdx Reapply "runtime: confidential: Do not set the max_vcpu to cpu"	2024-11-13 16:09:06 +01:00
Saul Paredes	c207312260	genpolicy: validate container sandbox names Make sure all container sandbox names match the sandbox name of the first container. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-12 15:17:01 -08:00
Saul Paredes	52d1aea1f7	genpolicy: Add state Use regorus engine's add_data method to add state to the policy. This data can later be accessed inside rego context through the data namespace. Support state modifications (json-patches) that may be returned as a result from policy evaluation. Also initialize a policy engine data slice "pstate" dedicated for storing state. Fixes #10087 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-12 15:16:53 -08:00
Alexandru Matei	e83f8f8a04	runtime: Set maxvcpus equal to vcpus for the static resources case Fixes: #9194 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-11-12 16:36:42 +02:00
GabyCT	06fe459e52	Merge pull request #10508 from GabyCT/topic/installartsta gha: Get artifacts when installing kata tools in stability workflow	2024-11-11 15:59:06 -06:00
Nikos Ch. Papadopoulos	ab80cf8f48	osbuilder: remove redundant env variable Remove second declaration of GO_HOME in rootfs-build ubuntu script. Signed-off-by: Nikos Ch. Papadopoulos <ncpapad@cslab.ece.ntua.gr>	2024-11-11 19:49:28 +02:00
Fabiano Fidêncio	780b36f477	osbuilder: Drop Clear Linux The Clear Linux rootfs is not being tested anywhere, and it seems Intel doesn't have the capacity to review the PRs related to this (combined with the lack of interest from the rest of the community in reviewing PRs that are specific to this untested rootfs). With this in mind, I'm suggesting we drop Clear Linux support and focus on what we can actually maintain. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-11 15:22:55 +01:00
Fabiano Fidêncio	5618180e63	Merge pull request #10515 from kata-containers/sprt/ubuntu-latest-fix gha: Hardcode ubuntu-22.04 instead of latest	2024-11-10 09:54:39 +01:00
Fabiano Fidêncio	2281342fb8	Merge pull request #10513 from fidencio/topic/ci-adjust-proxy-nightmare-for-tdx ci: tdx: kbs: Ensure https_proxy is taken in consideration	2024-11-10 00:17:10 +01:00
Fabiano Fidêncio	0d8c4ce251	Merge pull request #10517 from microsoft/saulparedes/remove_manifest_v1_test tests: remove manifest v1 test	2024-11-09 23:40:51 +01:00
Fabiano Fidêncio	56812c852f	Reapply "runtime: confidential: Do not set the max_vcpu to cpu" This reverts commit `f15e16b692`, as we don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-09 23:20:17 +01:00
Saul Paredes	461efc0dd5	tests: remove manifest v1 test This test was meant to show support for pulling images with v1 manifest schema versions. The nginxhttps image has been modified in https://hub.docker.com/r/ymqytw/nginxhttps/tags such that we are no longer able to pull it: $ docker pull ymqytw/nginxhttps:1.5 Error response from daemon: missing signature key We may remove this test since schema version 1 manifests are deprecated per https://docs.docker.com/engine/deprecated/#pushing-and-pulling-with-image-manifest-v2-schema-1 : "These legacy formats should no longer be used, and users are recommended to update images to use current formats, or to upgrade to more current images". This schema version was used by old docker versions. Further OCI spec https://github.com/opencontainers/image-spec/blob/main/manifest.md#image-manifest-property-descriptions only supports schema version 2. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-11-08 13:38:51 -08:00
Aurélien Bombo	19e972151f	gha: Hardcode ubuntu-22.04 instead of latest GHA is migrating ubuntu-latest to Ubuntu 24 so let's hardcode the current 22.04 LTS. https://github.blog/changelog/2024-11-05-notice-of-breaking-changes-for-github-actions/ Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-11-08 11:00:15 -06:00
Greg Kurz	2bd8fde44a	Merge pull request #10511 from ldoktor/fedora-python ci.ocp: Use the official python:3 container for sanity	2024-11-08 16:31:40 +01:00
Fabiano Fidêncio	baf88bb72d	ci: tdx: kbs: Ensure https_proxy is taken in consideration Trustee's deployment must set the correct https_proxy as env var on the container that will talk to the ITA / ITTS server, otherwise the kbs service won't be able to start, causing then issues in our CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Krzysztof Sandowicz <krzysztof.sandowicz@intel.com>	2024-11-08 16:06:16 +01:00
Steve Horsman	1f728eb906	Merge pull request #10498 from stevenhorsman/update-create-container-timeout-log tests: k8s: Update image pull timeout error	2024-11-08 10:47:39 +00:00
Steve Horsman	6112bf85c3	Merge pull request #10506 from stevenhorsman/skip-runk-ci workflow: Remove/skip runk CI	2024-11-08 09:54:06 +00:00
Steve Horsman	a5acbc9e80	Merge pull request #10505 from stevenhorsman/remove-stratovirt-metrics-tests metrics: Skip metrics on stratovirt	2024-11-08 08:53:05 +00:00
Lukáš Doktor	2f7d34417a	ci.ocp: Use the official python:3 container for sanity Fedora F40 removed python3 from the base container, to avoid such issues let's rely on the latest and greatest official python container. Fixes: #10497 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-11-08 07:16:30 +01:00
Zvonko Kaiser	183bd2aeed	Merge pull request #9584 from zvonkok/kata-agent-cdi kata-agent: Add CDI support	2024-11-07 14:18:32 -05:00
Zvonko Kaiser	aa2e1a57bd	agent: Added test-case for handle_cdi_devices We are generating a simple CDI spec with device and global containerEdits to test the CDI crate. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-07 17:03:18 +00:00
Gabriela Cervantes	4274198664	gha: Get artifacts when installing kata tools in stability workflow This PR adds the get artifacts step, which is needed when installing kata tools in the stability workflow, to avoid failures saying that artifacts are missing. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-07 16:20:41 +00:00
stevenhorsman	a5f1a5a0ee	workflow: Remove/skip runk CI As discussed in the AC meeting, we don't have a maintainer (or users?) of runk, and the CI is unstable, so given we can't support it, we shouldn't waste CI cycles on it. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-07 14:16:30 +00:00
stevenhorsman	0efe9f4e76	metrics: Skip metrics on stratovirt As discussed on the AC call, we are lacking maintainers for the metrics tests. As a starting point for potentially phasing them out, we discussed starting with removing the test for stratovirt as a non-core hypervisor and a job that is problematic in leaving behind resources that need cleaning up. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-07 14:06:57 +00:00
Fabiano Fidêncio	c332e953f9	Merge pull request #10500 from squarti/fix-10499 runtime: Files are not synced between host and guest VMs	2024-11-07 08:28:53 +01:00
Silenio Quarti	be3ea2675c	runtime: Files are not synced between host and guest VMs This PR makes the root dir absolute after resolving the default root dir symlink. Fixes: https://github.com/kata-containers/kata-containers/issues/10499 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-11-06 17:31:12 -05:00
GabyCT	47cea6f3c6	Merge pull request #10493 from GabyCT/topic/katatoolsta gha: Add install kata tools as part of the stability workflow	2024-11-06 14:16:48 -06:00
Gabriela Cervantes	13e27331ef	gha: Add install kata tools as part of the stability workflow This PR adds the install kata tools step as part of the k8s stability workflow, to avoid failures saying that certain kata components are not installed. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-06 20:07:06 +00:00
Fabiano Fidêncio	71c4c2a514	Merge pull request #10486 from kata-containers/topic/enable-AUTO_GENERATE_POLICY-for-qemu-coco-dev workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev	2024-11-06 21:04:45 +01:00
Zvonko Kaiser	3995fe71f9	kata-agent: Add CDI support For proper device handling add CDI support Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-06 17:50:20 +00:00
stevenhorsman	85554257f8	tests: k8s: Update image pull timeout error Currently the error we are checking for is `CreateContainerRequest timed out`, but this message doesn't always seem to be printed to our pod log. Try using a more general message that should be present more reliably. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-06 17:00:26 +00:00
Fabiano Fidêncio	a3c72e59b1	Merge pull request #10495 from littlejawa/ci/skip_nginx_connectivity_for_crio ci: skip nginx connectivity test with qemu/crio	2024-11-06 13:43:19 +01:00
Julien Ropé	da5e0c3f53	ci: skip nginx connectivity test with crio We have an error with service name resolution with this test when using crio. This error could not be reproduced outside of the CI for now. Skipping it to keep the CI job running until we find a solution. See: #10414 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 12:07:02 +01:00
Greg Kurz	5af614b1a4	Merge pull request #10496 from littlejawa/ci/expose_container_runtime ci: export CONTAINER_RUNTIME to the test scripts	2024-11-06 12:05:36 +01:00
Julien Ropé	6d0cb1e9a8	ci: export CONTAINER_RUNTIME to the test scripts This variable will allow tests to adapt their behaviour to the runtime (containerd/crio). Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-11-06 11:29:11 +01:00
Fabiano Fidêncio	72979d7f30	workflows: Use AUTO_GENERATE_POLICY for qemu-coco-dev By the moment we're testing it also with qemu-coco-dev, it becomes easier for a developer without access to TEE to also test it locally. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-06 10:47:08 +01:00
Fabiano Fidêncio	7d3f2f7200	runtime: Match TEEs for the static_sandbox_resource_mgmt option The qemu-coco-dev runtime class should be as close as possible to what the TEEs runtime classes are doing, and this was one of the options that ended up overlooked till now. Shout out to Dan Mihai for noticing that! Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-06 10:47:08 +01:00
Fabiano Fidêncio	ea8114833c	Merge pull request #10491 from fidencio/topic/fix-typo-in-the-ephemeral-handler agent: fix typo on getting EphemeralHandler size option	2024-11-06 10:31:48 +01:00
Fabiano Fidêncio	7e6779f3ad	Merge pull request #10488 from fidencio/topic/teach-our-machinery-to-deal-with-rc-kernels build: kernel: Teach our machinery to deal with -rc kernels	2024-11-05 16:19:57 +01:00
Zvonko Kaiser	a4725034b2	Merge pull request #9480 from zvonkok/build-image-suffix image: Add suffix to image or initrd depending on the NVIDIA driver version	2024-11-05 09:43:56 -05:00
Fabiano Fidêncio	77c87a0990	agent: fix typo on getting EphemeralHandler size option Most likely this was overlooked during the development / review, but we're actually interested in the size rather than in the pagesize of the hugepages. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 15:15:17 +01:00
Fabiano Fidêncio	2b16160ff1	versions: kernel-dragonball: Fix URL SSIA Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:55:34 +01:00
Fabiano Fidêncio	f7b31ccd6c	kernel: bump kata_config_version Due to the changes done in the previous commits. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:57 +01:00
Fabiano Fidêncio	a52ea32b05	build: kernel: Learn how to deal with release candidates So far we were not prepared to deal with release candidates as those: * Do not have a sha256sum in the sha256sums provided by the kernel cdn * Come from a different URL (directly from Linus) * Have a different suffix (.tar.gz, instead of .tar.xz) Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	9f2d4b2956	build: kernel: Always pass the url to the builder This doesn't change much on how we're doing things today, but it simplifies a lot of cases that may be added later on (and will be) like building -rc kernels. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	ee1a17cffc	build: kernel: Take kernel_url into consideration Let's make sure the kernel_url is actually used whenever it's passed to the function. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	9a0b501042	build: kernel: Remove tee specific function As, thankfully, we're relying on upstream kernels for TEEs. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	cc4006297a	build: kernel: Pass the yaml base path instead of the version path By doing this we can ensure this can be re-used, if needed (and it'll be needed), for also getting the URL. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	7057ff1cd5	build: kernel: Always pass -f to the kernel builder -f forces the (re)generation of the config when doing the setup, which helps a lot on local development whilst not causing any harm in the CI builds. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 12:26:02 +01:00
Fabiano Fidêncio	910defc4cf	Merge pull request #10490 from fidencio/topic/fix-ovmf-build builds: ovmf: Workaround Zeex repo becoming private	2024-11-05 12:25:00 +01:00
Fabiano Fidêncio	aff3d98ddd	builds: ovmf: Workaround Zeex repo becoming private Let's just do a simple `sed` and not use the repo that became private. This is not a backport of https://github.com/tianocore/edk2/pull/6402, but it's a similar approach that allows us to proceed without the need to pick up a newer version of edk2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-11-05 11:25:54 +01:00
Dan Mihai	03bf4433d7	Merge pull request #10459 from stevenhorsman/update-bats tests: k8s: Update bats	2024-11-04 12:26:58 -08:00
Aurélien Bombo	f639d3e87c	Merge pull request #10395 from Sumynwa/sumsharma/create_container agent-ctl: Add support to test kata-agent's container creation APIs.	2024-11-04 14:09:12 -06:00
GabyCT	7f066be04e	Merge pull request #10485 from GabyCT/topic/fixghast gha: Fix source for gha stability run script	2024-11-04 12:09:28 -06:00
Steve Horsman	a2b9527be3	Merge pull request #10481 from mkulke/mkulke/init-cdh-client-on-gcprocs-none agent: perform attestation init w/o process launch	2024-11-04 17:27:45 +00:00
Gabriela Cervantes	fd4d0dd1ce	gha: Fix source for gha stability run script This PR fixes the source to avoid duplication, especially in the common.sh script, and to avoid failures saying that a certain script is not in the directory. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-11-04 16:16:13 +00:00
Magnus Kulke	bf769851f8	agent: perform attestation init w/o process launch This change is motivated by a problem in peerpod's podvms. In this setup the lifecycle of guest components is managed by systemd. The current code skips over init steps like setting the ocicrypt-rs env and initialization of a CDH client in this case. To address this the launch of the processes has been isolated into its own fn. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-11-04 13:31:07 +01:00
Steve Horsman	4fd9df84e4	Merge pull request #10482 from GabyCT/topic/fixvirtdoc docs: Update virtualization document	2024-11-04 11:51:09 +00:00
stevenhorsman	175ebfec7c	Revert "k8s:kbs: Add trap statement to clean up tmp files" This reverts commit `973b8a1d8f`. As @danmihai1 points out https://github.com/bats-core/bats-core/issues/364 states that using traps in bats is error prone, so this could be the cause of the confidential test instability we've been seeing, like it was in the static checks, so let's try and revert this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:37 +00:00
stevenhorsman	75cb1f46b8	tests/k8s: Add skip is setup_common fails At @danmihai1's suggestion add a die message in case the call to setup_common fails, so we can see it in the test output. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
stevenhorsman	3f5bf9828b	tests: k8s: Update bats We've seen some issues with tests not being run in some of the Coco CI jobs (Issue #10451) and in the environments that are more stable we noticed that they had a newer version of bats installed. Try updating the version to 1.10+ and print out the version for debug purposes. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-11-04 09:59:33 +00:00
Steve Horsman	06d2cc7239	Merge pull request #10453 from bpradipt/remote-annotation runtime: Add GPU annotations for remote hypervisor	2024-11-04 09:10:06 +00:00
Zvonko Kaiser	3781526c94	gpu: Add VARIANT to the initrd and image build We need to know if we're building an NVIDIA initrd or image. Additionally, we need to know if we're building a regular or confidential VARIANT. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Zvonko Kaiser	95b69c5732	build: initrd make it coherent to the image build Add -f for moving the initrd to the correct file path Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Zvonko Kaiser	3c29c1707d	image: Add suffix to image or initrd depending on the NVIDIA driver version Fixes: #9478 We want to keep track of the driver versions built during initrd/image build, so update the artifact_name after the fact. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-11-01 18:34:13 +00:00
Sumedh Alok Sharma	4b7aba5c57	agent-ctl: Add support to test kata-agent's container creation APIs. This commit introduces changes to enable testing kata-agent's container APIs of CreateContainer/StartContainer/RemoveContainer. The changeset includes: - using confidential-containers image-rs crate to pull/unpack/mount a container image. Currently supports only unauthenticated registry pull - re-factor api handlers to reduce cmdline complexity and handle request generation logic in tool - introduce an OCI config template for container creation - add test case Fixes #9707 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-11-01 22:18:54 +05:30
Fabiano Fidêncio	2efcb442f4	Merge pull request #10442 from Sumynwa/sumsharma/tools_use_ubuntu_static_build ci: Use ubuntu for static building of kata tools.	2024-11-01 16:04:31 +01:00
Gabriela Cervantes	1ca83f9d41	docs: Update virtualization document This PR updates the virtualization document by removing a url link which is no longer valid. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-31 17:28:02 +00:00
GabyCT	a3d594d526	Merge pull request #10480 from GabyCT/topic/fixstabilityrun gha: Add missing steps in Kata stability workflow	2024-10-31 09:57:33 -06:00
Fabiano Fidêncio	e058b92350	Merge pull request #10425 from burgerdev/darwin genpolicy: support darwin target	2024-10-31 12:16:44 +01:00
Markus Rudy	df5e6e65b5	protocols: only build RLimit impls on Linux The current version of the oci-spec crate compiles RLimit structs only for Linux and Solaris. Until this is fixed upstream, add compilation conditions to the type converters for the affected structs. Fixes: #10071 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:50:36 +01:00
Markus Rudy	091a410b96	kata-sys-util: move json parsing to protocols crate The parse_json_string function is specific to parsing capability strings out of ttRPC proto definitions and does not benefit from being available to other crates. Moving it into the protocols crate allows removing kata-sys-util as a dependency, which in turn enables compiling the library on darwin. Fixes: #10071 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:41:07 +01:00
Markus Rudy	8ab4bd2bfc	kata-sys-util: remove obsolete cgroups dependency The cgroups.rs source file was removed in `234d7bca04`. With cgroups support handled in runtime-rs, the cgroups dependency on kata-sys-util can be removed. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-10-31 09:41:07 +01:00
Sumedh Alok Sharma	0adf7a66c3	ci: Use ubuntu for static building of kata tools. This commit introduces changes to use ubuntu for statically building kata tools. In the existing CI setup, the tools currently build only for x86_64 architecture. It also fixes the build error seen for agent-ctl PR#10395. Fixes #10441 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-10-31 13:19:18 +05:30
Gabriela Cervantes	c4089df9d2	gha: Add missing steps in Kata stability workflow This PR adds missing steps in the gha run script for the kata stability workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-30 19:13:15 +00:00
Xuewei Niu	1a216fecdf	Merge pull request #10225 from Chasing1020/main runtime-rs: Add basic boilerplate for remote hypervisor	2024-10-30 17:02:50 +08:00
Hyounggyu Choi	dca69296ae	Merge pull request #10476 from BbolroC/switch-to-kubeadm-s390x gha: Switch KUBERNETES from k3s to kubeadm on s390x	2024-10-30 09:52:06 +01:00
GabyCT	9293931414	Merge pull request #10474 from GabyCT/topic/removeunvarb packaging: Remove kernel config repo variable as it is unused	2024-10-29 12:52:07 -06:00
Gabriela Cervantes	69ee287e50	packaging: Remove kernel config repo variable as it is unused This PR removes the kernel config repo variable at the build kernel script as it is not used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-29 17:09:52 +00:00
GabyCT	8539cd361a	Merge pull request #10462 from GabyCT/topic/increstress tests: Increase time to run stressng k8s tests	2024-10-29 11:08:47 -06:00
Chasing1020	425f6ad4e6	runtime-rs: add oci spec for prepare_vm method The cloud-api-adaptor needs to support different types of pod VM instances. We need to pass some annotations like machine_type, default_vcpus and default_memory to prepare the VMs. Signed-off-by: Chasing1020 <643601464@qq.com>	2024-10-30 01:01:28 +08:00
Chasing1020	f1167645f3	runtime-rs: support for remote hypervisors type This patch adds the support of the remote hypervisor type for runtime-rs. The cloud-api-adaptor needs the annotations and network namespace path to create the VMs. The remote hypervisor opens a UNIX domain socket specified in the config file, and sends ttrpc requests to an external process to control sandbox VMs. Fixes: #10350 Signed-off-by: Chasing1020 <643601464@qq.com>	2024-10-30 00:54:17 +08:00
Pradipta Banerjee	6f1ba007ed	runtime: Add GPU annotations for remote hypervisor Add GPU annotations for remote hypervisor to help with the right instance selection based on number of GPUs and model Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>	2024-10-29 10:28:21 -04:00
Steve Horsman	68225b53ca	Merge pull request #10475 from stevenhorsman/revert-10452 Revert "tests: Add trap statement in kata doc script"	2024-10-29 13:58:00 +00:00
Hyounggyu Choi	aeef28eec2	gha: Switch to kubeadm for run-k8s-tests-on-zvsi Last November, SUSE discontinued support for s390x, leaving k3s on this platform stuck at k8s version 1.28, while upstream k8s has since reached 1.31. Fortunately, kubeadm allows us to create a 1.30 Kubernetes cluster on s390x. This commit switches the KUBERNETES option from k3s to kubeadm for s390x and removes a dedicated cluster creation step. Now, cluster setup and teardown occur in ACTIONS_RUNNER_HOOK_JOB_{STARTED,COMPLETED}. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-29 14:27:32 +01:00
Hyounggyu Choi	238f67005f	tests: Add `kubeadm` option for KUBERNETES in gha-run.sh When creating a k8s cluster via kubeadm, the devmapper setup for containerd requires a different configuration. This commit introduces a new `kubeadm` option for the KUBERNETES variable and adjusts the path to the containerd config file for devmapper setup. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-29 14:19:42 +01:00
stevenhorsman	b1cffb4b09	Revert "tests: Add trap statement in kata doc script" This reverts commit `093a6fd542`, as it is breaking the static checks. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-29 09:57:18 +00:00
Aurélien Bombo	eb04caaf8f	Merge pull request #10074 from koct9i/log-vm-start-error runtime: log vm start error before cleanup	2024-10-28 14:39:00 -05:00
Fabiano Fidêncio	e675e233be	Merge pull request #10473 from fidencio/topic/build-cache-fix-shim-v2-root_hash.txt-location build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}"	2024-10-28 16:53:06 +01:00
Fabiano Fidêncio	f19c8cbd02	build: cache: Ensure shim-v2-root_hash.txt is in "${workdir}" All the oras push logic happens from inside `${workdir}`, while the root_hash.txt extraction and renaming was not taking this into consideration. This was not caught during the manually triggered runs as those do not perform the oras push. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 15:17:16 +01:00
Steve Horsman	51bc71b8d9	Merge pull request #10466 from kata-containers/topic/ensure-shim-v2-sets-the-measured-rootfs-parameters-to-the-config re-enable measured rootfs build & tests	2024-10-28 13:11:50 +00:00
Fabiano Fidêncio	b70d7c1aac	tests: Enable measured rootfs tests for qemu-coco-dev Then it's on par with what's being tested with TEEs using a rootfs image. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:54 +01:00
Fabiano Fidêncio	d23d057ac7	runtime: Enable measured rootfs for qemu-coco-dev Let's make sure we are prepared to test this with non-TEE environments as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	7d202fc173	tests: Re-enable measured_rootfs test for TDX As we're now building everything needed to test TDX with measured rootfs support, let's bring this test back in (for TDX only, at least for now). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	d537932e66	build: shim-v2: Ensure MEASURED_ROOTFS is exported The approach taken for now is to export MEASURED_ROOTFS=yes on the workflow files for the architectures using confidential stuff, and leave the "normal" build without having it set (to avoid any change of expectation on the current behaviour). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	9c8b20b2bf	build: shim-v2: Rebuild if root_hashes do not match Let's make sure we take the root_hashes into consideration to decide whether the shim-v2 should or should not be used from the cached artefacts. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	9c84998de9	build: cache: Cache root_hash.txt used by the shim-v2 Let's cache the root_hash.txt from the confidential image so we can use them later on to decide whether there was a rootfs change that would require shim-v2 to be rebuilt. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	d2d9792720	build: Don't leave cached component behind if it can't be used Let's ensure we remove the component and any extra tarball provided by ORAS in case the cached component cannot be used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	ef29824db9	runtime: Don't do measured rootfs for "vanilla" kernel We may decide to add this later on, but for now this is only targeting TEEs and the confidential image / initrd. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	a65946bcb0	workflows: build: Ensure rootfs is present for shim-v2 build Let's ensure that we get the already built rootfs tarball from previous steps of the action at the time we're building the shim-v2. The reason we do that is because the rootfs binary tarball has a root_hash.txt file that contains the information needed by the shim-v2 build scripts to add the measured rootfs arguments to the shim-v2 configuration files. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	6ea0369878	workflows: build: Ensure rootfs is built before shim-v2 As the rootfs will have what we need to add as part of the shim-v2 configuration files for measured rootfs, we must ensure this is built before shim-v2. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	13ea082531	workflows: Build rootfs after its deps are built By doing this we can just re-use the dependencies already built, saving us a reasonable amount of time. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:53 +01:00
Fabiano Fidêncio	eb07a809ce	tests: Add a helper script to use prebuild components This is a helper script that does basically what's already being done by the s390x CI, which is: * Move a folder with the components that were stored / downloaded during the GHA execution to the expected `build` location * Get rid of the dependencies for a specific asset, as the dependencies are already pulled in from previous GHA steps For now this script is only being added but not yet executed anywhere, and that will come as the next step in this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:52 +01:00
Fabiano Fidêncio	c2b18f9660	workflows: Store rootfs dependencies So far we haven't been storing the rootfs dependencies as part of our workflows, but we better do it to re-use them as part of the rootfs build. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 12:43:52 +01:00
Steve Horsman	b5f503b0b5	Merge pull request #10471 from fidencio/topic/possibly-fix-release-workflow workflows: Possibly fix the release workflow	2024-10-28 11:38:33 +00:00
Konstantin Khlebnikov	ee50582848	runtime: log vm start error before cleanup Return of proper error to the initiator is not guaranteed. Method StopVM could kill shim process together with VM pieces. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>	2024-10-28 11:21:21 +01:00
Fabiano Fidêncio	a8fad6893a	workflows: Possibly fix the release workflow The only reason we had this one passing for amd64 is because the check was done using the wrong variable (`matrix.stage`, while in the other workflows the variable used is `inputs.stage`). The commit that broke the release process is `67a8665f51`, which blindly copy & pasted the logic from the matrix assets. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-28 11:15:53 +01:00
Steve Horsman	ad5749fd6b	Merge pull request #10467 from stevenhorsman/release-3.10.1 release: Bump version to 3.10.1	2024-10-25 20:19:23 +01:00
stevenhorsman	b22d4429fb	release: Bump version to 3.10.1 Fix release to pick up #10463 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-25 17:16:09 +01:00
Steve Horsman	19ac0b24f1	Merge pull request #10463 from skaegi/rustjail_filemode_perm_fix agent: Correct rustjail device filemode permission typo	2024-10-25 14:27:50 +01:00
Fabiano Fidêncio	cc815957c0	Merge pull request #10461 from kata-containers/topic/workflows-follow-up-on-manually-triggered-job workflows: devel: Follow-up on the manually triggered jobs	2024-10-25 08:31:14 +02:00
Simon Kaegi	322846b36f	agent: Correct rustjail device filemode permission typo Corrects device filemode permissions typo/regression in rustjail to `666` instead of `066`. `666` is the standard and expected value for these devices in containers. Fixes: #10454 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2024-10-24 16:46:40 -04:00
GabyCT	a9af46ccd2	Merge pull request #10452 from GabyCT/topic/katadoctemp tests: Add trap statement in kata doc script	2024-10-24 13:21:11 -06:00
Gabriela Cervantes	a3ef8c0a16	tests: Increase time to run stressng k8s tests This PR increases the time to run the stressng k8s tests for the CoCo stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-24 16:34:17 +00:00
Fabiano Fidêncio	475ad3e06b	workflows: devel: Allow running more than one at once More than one developer can and should be able to run this workflow at the same time, without cancelling the job started by another developer. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-24 15:38:35 +02:00
Fabiano Fidêncio	8f634ceb6b	workflows: devel: Adjust the pr-number Let's use "dev" instead of "manually-triggered" as it avoids the name being too long, which results in failures to create AKS clusters. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-24 15:38:31 +02:00
GabyCT	41d1178e4a	Merge pull request #10438 from GabyCT/topic/fixspellreadme docs: Fix misspelling in CI documentation	2024-10-23 13:34:52 -06:00
Steve Horsman	c5c389f473	Merge pull request #10449 from kata-containers/topic/add-workflows-specifically-for-testing Add a specific workflow for testing the CI, without messing up with the "nightly" weather	2024-10-23 19:03:49 +01:00
Gabriela Cervantes	093a6fd542	tests: Add trap statement in kata doc script This PR adds the trap statement into the kata doc script to clean up properly the temporary files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-23 15:56:58 +00:00
Gabriela Cervantes	701891312e	docs: Fix misspelling in CI documentation This PR fixes a misspelling in CI documentation readme. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-23 15:42:08 +00:00
Fabiano Fidêncio	829415dfda	workflows: Remove the possibility to manually trigger the nightly CI As a new workflow was added for the cases where developers want to test their changes in the workflow itself, let's make sure we stop allowing manual triggers on this workflow, which can lead to a polluted / misleading weather of the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-23 13:19:45 +02:00
Fabiano Fidêncio	cc093cdfdb	workflows: Add a manually trigger "devel" workflow for the CI This workflow is intended to replace the `workflow_dispatch` trigger currently present as part of the `ci-nightly.yaml`. The reasoning behind having this done in this way is because of our good and old GHA behaviour for `pull_request_target`, which requires a PR to be merged in order to check the changes in the workflow itself, which leads to: * when a change in a workflow is done, developers (should) do: * push their branch to the kata-containers repo * manually trigger the "nightly" CI in order to ensure the changes don't break anything * this can result in the "nightly" CI weather being polluted * we don't have the guarantee / assurance about the last n nightly runs anymore Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-23 13:14:50 +02:00
Greg Kurz	378f454fb9	Merge pull request #10208 from wtootw/main runtime: Failed to clean up resources when QEMU is terminated	2024-10-23 12:11:57 +02:00
Fabiano Fidêncio	ca416d8837	Merge pull request #10446 from kata-containers/topic/re-work-shim-v2-build-as-part-of-the-ci-and-release workflows: Ensure shim-v2 is built as the last asset	2024-10-23 09:27:29 +02:00
Fabiano Fidêncio	c082b99652	Merge pull request #10439 from microsoft/mahuber/azl-cfg-var tools: Change PACKAGES var for cbl-mariner	2024-10-23 08:39:49 +02:00
Manuel Huber	a730cef9cf	tools: Change PACKAGES var for cbl-mariner Change the PACKAGES variable for the cbl-mariner rootfs-builder to use the kata-packages-uvm meta package from packages.microsoft.com to define the set of packages to be contained in the UVM. This aligns the UVM build for the Azure Linux distribution with the UVM build done for the Kata Containers offering on Azure Kubernetes Services (AKS). Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-10-22 23:11:42 +00:00
Fabiano Fidêncio	67a8665f51	workflows: Ensure shim-v2 is built as the last asset By doing this we can ensure that whenever the rootfs changes, we'll be able to get the new root_hash.txt and use it. This is the very first step to bring the measured rootfs tests back. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-22 14:56:37 +02:00
Greg Kurz	3de6d09a86	Merge pull request #10443 from gkurz/release-3.10.0 release: Bump VERSION to 3.10.0	2024-10-22 14:46:30 +02:00
Greg Kurz	3037303e09	release: Bump VERSION to 3.10.0 Let's start the 3.10.0 release. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-10-22 11:28:15 +02:00
wangyaqi54	cf4b81344d	runtime: Failed to clean up resources when QEMU is terminated by signal 15 When QEMU is terminated by signal 15, it deletes the PidFile. Upon detecting that QEMU has exited, the shim executes the stopVM function. If the PidFile is not found, the PID is set to 0. Subsequently, the shim executes `kill -9 0`, which terminates the current process group. This prevents any further logic from being executed, resulting in resources not being cleaned up. Signed-off-by: wangyaqi54 <wangyaqi54@jd.com>	2024-10-22 17:04:46 +08:00
Fabiano Fidêncio	4c34cfb0ab	Merge pull request #10420 from pmores/add-support-for-virtio-scsi runtime-rs: support virtio-scsi device in qemu-rs	2024-10-22 11:00:33 +02:00
Pavel Mores	8cdd968092	runtime-rs: support virtio-scsi device in qemu-rs Semantics are lifted straight out of the go runtime for compatibility. We introduce DeviceVirtioScsi to represent a virtio-scsi device and instantiate it if block device driver in the configuration file is set to virtio-scsi. We also introduce ObjectIoThread which is instantiated if the configuration file additionally enables iothreads. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-22 08:55:54 +02:00
Greg Kurz	91b874f18c	Merge pull request #10421 from Apokleos/hostname-bugfix kata-agent: fixing bug of unable setting hostname correctly.	2024-10-22 00:26:51 +02:00
alex.lyn	b25538f670	ci: Introduce CI to validate pod hostname Fixes #10422 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-10-21 16:32:56 +01:00
alex.lyn	3dabe0f5f0	kata-agent: fixing bug of unable setting hostname correctly. When update_container_namespaces updates namespaces, setting all UTS (and IPC) namespace paths to None resulted in hostnames set prior to the update becoming ineffective. This was primarily due to an error made while aligning with the oci spec: in an attempt to match empty strings with None values in oci-spec-rs, all paths were incorrectly set to None. Fixes #10325 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-10-21 16:32:56 +01:00
Steve Horsman	98886a7571	Merge pull request #10437 from mkulke/mkulke/dont-parse-oci-image-for-cached-artifacts ci: don't parse oci image for cached artifacts	2024-10-21 16:31:23 +01:00
Magnus Kulke	e27d70d47e	ci: don't parse oci image for cached artifacts Moved the parsing of the oci image marker into its own step, since we only need to perform that for attestation purposes and some cached images might not have that file in the tarball. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-10-21 14:50:00 +02:00
Magnus Kulke	9a33a3413b	Merge pull request #10433 from mkulke/mkulke/add-provenance-attestation-for-agent-builds ci: add provenance attestation for agent artifact	2024-10-18 15:00:18 +02:00
Anastassios Nanos	68d539f5c5	Merge pull request #10435 from nubificus/fix_fc_machineconfig runtime-rs: Use vCPU and memory values from config	2024-10-18 13:41:20 +01:00
Magnus Kulke	b93f5390ce	ci: add provenance attestation for agent artifact This adds provenance attestation logic for agent binaries that are published to an oci registry via ORAS. As a downstream consumer of the kata-agent binary the Peerpod project needs to verify that the artifact has been built on kata's CI. To create an attestation we need to know the exact digest of the oci artifact, at the point when the artifact was pushed. Therefore we record the full oci image as returned by oras push. The pushing and tagging logic has been slightly reworked to make this task less repetitive. The oras cli accepts multiple tags separated by comma on pushes, so a push can be performed atomically instead of iterating through tags and pushing each individually. This removes the risk of partially successful push operations (think: rate limits on the oci registry). So far the provenance creation has been only enabled for agent builds on amd64 and s390x. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-10-18 10:24:00 +02:00
Anastassios Nanos	23f5786cca	runtime-rs: Use vCPU and memory values from config Use values from the config for the setup of the microVM. Fixes: #10434 Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-10-17 23:17:02 +01:00
GabyCT	4ae9317675	Merge pull request #10430 from GabyCT/topic/ciaz docs: Update CI documentation	2024-10-17 15:09:24 -06:00
GabyCT	b00203ba9b	Merge pull request #10428 from GabyCT/topic/archk8sc gha: Use a arch_to_golang variable to have uniformity	2024-10-17 11:00:59 -06:00
Chengyu Zhu	cca77f0911	Merge pull request #10412 from stevenhorsman/agent-config-rstest agent: config: Use rstest for unit tests	2024-10-17 23:01:21 +08:00
Gabriela Cervantes	e3efad8ed2	docs: Update CI documentation This PR updates the CI documentation referring to the several tests and in which kind of instances is running them. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-16 19:23:19 +00:00
stevenhorsman	4adb454ed0	agent: config: Use rstest for unit tests Use rstest for unit test rather than TestData arrays where possible to make the code more compact, easier to read and open the possibility to enhance test cases with a description more easily. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-16 16:55:44 +01:00
Gabriela Cervantes	f0e0c74fd4	gha: Use a arch_to_golang variable to have uniformity This PR replaces the arch uname -m to use the arch_to_golang variable in the script to have a better uniformity across the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-15 20:03:09 +00:00
Dan Mihai	69509eff33	Merge pull request #10417 from microsoft/danmihai1/k8s-inotify.bats tests: k8s-inotify.bats improvements	2024-10-15 11:22:53 -07:00
Dan Mihai	ece0f9690e	tests: k8s-inotify: longer pod termination timeout inotify-configmap-pod.yaml is using: "inotifywait --timeout 120", so wait for up to 180 seconds for the pod termination to be reported. Hopefully, some of the sporadic errors from #10413 will be avoided this way: not ok 1 configmap update works, and preserves symlinks waitForProcess "${wait_time}" "$sleep_time" "${command}" failed Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Dan Mihai	ccfb7faa1b	tests: k8s-inotify.bats: don't leak configmap Delete the configmap if the test failed, not just on the successful path. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-15 16:01:25 +00:00
Aurélien Bombo	f13d13c8fa	Merge pull request #10416 from microsoft/danmihai1/mariner_static_sandbox_resource_mgmt ci: static_sandbox_resource_mgmt for cbl-mariner	2024-10-15 10:40:17 -05:00
Aurélien Bombo	c371b4e1ce	Merge pull request #10426 from 3u13r/fix/genpolicy/handle-config-map-binary-data genpolicy: read binaryData value as String	2024-10-14 21:31:23 -05:00
Leonard Cohnen	c06bf2e3bb	genpolicy: read binaryData value as String While Kubernetes defines `binaryData` as `[]byte`, when defined in a YAML file the raw bytes are base64 encoded. Therefore, we need to read the YAML value as `String` and not as `Vec<u8>`. Fixes: #10410 Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-10-14 20:03:11 +02:00
Aurélien Bombo	f9b7a8a23c	Merge pull request #10402 from Sumynwa/sumsharma/agent-ctl-dependencies ci: Install build dependencies for building agent-ctl with image pull.	2024-10-14 10:28:32 -05:00
Sumedh Alok Sharma	bc195d758a	ci: Install build dependencies for building agent-ctl with image pull. Adds dependencies of 'clang' & 'protobuf' to be installed in runners when building agent-ctl sources having image pull support. Fixes #10400 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-10-14 10:36:04 +05:30
Aurélien Bombo	614e21ccfb	Merge pull request #10415 from GabyCT/topic/egreptim tools/osbuilder/tests: Remove egrep in test images script	2024-10-11 13:47:30 -05:00
Gabriela Cervantes	aae654be80	tools/osbuilder/tests: Remove egrep in test images script This PR removes the egrep command, as it has been deprecated, and replaces it with grep in the test images script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-11 17:23:35 +00:00
Dan Mihai	3622b5e8b4	ci: static_sandbox_resource_mgmt for cbl-mariner Use the configuration used by AKS (static_sandbox_resource_mgmt=true) for CI testing on Mariner hosts. Hopefully pod startup will become more predictable on these hosts - e.g., by avoiding the occasional hotplug timeouts described by #10413. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-10 22:17:39 +00:00
Fabiano Fidêncio	02f5fd94bd	Merge pull request #10409 from fidencio/topic/ci-add-ita_image-and-ita_image_tag kbs: ita: Ensure the proper image / image_tag is used for ITA	2024-10-10 11:46:26 +02:00
Fabiano Fidêncio	cf5d3ed0d4	kbs: ita: Ensure the proper image / image_tag is used for ITA When dealing with a specific release, it was easier to just do some adjustments on the image that has to be used for ITA without actually adding a new entry in the versions.yaml. However, it's been proven to be more complicated than that when it comes to dealing with staged images, and we better explicitly add (and update) those versions altogether to avoid CI issues. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-10 10:01:33 +02:00
Steve Horsman	0c4a7c8771	Merge pull request #10406 from ChengyuZhu6/fix-unit agent:cdh: fix unit tests about sealed secret	2024-10-10 08:57:28 +01:00
Fabiano Fidêncio	3f7ce1d620	Merge pull request #10401 from stevenhorsman/kbs-deploy-overlays-update Kbs deploy overlays update	2024-10-10 09:50:19 +02:00
Fabiano Fidêncio	036b04094e	Merge pull request #10397 from fidencio/topic/build-remove-initrd-mariner-target build: mariner: Remove the ability to build the marine initrd	2024-10-10 09:44:36 +02:00
ChengyuZhu6	65ecac5777	agent:cdh: fix unit tests about sealed secret The root cause is that the CDH client is a global variable, and unit tests `test_unseal_env` and `test_unseal_file` share this lock-free global variable, leading to resource contention and destruction. Merging the two unit tests into one test_sealed_secret will resolve this issue. Fixes: #10403 Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>	2024-10-10 08:38:06 +08:00
ChengyuZhu6	a992feb7f3	Revert "Revert "agent:cdh: unittest for sealed secret as file"" This reverts commit `b5142c94b9`. Signed-off-by: ChengyuZhu6 <zhucy0405@gmail.com>	2024-10-10 08:37:06 +08:00
GabyCT	0cda92c6d8	Merge pull request #10407 from GabyCT/topic/fixbuildk packaging: Remove unused variable in build kernel script	2024-10-09 16:53:45 -06:00
Gabriela Cervantes	616eb8b19b	packaging: Remove unused variable in build kernel script This PR removes an unused variable in the build kernel script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-09 20:02:56 +00:00
Fabiano Fidêncio	652ba30d4a	build: mariner: Remove the ability to build the marine initrd As mariner has switched to using an image instead of an initrd, let's just drop the ability to build the initrd and avoid keeping something in the tree that won't be used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 21:42:55 +02:00
Fabiano Fidêncio	59e3ab07e4	Merge pull request #10396 from fidencio/topic/ci-mariner-test-using-mariner-image-instead-of-initrd ci: mariner: Use the image instead of the initrd	2024-10-09 21:39:44 +02:00
stevenhorsman	b2fb19f8f8	versions: Bump KBS version Bump to the commit that had the overlays changes we want to adapt to. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-09 17:49:21 +01:00
Fabiano Fidêncio	01a957f7e1	ci: mariner: Stop building mariner initrd As the mariner image is already in place, and the tests were modified to use them (as part of this series), let's just stop building it as part of the CI. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:35 +02:00
Fabiano Fidêncio	091ad2a1b2	ci: mariner: Ensure kernel_params can be set The reason we're doing this is because mariner image uses, by default, cgroups default-hierarchy as `unified` (aka, cgroupsv2). In order to keep the same initrd behaviour for mariner, let's enforce that `SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 systemd.legacy_systemd_cgroup_controller=yes systemd.unified_cgroup_hierarchy=0` is passed to the kernel cmdline, at least for now. Other tests that are setting `kernel_params` are not running on mariner, then we're safe taking this path as it's done as part of this PR. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:35 +02:00
Fabiano Fidêncio	3bbf3c81c2	ci: mariner: Use the image instead of the initrd As an image has been added for mariner as part of the commit `63c1f81c2`, let's start using it in the CI, instead of using the initrd. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 18:23:32 +02:00
Fabiano Fidêncio	9c0c159b25	Merge pull request #10404 from fidencio/topic/rever-sealed-secrets-tests Revert "agent:cdh: unittest for sealed secret as file"	2024-10-09 18:09:09 +02:00
GabyCT	2035d638df	Merge pull request #10388 from GabyCT/topic/testimtemp tools/osbuilder/tests: Add trap statement in test images script	2024-10-09 09:49:45 -06:00
Fabiano Fidêncio	b5142c94b9	Revert "agent:cdh: unittest for sealed secret as file" This reverts commit `31e09058af`, as it's breaking the agent unit tests CI. This is a stop gap till Chengyu Zhu finds the time to properly address the issue, avoiding the CI to be blocked for now. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-09 16:06:09 +02:00
stevenhorsman	8763880e93	tests/k8s: kbs: Update overlays logic In https://github.com/confidential-containers/trustee/pull/521 the overlays logic was modified to add non-SE s390x support and simplify non-ibm-se platforms. We need to update the logic in `kbs_k8s_deploy` to match and can remove the dummying of `IBM_SE_CREDS_DIR` for non-SE now Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-10-09 09:39:41 +01:00
Gabriela Cervantes	e08749ce58	tools/osbuilder/tests: Add trap statement in test images script This PR adds the trap statement in the test images script to clean up tmp files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-08 19:54:23 +00:00
Fabiano Fidêncio	80196c06ad	Merge pull request #10390 from microsoft/danmihai1/new-rootfs-image-mariner local-build: add ability to build rootfs-image-mariner	2024-10-08 21:40:43 +02:00
Fabiano Fidêncio	083b2f24d8	Merge pull request #10363 from ChengyuZhu6/secret-as-volume Support Confidential Sealed Secrets (as volume)	2024-10-08 19:23:40 +02:00
Dan Mihai	63c1f81c23	local-build: add rootfs-image-mariner Kata CI will start testing the new rootfs-image-mariner instead of the older rootfs-initrd-mariner image. The "official" AKS images are moving from a rootfs-initrd-mariner format to the rootfs-image-mariner format. Making the same change in Kata CI is useful to keep this testing in sync with the AKS settings. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-08 17:15:56 +00:00
GabyCT	7a38cce73c	Merge pull request #10383 from kata-containers/topic/imagevar image-builder: Remove unused variable	2024-10-08 10:27:03 -06:00
Aurélien Bombo	e56af7a370	Merge pull request #10389 from emanuellima1/fix-agent-policy build: Fix RPM build fail due to AGENT_POLICY	2024-10-08 09:59:21 -05:00
ChengyuZhu6	a94024aedc	tests: add test for sealed file secrets add a test for sealed file secrets. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	fe307303c8	agent:rpc: Refactor CDH-related operations Refactor CDH-related operations into the cdh_handler function to make the `create_container` code clearer. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	31e09058af	agent:cdh: unittest for sealed secret as file add unittest for sealed secret as file. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 16:01:48 +08:00
ChengyuZhu6	974d6b0736	agent:cdh: initialize cdhclient with the input cdh socket uri Refactor cdh code to initialize cdhclient with the input cdh socket uri. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-10-08 14:58:07 +08:00
ChengyuZhu6	1f33fd4cd4	agent:rpc: handle the sealed secret in createcontainer Users must set the mount path to `/sealed/<path>` for kata agent to detect the sealed secret mount and handle it in createcontainer stage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 14:58:07 +08:00
ChengyuZhu6	da281b4444	agent:cdh: support to unseal secret as file Introduced `unseal_file` function to unseal secret as files: - Implemented logic to handle symlinks and regular files within the sealed secret directory. - For each entry, call CDH to unseal secrets and the unsealed contents are written to a new file, and a symlink is created to replace the sealed symlink. Fixes: #8123 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-10-08 14:58:07 +08:00
Fabiano Fidêncio	71d0c46e0a	Merge pull request #10384 from microsoft/danmihai1/virtio-fs-policy tests: k8s: AUTO_GENERATE_POLICY=yes for local testing	2024-10-07 21:25:52 +02:00
Emanuel Lima	e989e7ee4e	build: Fix RPM build fail due to AGENT_POLICY By checking for AGENT_POLICY we ensure we only try to read allow-all.rego if AGENT_POLICY is set to "yes" Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-10-07 15:43:23 -03:00
Dan Mihai	6d5fc898b8	tests: k8s: AUTO_GENERATE_POLICY=yes for local testing The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh and AUTO_GENERATE_POLICY=yes: 1. Before these changes users were forced to use: - SEV, SNP, or TDX guests, or - KATA_HOST_OS=cbl-mariner 2. After these changes users can also use other platforms that are configured with "shared_fs = virtio-fs" - e.g., - KATA_HOST_OS=ubuntu + KATA_HYPERVISOR=qemu Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-04 18:26:00 +00:00
Dan Mihai	5aaef8e6eb	Merge pull request #10376 from microsoft/danmihai1/auto-generate-just-for-ci gha: enable AUTO_GENERATE_POLICY where needed	2024-10-04 10:52:31 -07:00
Gabriela Cervantes	4cd737d9fd	image-builder: Remove unused variable This PR removes an unused variable in the image builder script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-04 15:56:28 +00:00
Greg Kurz	77c5db6267	Merge pull request #9637 from ldoktor/selective-ci CI: Select jobs by touched code	2024-10-04 11:29:05 +02:00
GabyCT	2d089d9695	Merge pull request #10381 from GabyCT/topic/archrootfs osbuilder: Remove duplicated arch variable definition	2024-10-03 14:48:08 -06:00
Wainer Moschetta	b9025462fb	Merge pull request #10134 from ldoktor/ci-sort-range ci.ocp: Sort images according to git	2024-10-03 15:08:41 -03:00
Chelsea Mafrica	9138f55757	Merge pull request #10375 from GabyCT/topic/mktempkbs k8s:kbs: Add trap statement to clean up tmp files	2024-10-03 12:32:30 -04:00
Gabriela Cervantes	d7c2b7d13c	osbuilder: Remove duplicated arch variable definition This PR removes duplicated arch variable definition in the rootfs script as this variable and its value is already defined at the top of the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-03 16:22:27 +00:00
Greg Kurz	96336d141b	Merge pull request #10165 from pmores/add-network-device-hotplugging runtime-rs: add network device hotplugging to qemu-rs	2024-10-03 17:44:50 +02:00
Pavel Mores	23927d8a94	runtime-rs: plug in netdev hotplugging functionality and actually call it add_device() now checks if QEMU is running already by checking if we have a QMP connection. If we do, a new function hotplug_device() is called, which hotplugs the device if it's a network one. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:23:10 +02:00
Pavel Mores	ac393f6316	runtime-rs: implement netdev hotplugging for qemu-rs With the helpers from previous commit, the actual hotplugging implementation, though lengthy, is mostly just assembling a QMP command to hotplug the network device backend and then doing the same for the corresponding frontend. Note that hotplug_network_device() takes cmdline_generator types Netdev and DeviceVirtioNet. This is intentional and aims to take advantage of the similarity between parameter sets needed to coldplug and hotplug devices to reuse and simplify our code. To enable using the types from qmp, accessors were added as needed. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:20:02 +02:00
Pavel Mores	4eb7e2966c	runtime-rs: add netdev hotplugging helpers to qemu-rs Before adding network device hotplugging functionality itself we add a couple of helpers in a separate commit since their functionality is non-trivial. To hotplug a device we need a free PCI slot. We add find_free_slot() which can be called to obtain one. It looks for PCI bridges connected to the root bridge and looks for an unoccupied slot on each of them. The first found is returned to the caller. The algorithm explicitly doesn't support any more complex bridge hierarchies since those are never produced when coldplugging PCI bridges. Sending netdev queue and vhost file descriptors to QEMU is slightly involved and implemented in pass_fd(). The actual socket has to be passed in an SCM_RIGHTS socket control message (also called ancillary data, see man 3 cmsg) so we have to use the msghdr structure and sendmsg() call (see man 2 sendmsg) to send the message. Since qapi-rs doesn't support sending messages with ancillary data we have to do the sending sort of "under it", manually, by retrieving qapi-rs's socket and using it directly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:15:31 +02:00
Pavel Mores	3f46dfcf2f	runtime-rs: don't treat NetworkConfig::index as unique in qemu-rs NetworkConfig::index has been used to generate an id for a network device backend. However, it turns out that it's not unique (it's always zero as confirmed by a comment at its definition) so it's not suitable to generate an id that needs to be unique. Use the host device name instead. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:12:37 +02:00
Pavel Mores	cda04fa539	runtime-rs: factor setup of network device out of QemuCmdLine Network device hotplugging will use the same infrastructure (Netdev, DeviceVirtioNet) as coldplugging, i.e. QemuCmdLine. To make the code of network device setup visible outside of QemuCmdLine we factor it out to a non-member function `get_network_device()` and make QemuCmdLine just delegate to it. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Pavel Mores	efc8e93bfe	runtime-rs: factor bus_type() out of QemuCmdLine The function takes a whole QemuCmdLine but only actually uses HypervisorConfig. We increase callability of the function by limiting its interface to what it needs. This will come in handy shortly. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Pavel Mores	720265c2d8	runtime-rs: support adding PCI bridges to qemu VM At least one PCI bridge is necessary to hotplug PCI devices. We only support PCI (at this point at least) since that's what the go runtime does (note that looking at the code in virtcontainers it might seem that other bus types are supported, however when the bridge objects are passed to govmm, all but PCI bridges are actually ignored). The entire logic of bridge setup is lifted from runtime-go for compatibility's sake. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-10-03 11:03:32 +02:00
Lukáš Doktor	63b6e8a215	ci: Ensure we check the latest workflow run in gatekeeper with multiple iterations/reruns we need to use the latest run of each workflow. For that we can use the "run_id" and only update results of the same or newer run_ids. To do that we need to store the "run_id". To avoid adding individual attributes this commit stores the full job object that contains the status, conclusion as well as other attributes of the individual jobs, which might come in handy in the future in exchange for slightly bigger memory overhead (still we only store the latest run of required jobs only). Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:10:45 +02:00
Lukáš Doktor	2ae090b44b	ci: Add extra gatekeeper debug output to stderr which might be useful to assess the amount of queries. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	2440a39c50	ci: Check required labels before checking tests in gatekeeper some tests require certain labels before they are executed. When our PR is not labeled appropriately the gatekeeper detects skipped required tests and reports a failure. With this change we add "required-labeles" to the tests mapping and check the expected labels first, informing the user about the missing labels before even checking the test statuses. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	dd2878a9c8	ci: Unify character for separating items the test names are using `;` and regexps were designed to use `,` but during development simply joined the expressions by `\|`. This should work but might be confusing so let's go with the semi-colon separator everywhere. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta	fdcfac0641	workflows/gatekeeper: export COMMIT_HASH variable The Github SHA of triggering PR should be exported in the environment so that gatekeeper can fetch the right workflows/jobs. Note: by default github will export GITHUB_SHA in the job's environment but that value cannot be used if the gatekeeper was triggered from a pull_request_target event, because the SHA corresponds to the push branch. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-10-03 09:08:35 +02:00
Wainer dos Santos Moschetta	4abfc11b4f	workflows/gatekeeper: configure concurrency properly This will allow to cancel-in-progress the gatekeeper jobs. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:35 +02:00
Lukáš Doktor	5c1cea1601	ci: Select jobs by touched code to allow selective testing as well as selective list of required tests let's add a mapping of required jobs/tests in "skips.py" and a "gatekeaper" workflow that will ensure the expected required jobs were successful. Then we can only mark the "gatekeaper" as the required job and modify the logic to suit our needs. Fixes: #9237 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-03 09:08:33 +02:00
Dan Mihai	1a4928e710	gha: enable AUTO_GENERATE_POLICY where needed The behavior of Kata CI doesn't change. For local testing using kubernetes/gha-run.sh: 1. Before these changes: - AUTO_GENERATE_POLICY=yes was always used by the users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner. 2. After these changes: - Users of SEV, SNP, TDX, or KATA_HOST_OS=cbl-mariner must specify AUTO_GENERATE_POLICY=yes if they want to auto-generate policy. - These users have the option to test just using hard-coded policies (e.g., using the default policy built into the Guest rootfs) by using AUTO_GENERATE_POLICY=no. AUTO_GENERATE_POLICY=no is the default value of this env variable. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-02 23:20:33 +00:00
Gabriela Cervantes	973b8a1d8f	k8s:kbs: Add trap statement to clean up tmp files This PR adds the trap statement in the confidential kbs script to clean up temporary files and ensure we are not leaving them behind. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-10-02 19:59:08 +00:00
Steve Horsman	8412c09143	Merge pull request #10371 from fidencio/topic/k8s-tdx-re-enable-empty-dir-tests k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev	2024-10-02 18:41:19 +01:00
Dan Mihai	9a8341f431	Merge pull request #10370 from microsoft/danmihai1/k8s-policy-rc tests: k8s-policy-rc: remove default UID from YAML	2024-10-02 09:32:17 -07:00
GabyCT	a1d380305c	Merge pull request #10369 from GabyCT/topic/egrepfastf metrics: Update fast footprint script to use grep	2024-10-02 10:10:12 -06:00
Fabiano Fidêncio	b3ed7830e4	k8s: tests: Re-enable empty-dirs tests for TDX / coco-qemu-dev The test is disabled for qemu-coco-dev / qemu-tdx, but it doesn't seem to actually be failing on those. Plus, it's passing on SEV / SNP, which means that we most likely missed re-enabling this one in the past. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-10-01 20:51:01 +02:00
Hyounggyu Choi	b179598fed	Merge pull request #10374 from BbolroC/skip-block-volume-qemu-runtime-rs tests: Skip k8s-block-volume.bats for qemu-runtime-rs	2024-10-01 19:45:10 +02:00
Lukáš Doktor	820e000f1c	ci.ocp: Sort images according to git The quay.io registry returns the tags sorted alphabetically and doesn't seem to provide a way to sort it by age. Let's use "git log" to get all changes between the commits and print all tags that were actually pushed. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-10-01 16:08:00 +02:00
Hyounggyu Choi	4ccf1f29f9	tests: Skip k8s-block-volume.bats for qemu-runtime-rs Currently, `qemu-runtime-rs` does not support `virtio-scsi`, which causes the `k8s-block-volume.bats` test to fail. We should skip this test until `virtio-scsi` is supported by the runtime. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-10-01 09:09:47 +02:00
Dan Mihai	3b24219310	tests: k8s-policy-rc: remove default UID from YAML The nginx container seems to error out when using UID=123. Depending on the timing between container initialization and "kubectl wait", the test might have gotten lucky and found the pod briefly in Ready state before nginx errored out. But on some of the nodes, the pod never got reported as Ready. Also, don't block in "kubectl wait --for=condition=Ready" when wrapping that command in a waitForProcess call, because waitForProcess is designed for short-lived commands. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-10-01 00:10:30 +00:00
Saul Paredes	94bc54f4d2	Merge pull request #10340 from microsoft/saulparedes/validate_create_sandbox_storages genpolicy: validate create sandbox storages	2024-09-30 14:24:56 -07:00
Aurélien Bombo	b49800633d	Merge pull request #7165 from sprt/k8s-block-volume-test tests: Add `k8s-block-volume` test to GHA CI	2024-09-30 13:26:18 -07:00
Dan Mihai	7fe44d3a3d	genpolicy: validate create sandbox storages Reject any unexpected values from the CreateSandboxRequest storages field. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-30 11:31:12 -07:00
Gabriela Cervantes	52ef092489	metrics: Update fast footprint script to use grep This PR updates the fast footprint script to remove the use of egrep as this command has been deprecated and change it to use grep command. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-30 17:43:08 +00:00
Aurélien Bombo	c037ac0e82	tests: Add k8s-block-volume test This imports the k8s-block-volume test from the tests repo and modifies it slightly to set up the host volume on the AKS host. This is a follow-up to #7132. Fixes: #7164 Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-09-30 10:58:30 -05:00
Alex Lyn	dfd0ca9bfe	Merge pull request #10312 from sidneychang/configurable-build-dragonball runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs	2024-09-29 22:33:54 +08:00
GabyCT	6a9e3ccddf	Merge pull request #10305 from GabyCT/topic/ita ci:tdx: Use an ITA key for TDX	2024-09-27 16:44:53 -06:00
Fabiano Fidêncio	66bcfe7369	k8s: kbs: Properly delete ita kustomization The ita kustomization for Trustee, as well as previously used one (DCAP), doesn't have a $(uname -m) directory after the deployment directory name. Let's follow the same logic used for the deploy-kbs script and clean those up accordingly. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-27 21:47:29 +02:00
Gabriela Cervantes	bafa527be0	ci: tdx: Test attestation with ITTS Intel Tiber Trust Services (formerly known as Intel Trust Authority) is Intel's own attestation service, and we want to take advantage of the TDX CI in order to ensure ITTS works as expected. In order to do so, let's replace the former method used (DCAP) to use ITTS instead. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-27 21:47:25 +02:00
GabyCT	36750b56f1	Merge pull request #10342 from GabyCT/topic/updevguide docs: Remove qemu information not longer valid	2024-09-27 11:15:11 -06:00
Fabiano Fidêncio	86b8c53d27	Merge pull request #10357 from fidencio/topic/add-ita-secret gha: Add ita_key as a github secret	2024-09-27 17:40:41 +02:00
Gabriela Cervantes	d91979d7fa	gha: Add ita_key as a github secret This PR adds ita_key as a github secret at the kata coco tests yaml workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-27 17:15:22 +02:00
Xuewei Niu	ad0f2b2a55	Merge pull request #10219 from sidneychang/decouple-runtime-rs-from-dragonball runtime-rs: Port TAP implementation from dragonball	2024-09-27 11:17:55 +08:00
Xuewei Niu	11b1a72442	Merge pull request #10349 from lifupan/main_nsandboxapi sandbox: refactor the sandbox init process	2024-09-27 11:10:45 +08:00
Xuewei Niu	3911bd3108	Merge pull request #10351 from lifupan/main_agent agent: fix the issue of setup sandbox pidns	2024-09-27 10:49:47 +08:00
Fupan Li	f7bc627a86	sandbox: refactor the sandbox init process In order to support sandbox api, introduce the sandbox_config struct and split the sandbox start stage from init process. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-26 23:50:24 +08:00
Hyounggyu Choi	b1275bed1b	Merge pull request #10346 from BbolroC/minor-improvement-k8s-tests tests: Minor improvement k8s tests	2024-09-26 17:01:32 +02:00
Hyounggyu Choi	01d460ac63	tests: Add teardown_common() to tests_common.sh There are many similar or duplicated code patterns in `teardown()`. This commit consolidates them into a new function, `teardown_common()`, which is now called within `teardown()`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-26 13:56:36 +02:00
Hyounggyu Choi	e8d1feb25f	tests: Validate node name for exec_host() The current `exec_host()` accepts a given node name and creates a node debugger pod, even if the name is invalid. This could result in the creation of an unnecessary pending pod (since we are using nodeAffinity; if the given name does not match any actual node names, the pod won’t be scheduled), which wastes resources. This commit introduces validation for the node name to prevent this situation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-26 13:20:50 +02:00
Xuewei Niu	3a7f9595b6	Merge pull request #10318 from lsc2001/ci-add-docker ci: Enable basic docker tests for runtime-rs	2024-09-26 17:41:09 +08:00
Xuewei Niu	cb5a2b30e9	Merge pull request #10293 from lsc2001/solve-docker-compatibility runtime-rs: Notify containerd when process exits	2024-09-26 14:51:20 +08:00
Sicheng Liu	e4733748aa	ci: Enable basic docker tests for runtime-rs This commit enables basic amd64 tests of docker for runtime-rs by adding vmm types "dragonball" and "cloud-hypervisor". Signed-off-by: Sicheng Liu <lsc2001@outlook.com>	2024-09-26 06:27:05 +00:00
Sicheng Liu	08eb5fc7ff	runtime-rs: Notify containerd when process exits Docker cannot exit normally after the container process exits when used with runtime-rs since it doesn't receive the exit event. This commit enables runtime-rs to send TaskExit to containerd after process exits. Also, it moves "system_time_into" and "option_system_time_into" from crates/runtimes/common/src/types/trans_into_shim.rs to a new utility mod. Signed-off-by: Sicheng Liu <lsc2001@outlook.com>	2024-09-26 02:52:50 +00:00
Fupan Li	71afeccdf1	agent: fix the issue of setup sandbox pidns When the sandbox api was enabled, the pause container wouldn't be created, thus the shared sandbox pidns should fall back to the first container's init process, instead of returning an error here. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-26 10:21:25 +08:00
Xuewei Niu	857222af02	Merge pull request #10330 from lifupan/main_sandboxapi Some prepared work for sandbox api support	2024-09-26 09:47:47 +08:00
Hyounggyu Choi	caf3b19505	Merge pull request #10348 from BbolroC/delete-node-debugger-by-trap tests: Delete custom node debugger pod on EXIT	2024-09-25 23:39:43 +02:00
Hyounggyu Choi	57e8cbff6f	tests: Delete custom node debugger pod on EXIT It was observed that the custom node debugger pod is not cleaned up when a test times out. This commit ensures the pod is cleaned up by triggering the cleanup on EXIT, preventing any debugger pods from being left behind. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-25 20:36:05 +02:00
Fabiano Fidêncio	edf4ca4738	Merge pull request #10345 from ldoktor/kata-webhook ci: Reorder webhook deployment	2024-09-25 18:16:46 +02:00
Fabiano Fidêncio	09ed9c5c50	Merge pull request #10328 from BbolroC/improve-negative-tests tests: Improve k8s negative tests	2024-09-25 18:16:28 +02:00
Xuewei Niu	e1825c2ef3	Merge pull request #9977 from l8huang/dan-2-vfio runtime: add DAN support for VFIO network device in Go kata-runtime	2024-09-25 10:11:38 +08:00
Lei Huang	39b0e9aa8f	runtime: add DAN support for VFIO network device in Go kata-runtime When using network adapters that support SR-IOV, a VFIO device can be plugged into a guest VM and claimed as a network interface. This can significantly enhance network performance. Fixes: #9758 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-09-24 09:53:28 -07:00
Hyounggyu Choi	c70588fafe	tests: Use custom-node-debugger pod With #10232 merged, we now have a persistent node debugger pod throughout the test. As a result, there’s no need to spawn another debugger pod using `kubectl debug`, which could lead to false negatives due to premature pod termination, as reported in #10081. This commit removes the `print_node_journal()` call that uses `kubectl debug` and instead uses `exec_host()` to capture the host journal. The `exec_host()` function is relocated to `tests/integration/kubernetes/lib.sh` to prevent cyclical dependencies between `tests_common.sh` and `lib.sh`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-24 17:25:24 +02:00
Lukáš Doktor	8355eee9f5	ci: Reorder webhook deployment In `b9d88f74ed` the `runtime_class` CM was added, which overrides the one we previously set. Let's reorder our logic to first deploy the webhook and then override the default CM in order to use the one we really want. Since we need to change dirs we also have to use realpath to ensure the files are located correctly. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-09-24 17:01:28 +02:00
Hyounggyu Choi	2c2941122c	tests: Fail fast in assert_pod_fail() `assert_pod_fail()` currently calls `k8s_create_pod()` to ensure that a pod does not become ready within the default 120s. However, this delays the test's completion even if an error message is detected earlier in the journal. This commit removes the use of `k8s_create_pod()` and modifies `assert_pod_fail()` to fail as soon as the pod enters a failed state. All failing pods end up in one of the following states: - CrashLoopBackOff - ImagePullBackOff The function now polls the pod's state every 5 seconds to check for these conditions. If the pod enters a failed state, the function immediately returns 0. If the pod does not reach a failed state within 120 seconds, it returns 1. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-24 16:09:20 +02:00
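The fail-fast polling described in the `assert_pod_fail()` commit above can be sketched as follows. This is a minimal Python illustration, not the shell code from the test suite: `get_pod_state` is a hypothetical callable standing in for a `kubectl` query, and the injectable `sleep` exists only to make the sketch easy to exercise. The 5-second interval, the 120-second limit, and the two failed states are taken from the commit message.

```python
import time
from typing import Callable

# Failed states named in the commit message.
FAILED_STATES = {"CrashLoopBackOff", "ImagePullBackOff"}


def assert_pod_fail(
    get_pod_state: Callable[[], str],  # hypothetical: returns the pod's current state
    timeout_s: int = 120,
    interval_s: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return 0 as soon as the pod reaches a failed state, 1 on timeout."""
    elapsed = 0
    while elapsed < timeout_s:
        if get_pod_state() in FAILED_STATES:
            return 0  # fail fast: no need to wait out the full timeout
        sleep(interval_s)
        elapsed += interval_s
    return 1
```

Returning shell-style exit codes (0 for "the pod failed as expected") mirrors the behaviour the commit message describes.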
Gabriela Cervantes	6a8b137965	docs: Remove qemu information not longer valid This PR removes some qemu information that is no longer valid, as it refers to the tests repository and to kata 1.x. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-23 16:58:24 +00:00
Aurélien Bombo	e738054ddb	Merge pull request #10311 from pawelpros/pproskur/fixyq ci: don't require sudo for yq if already installed	2024-09-23 08:57:11 -07:00
Alex Lyn	6b94cc47a8	Merge pull request #10146 from Apokleos/intro-cdi Introduce cdi in runtime-rs	2024-09-23 21:45:42 +08:00
Alex Lyn	b8ba346e98	runtime-rs: Add test for container devices with CDI. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-23 17:20:22 +08:00
Steve Horsman	0e0cb24387	Merge pull request #10329 from Bickor/webhook-check tools.kata-webhook: Specify runtime class using configMap	2024-09-23 09:59:12 +01:00
Steve Horsman	6f0b3eb2f9	Merge pull request #10337 from stevenhorsman/update-release-process-post-3.9.0 doc: Update the release process	2024-09-23 09:55:57 +01:00
Hyounggyu Choi	8a893cd4ee	Merge pull request #10232 from BbolroC/fix-loop-device-for-exec_host tests: Fix loop device handling for exec_host()	2024-09-23 08:15:03 +02:00
Fupan Li	f1f5bef9ef	Merge pull request #10339 from lifupan/main_fix runtime-rs: fix the issue of using block_on	2024-09-23 09:28:40 +08:00
Fupan Li	52397ca2c1	sandbox: rename the task_service to service Rename task_service to service so that it can cooperate with the sandbox services added in the following commits. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:19 +08:00
Fupan Li	20b4be0225	runtime-rs: rename the Request/Response to TaskRequest/TaskResponse To distinguish them from the sandbox request/response, this commit renames the task request/response to TaskRequest/TaskResponse. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:11 +08:00
Fupan Li	ba94eed891	sandbox: fix the issue of hypervisor's wait_vm wait_vm is called before stop_vm and takes the reader lock, which blocks stop_vm from getting the writer lock and triggers a deadlock. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:44:03 +08:00
Fupan Li	fb27de3561	runtime-rs: fix the issue of using block_on block_on blocks the current thread, which prevents other async tasks from running on this worker thread, so change it to use an async task instead. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-09-22 14:40:44 +08:00
Aurélien Bombo	79a3b4e2e5	Merge pull request #10335 from kata-containers/sprt/fix-kata-deploy-docs kata-deploy: clean up and fix docs for k0s	2024-09-20 13:33:14 -07:00
stevenhorsman	4f745f77cb	doc: Update the release process - Reflect the need to update the versions in the Helm Chart - Add the lock branch instruction - Add clarity about the permissions needed to complete tasks Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-20 19:04:33 +01:00
Aurélien Bombo	78c63c7951	kata-deploy: clean up and fix docs for k0s * Clarifies instructions for k0s. * Adds kata-deploy step for each cluster type. * Removes the old kata-deploy-stable step for vanilla k8s. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-09-20 11:59:40 -05:00
sidney chang	456e13db98	runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs rename DEFAULT_HYPERVISOR to HYPERVISOR in Makefile Fixes #10310 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-20 05:41:34 -07:00
sidneychang	b85a886694	runtime-rs: Add Configurable Compilation for Dragonball in Runtime-rs This PR introduces support for selectively compiling Dragonball in runtime-rs. By default, Dragonball will continue to be compiled into the containerd-shim-kata-v2 executable, but users now have the option to disable Dragonball compilation. Fixes #10310 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-20 05:38:59 -07:00
Hyounggyu Choi	2d6ac3d85d	tests: Re-enable guest-pull-image tests for qemu-coco-dev Now that the issue with handling loop devices has been resolved, this commit re-enables the guest-pull-image tests for `qemu-coco-dev`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	c6b86e88e4	tests: Increase timeouts for qemu-coco-dev in trusted image storage tests Timeouts occur (e.g. `create_container_timeout` and `wait_time`) when using qemu-coco-dev. This commit increases these timeouts for the trusted image storage test cases Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	9cff9271bc	tests: Run all commands in _loop_device() using exec_host() If the host running the tests is different from the host where the cluster is running, the _loop_device() functions do not work as expected because the device is created on the test host, while the cluster expects the device to be local. This commit ensures that all commands for the relevant functions are executed via exec_host() so that the device is handled on a cluster node. Additionally, it modifies exec_host() to return the exit code of the last executed command because the existing logic with `kubectl debug` sometimes includes unexpected characters that are difficult to handle. `kubectl exec` appears to properly return the exit code of the command passed to it. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	374b8d2534	tests: Create and delete node debugger pod only once Creating and deleting a node debugger pod for every `exec_host()` call is inefficient. This commit changes the test suite to create and delete the pod only once, globally. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Hyounggyu Choi	aedf14b244	tests: Mimic node debugger with full privileges This commit addresses an issue with handling loop devices via a node debugger due to restricted privileges. It runs a pod with full privileges, allowing it to mount the host root to `/host`, similar to the node debugger. This change enables us to run tests for trusted image storage using the `qemu-coco-dev` runtime class. Fixes: #10133 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-20 14:37:43 +02:00
Alex Lyn	63b25e8cb0	runtime-rs: Introduce cdi devices in container creation Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	03735d78ec	runtime-rs: add cdi devices definition and related methods Add cdi devices including ContainerDevice definition and annotation_container_device method to annotate vfio device in OCI Spec annotations which is inserted into Guest with its mapping of vendor-class and guest pci path. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Alex Lyn	020e3da9b9	runtime-rs: extend DeviceVendor with device class We need the vfio device's device, vendor and class properties, but we can currently only get device and vendor, so extend DeviceVendor with class. Fixes #10145 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-20 09:28:51 +08:00
Fabiano Fidêncio	77c844da12	Merge pull request #10239 from fidencio/topic/remove-acrn acrn: Drop support	2024-09-19 23:10:29 +02:00
GabyCT	6eef58dc3e	Merge pull request #10336 from GabyCT/topic/extendtimeout gha: Increase timeout to run k8s tests on TDX	2024-09-19 13:12:55 -06:00
Martin	b9d88f74ed	tools.kata-webhook: Specify runtime class using configMap The kata webhook requires a configmap to define what runtime class it should set for the newly created pods. Additionally, the configmap allows others to modify the default runtime class name we wish to set (in case the handler is kata but the name of the runtimeclass is different). Finally, this PR changes the webhook-check to compare the runtime of the newly created pod against the specific runtime class in the configmap; if said configmap doesn't exist, it defaults to "kata". Signed-off-by: Martin <mheberling@microsoft.com>	2024-09-19 11:51:38 -07:00
Fabiano Fidêncio	51dade3382	docs: Fix spell checker tokio is not a valid word, it seems, so let's use `tokio`. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 20:25:21 +02:00
Gabriela Cervantes	49b3a0faa3	gha: Increase timeout to run k8s tests on TDX This PR increases the timeout to run k8s tests for Kata CoCo TDX to avoid the random failures of timeout. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-19 17:15:47 +00:00
Fabiano Fidêncio	31438dba79	docs: Fix qemu link Otherwise static checks will fail, as we woke up the dogs with changes on the same file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	fefcf7cfa4	acrn: Drop support As we don't have any CI, nor maintainer to keep ACRN code around, we better have it removed than give users the expectation that it should or would work at some point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 16:05:43 +02:00
Fabiano Fidêncio	cdaaf708a1	Merge pull request #10334 from emanuellima1/bump-version release: Bump version to 3.9.0	2024-09-19 15:27:50 +02:00
Emanuel Lima	a6ee15c5c7	release: Bump VERSION to 3.9.0 Starting the v3.9.0 release Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-09-19 10:14:55 -03:00
Fabiano Fidêncio	e9593b53a4	Merge pull request #10234 from pmores/add-support-for-disabled-guest-selinux runtime-rs: add support for disabled guest selinux	2024-09-19 15:03:24 +02:00
Fabiano Fidêncio	4d11fecc2d	Merge pull request #10274 from ajaypvictor/remote_image-os_types runtime: Enable Image annotation for remote hypervisor	2024-09-19 13:39:20 +02:00
Fabiano Fidêncio	3d5f48e02e	Merge pull request #10283 from alexman-stripe/alexman-stripe/fix-kata-shim-not-reporting-inactive-file-cgroup-v2 shim: Fix memory usage reporting for cgroup v2	2024-09-19 12:50:36 +02:00
Pavel Mores	5e5eb9759f	runtime-rs: handle disabled guest selinux in virtiofsd This is just a port of functionality existing in the golang runtime. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	8c92f3bfec	runtime-rs: enable/disable selinux in guest based on disable_guest_selinux This change technically affects the path for enabled guest selinux as well, however since this is not implemented in runtime-rs anyway nothing should break. When guest selinux support is added this change will come in handy. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	204ee21bc8	runtime-rs: handle disabled guest selinux in OCI spec If guest selinux is off the runtime has to ensure that container OCI spec contains no selinux labels for the container rootfs and process. Failure to do so causes kata agent to try and apply the labels which fails since selinux is not enabled in guest, which in turn causes container launch to fail. This is largely inspired by golang runtime(*) with a slight deviation in ordering of checks. This change simply checks the disable_guest_selinux config setting and if it's true it clears both rootfs and process label if necessary. Golang runtime, on the other hand, seems to first check if process label is non-empty and only then it checks the config setting, meaning that if process label is empty the rootfs label is not reset even if it's non-empty. Frankly, this looks like a potential bug though probably unlikely to manifest since it can be assumed that the labels are either both empty, or both non-empty. (*) `4fd4b02f2e/src/runtime/virtcontainers/kata_agent.go (L1005)` Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Pavel Mores	eb1227f47d	runtime-rs: parse the disable_guest_selinux config key In order to handle the setting we have to first parse it and make its value available to the rest of the program. The yes() function is added to comply with serde which seems to insist on default values being returned from functions. Long term, this is surely not the best place for this function to live, however given that this is currently the first and only place where it's used it seems appropriate to put it near its use. If it ends up being reused elsewhere a better place will surely emerge. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-09-19 12:47:10 +02:00
Steve Horsman	8789551fe6	Merge pull request #10333 from fidencio/topic/ci-bump-ubuntu-20.04-runners-to-22.04 ci: Bump ubuntu 20.04 runners to 22.04	2024-09-19 11:44:33 +01:00
Fabiano Fidêncio	35c7f8d1ba	ci: Bump ubuntu 20.04 runners to 22.04 Azure internal mirrors for Ubuntu 20.04 have gone awry, leading to a situation where dependencies (such as libdevmapper-dev) cannot be installed, which in turn blocks our CI. Let's bump the runners to 22.04 regardless, even knowing it'll cause an issue with the runk tests, as the agent check tests are considered more crucial to the project at this point. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-19 12:29:20 +02:00
Fabiano Fidêncio	eccdffebf7	Merge pull request #10243 from katexochen/nydus-overlayfs-path virtcontainers: allow specifying nydus-overlayfs binary by path	2024-09-19 11:35:45 +02:00
Ajay Victor	a19f2eacec	runtime: Enable ImageName annotation for remote hypervisor Enables ImageName to support multiple VM images in remote hypervisor scenario Fixes https://github.com/kata-containers/kata-containers/issues/10240 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2024-09-19 14:48:46 +05:30
Alex Man	27f8f69195	shim: Fix memory usage reporting for cgroup v2 kata-shim was not reporting `inactive_file` in memory stat. This memory is deducted by containerd when calculating the size of container working set, as it can be paged out by the operating system under memory pressure. Without reporting `inactive_file`, containerd will over report container memory usage. [Here](https://github.com/containerd/containerd/blob/v1.7.22/pkg/cri/server/container_stats_list_linux.go#L117) is where containerd deducts `inactive_file` from memory usage. Note that kata-shim correctly reports `total_inactive_file` for cgroup v1, but this was not implemented for cgroup v2. This commit: - Adds code in kata-shim to report "inactive_file" memory for cgroup v2 - Implements reporting of all available cgroup v2 memory stats to containerd - Uses defensive coding to avoid assuming existence of any memory.stat fields The list of available cgroup v2 memory stats defined by containerd can be found [here](https://pkg.go.dev/github.com/containerd/cgroups/v2/stats#MemoryStat). Fixes #10280 Signed-off-by: Alex Man <alexman@stripe.com>	2024-09-18 14:04:24 -07:00
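The working-set calculation referred to in the cgroup v2 commit above can be illustrated with a short sketch. It is written in Python for brevity and is not the kata-shim or containerd code: the function names are hypothetical, and the clamp-at-zero behaviour is an assumption about how a consumer would guard against `inactive_file` exceeding usage. The parser skips malformed lines and the caller uses a default of 0, in the spirit of the "defensive coding" the commit mentions.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse cgroup v2 memory.stat content ("key value" per line), skipping malformed lines."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage: int, inactive_file: int) -> int:
    """Memory usage minus reclaimable file cache, clamped at zero (assumed guard)."""
    if inactive_file < usage:
        return usage - inactive_file
    return 0


# Example: treat a missing inactive_file field as 0 rather than assuming it exists.
stat = parse_memory_stat("anon 1000\nfile 3000\ninactive_file 2000\n")
print(working_set_bytes(5000, stat.get("inactive_file", 0)))
```

If `inactive_file` is never reported, the subtraction has nothing to deduct and the working set equals raw usage, which is the over-reporting the commit fixes.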
Fabiano Fidêncio	1597f8ba00	Merge pull request #10279 from alexman-stripe/alexman-stripe/fix-cgroup-v2-wrong-cpu-usage-unit agent: Fix CPU usage reporting for cgroup v2 in kata-agent	2024-09-18 21:36:52 +02:00
Fabiano Fidêncio	593cbb8710	Merge pull request #10306 from microsoft/danmihai1/more-security-contexts genpolicy: get UID from PodSecurityContext	2024-09-18 21:33:39 +02:00
Aurélien Bombo	5402f2c637	Merge pull request #10308 from Sumynwa/sumsharma/add_setpolicy_agent_ctl agent-ctl: Add SetPolicy support	2024-09-18 10:09:07 -07:00
Pawel Proskurnicki	b63d49b34a	ci: don't require sudo for yq if already installed Installing yq shouldn't require sudo when the correct version of yq is already installed. Signed-off-by: Pawel Proskurnicki <pawel.proskurnicki@intel.com>	2024-09-18 11:01:07 +02:00
Sumedh Alok Sharma	18c887f055	agent-ctl: Add SetPolicy support This patch adds support to call kata agents SetPolicy API. Also adds tests for SetPolicy API using agent-ctl. Fixes #9711 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-18 10:53:49 +05:30
GabyCT	28d430ec42	Merge pull request #10324 from GabyCT/topic/fixinlib ci: Fix indentation of install libseccomp script	2024-09-17 14:21:24 -06:00
Fabiano Fidêncio	da2377346d	Merge pull request #10323 from stevenhorsman/update-kubectl-release-url kata-deploy: Switch Kubernetes URL	2024-09-17 20:47:17 +02:00
Gabriela Cervantes	096f32cc52	ci: Fix indentation of install libseccomp script This PR fixes the indentation of the install libseccomp script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-17 16:38:53 +00:00
Aurélien Bombo	9d29ce460d	Merge pull request #10303 from Sumynwa/sumsharma/agent_policy_set_env agent: add support to provide default agent policy via env	2024-09-17 09:04:11 -07:00
stevenhorsman	c0d35a66aa	ci: kata-deploy: Update kubectil install URL The `deploy_k0s` and `deploy_k3s` kubectl installs aren't failing yet, but let's get ahead of this and bump them as well. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-17 15:35:42 +01:00
stevenhorsman	1abeffdac6	kata-deploy: Switch Kubernetes URL The payload build is failing with: ``` ERROR: failed to solve: process "/bin/sh -c apk --no-cache add bash curl && ARCH=$(uname -m) && if [ \"${ARCH}\" = \"x86_64\" ]; then ARCH=amd64; fi && if [ \"${ARCH}\" = \"aarch64\" ]; then ARCH=arm64; fi && DEBIAN_ARCH=${ARCH} && if [ \"${DEBIAN_ARCH}\" = \"ppc64le\" ]; then DEBIAN_ARCH=ppc64el; fi && curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/ \ $(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/${ARCH}/kubectl && chmod +x /usr/bin/kubectl && curl -fL --progress-bar -o /usr/bin/jq https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-${DEBIAN_ARCH} && chmod +x /usr/bin/jq && mkdir -p ${DESTINATION} && tar xvf ${WORKDIR}/${KATA_ARTIFACTS} -C ${DESTINATION} && rm -f ${WORKDIR}/${KATA_ARTIFACTS} && apk del curl && apk --no-cache add py3-pip && pip install --no-cache-dir yq==3.2.3" did not complete successfully: exit code: 22 ``` Looking into this, the problem is that https://storage.googleapis.com/kubernetes-release/release/v1.31.1/bin/linux/amd64/kubectl doesn't exist. The [kubectl install doc](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-kubectl-on-linux) recommends the `dl.k8s.io` site, so let's switch to this. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-17 15:35:42 +01:00
Steve Horsman	5448f7fbbf	Merge pull request #10321 from BbolroC/fix-build-boot-image-se local-build: Fix unbound variable issue for lib_se.sh	2024-09-17 15:35:04 +01:00
Hyounggyu Choi	72471d1a18	local-build: Fix unbound variable for lib_se.sh As #10315 introduced an `unbound variable` error, this is a hot-fix for it. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-17 10:01:14 +02:00
Hyounggyu Choi	72df3004e8	gha: Rebase build-secure-image-se atop of latest target branch This commit adds a step called `Rebase atop of the latest target branch` to the job named `build-asset-boot-image-se` which can test the PR properly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-17 09:54:51 +02:00
Hyounggyu Choi	03cd02a006	Merge pull request #10315 from BbolroC/update-ibm-se-doc doc: Update how-to-run-kata-containers-with-SE-VMs.md	2024-09-16 15:12:18 +02:00
Sumedh Alok Sharma	cefba08903	agent: add support to provide default agent policy via env An agent built with the policy feature initializes the policy engine using a policy document from a default path, which is installed & linked during UVM rootfs build. This commit adds support for providing a default agent policy as an environment variable. This targets development/testing scenarios where kata-agent needs to be started as a local process. Fixes #10301 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-16 18:05:21 +05:30
Hyounggyu Choi	8d609e47fb	doc: Update how-to-run-kata-containers-with-SE-VMs.md The following changes have been made: - Remove unnecessary `sudo` - Add an error message where an incorrect host key document is used - Add a missing artifact `kernel-confidential-modules` - Make a variable `kernel_version` and replace it with relevant hits Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-16 12:53:30 +02:00
Fabiano Fidêncio	fc5a631791	Merge pull request #10009 from Xynnn007/feat-cosign Merge to main: supporting pull cosign signed images	2024-09-16 12:08:26 +02:00
stevenhorsman	aa9f21bd19	test: Add support for s390x in cosign testing We've added s390x test container image, so add support to use them based on the arch the test is running on Fixes: #10302 Signed-off-by: stevenhorsman <steven@uk.ibm.com> fixuop	2024-09-16 09:20:57 +01:00
stevenhorsman	3087ce17a6	tests: combined pod yaml creation for CoCo tests This commit factors out some common parts of the image pulling test series, such as encrypted image pulling, pulling images from an authenticated registry, and image verification. This helps reduce the maintenance cost. Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-16 09:20:57 +01:00
Xynnn007	c80c8d84c3	test: add cosign signature verificaton tests Close #8120 Case 1 Create a pod from an unsigned image, on an insecureAcceptAnything registry works. Image: quay.io/prometheus/busybox:latest Policy rule: ``` "default": [ { "type": "insecureAcceptAnything" } ] ``` Case 2 Create a pod from an unsigned image, on a 'restricted registry' is rejected. Image: ghcr.io/confidential-containers/test-container-image-rs:unsigned Policy rule: ``` "quay.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 3 Create a pod from a signed image, on a 'restricted registry' is successful. Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed Policy rule: ``` "ghcr.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 4 Create a pod from a signed image, on a 'restricted registry', but with the wrong key is rejected Image: ghcr.io/confidential-containers/test-container-image-rs:cosign-signed-key2 Policy: ``` "ghcr.io/confidential-containers/test-container-image-rs": [ { "type": "sigstoreSigned", "keyPath": "kbs:///default/cosign-public-key/test" } ] ``` Case 5 Create a pod from an unsigned image, on a 'restricted registry' works if enable_signature_verification is false Image: ghcr.io/kata-containers/confidential-containers:unsigned image security enable: false Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-16 09:20:57 +01:00
Xynnn007	9606e7ac8b	agent: Set image-rs image security policy Add two parameters for enabling cosign signature image verification. - `enable_signature_verification`: to activate signature verification - `image_policy`: URI of the image policy config Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-09-16 09:20:57 +01:00
Xynnn007	653bc3973f	agent: fix make test for kata-agent of dependency anyhow A new version of the anyhow crate changed how backtraces are captured, so kata-agent unit tests that compare a raised error with an expected one would fail. To fix this, we need only panics to have backtraces, so set `RUST_BACKTRACE=1` and `RUST_LIB_BACKTRACE=0` for tests, per the documentation at https://docs.rs/anyhow/latest/anyhow/ Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-09-16 09:20:57 +01:00
Fabiano Fidêncio	dfcb41b5cc	Merge pull request #10313 from stevenhorsman/coco-components-0.10-bump CoCo: Bump Coco components to 0.10 releases	2024-09-14 21:43:28 +02:00
stevenhorsman	705e469696	rootf: Change initrd alpine mirror The rootfs-initrd build is failing with: ``` fetch https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main/aarch64/APKINDEX.tar.gz 6684368:error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed:ssl/statem/statem_clnt.c:1914: ERROR: https://mirror.math.princeton.edu/pub/alpinelinux//v3.18/main: Permission denied ``` so try bumping to a newer version of alpine to see if that helps the issue Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-14 18:47:45 +02:00
Dan Mihai	5777869cf4	tests: k8s-policy-rc: add unexpected UID test Change pod runAsUser value of a Replication Controller after generating the RC's policy, and verify that the RC pods get rejected due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	6773f14667	tests: k8s-policy-job: add unexpected UID test Change pod runAsUser value of a Job after generating the Job's policy, and verify that the Job gets rejected due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	124f01beb3	tests: k8s-policy-deployment: add bad UID test Change pod runAsUser value of a Deployment after generating the Deployment's policy, and verify that the Deployment fails due to this change. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	16f5ebf5f9	genpolicy: get UID from PodSecurityContext Get UID from PodSecurityContext for other k8s resource types too, not just for Pods. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 22:05:31 +00:00
Dan Mihai	5badc30a69	Merge pull request #10316 from microsoft/danmihai1/k8s-inotify tests: k8s-inotify: pod termination polling	2024-09-13 15:02:38 -07:00
GabyCT	6f363bba18	Merge pull request #10304 from GabyCT/topic/fixcricont tests: Fix indentation in the cri containerd tests	2024-09-13 14:49:12 -06:00
Dan Mihai	d3127af9c5	tests: k8s-inotify: pod termination polling Poll/wait for pod termination instead of sleeping 2 minutes. This change typically saves ~90 seconds in my test cluster. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-13 17:12:55 +00:00
sidney chang	5a7d0ed3ad	runtime-rs: introduce tap in hypervisor by extrating it from dragonball It's a prerequisite PR to make built-in vmm dragonball compilation options configurable. Extract TAP device-related code from dragonball's dbs_utils into a separate library within the runtime-rs hypervisor module. To enhance functionality and reduce dependencies, the extracted code has been reimplemented using the libc crate and the ifreq structure. Fixes #10182 Signed-off-by: sidney chang <2190206983@qq.com>	2024-09-13 07:32:14 -07:00
Fabiano Fidêncio	b09eba8c46	Merge pull request #10309 from BbolroC/helm-install-with-retry tests: Introduce retry mechanism for helm install	2024-09-13 15:08:46 +02:00
stevenhorsman	00e657cdb7	agent: image-rs: Update to v0.10.0 release Update image-rs to use the latest release of guest-components Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-09-13 13:29:54 +01:00
stevenhorsman	5e03890562	versions: Bump trustee and guest-components Bump to the v0.10.1 release of trustee and v0.10.0 release of guest-components Signed-off-by: stevenhorsman <steven@uk.ibm.com> fixup	2024-09-13 13:28:54 +01:00
Hyounggyu Choi	0aae847ae5	tests: Update secure boot image verification for IBM SE In the latest `s390-tools`, there has been an update on how to verify a secure boot image. A host key revocation list (CRL), which was optional, is now mandatory for verification. This commit updates the relevant scripts and documentation accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-13 14:14:02 +02:00
Hyounggyu Choi	4c933a5611	tests: Introduce retry mechanism for helm install Kata-deploy often fails due to a transiently unreachable k8s cluster for the qemu-coco-dev test on s390x. (e.g. https://github.com/kata-containers/kata-containers/actions/runs/10831142906/job/30058527098?pr=10009) This commit introduces a retry mechanism to mitigate these failures by retrying the command two more times with a 10-second interval as a workaround. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-13 14:03:44 +02:00
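The retry workaround described in the helm install commit above can be sketched as follows. This is an illustrative Python version, not the shell code added to the tests: `run_with_retry` and its parameters are hypothetical names, and the `run` and `sleep` hooks are injectable only so the sketch can be exercised without a cluster. The attempt count (one initial try plus two retries) and the 10-second interval come from the commit message.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    cmd: Sequence[str],
    max_attempts: int = 3,  # one initial try plus two retries
    interval_s: float = 10,
    run: Callable[[Sequence[str]], int] = lambda c: subprocess.run(c).returncode,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cmd until it exits 0 or attempts are exhausted; return the last exit code."""
    rc = 1
    for attempt in range(1, max_attempts + 1):
        rc = run(cmd)
        if rc == 0:
            return 0
        if attempt < max_attempts:
            sleep(interval_s)  # wait before the next attempt, but not after the last one
    return rc
```

A fixed interval with a small attempt cap suits a transiently unreachable cluster; a persistent failure still surfaces after roughly 20 seconds of waiting.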
Dan Mihai	e937cb1ded	Merge pull request #10291 from microsoft/danmihai1/user-name-to-uid genpolicy: fix and re-enable create container UID verification	2024-09-12 15:47:59 -07:00
Dan Mihai	0c5ac042e7	tests: k8s-policy-pod: add workaround for #10297 If the CI platform being tested doesn't yet support the prometheus container image: - Use busybox instead of prometheus. - Skip the test cases that depend on the prometheus image. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-12 18:26:38 +00:00
Gabriela Cervantes	0346b32a90	tests: Fix indentation in the cri containerd tests This PR fixes the indentation in the cri containerd tests as we have in several places a misalignment in the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-12 16:18:34 +00:00
Dan Mihai	94d95fc055	tests: k8s-policy-pod: test container UID changes Add test cases for changing container UID after generating the policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	db1ca4b665	tests: k8s-policy-pod: remove UID workaround Remove the workaround for #9928, now that genpolicy is able to convert user names from container images into the corresponding UIDs from these images. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	d2d8d2e519	genpolicy: remove default UID/GID values Remove the recently added default UID/GID values, because the genpolicy design is to initialize those fields before this new code path gets executed. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Hernan Gatta	871476c3cb	genpolicy: pull UID:GID values from /etc/passwd Some container images are configured such that the user (and group) under which their entrypoint should run is not a number (or pair of numbers), but a user name. For example, in a Dockerfile, one might write: > USER 185 indicating that the entrypoint should run under UID=185. Some images, however, might have: > RUN groupadd --system --gid=185 spark > RUN useradd --system --uid=185 --gid=spark spark > ... > USER spark indicating that the UID:GID pair should be resolved at runtime via /etc/passwd. To handle such images correctly, read through all /etc/passwd files in all layers, find the latest version of it (i.e., the top-most layer with such a file), and, in so doing, ensure that whiteouts of this file are respected (i.e., if one layer adds the file and some subsequent layer removes it, don't use it). Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>	2024-09-11 22:38:20 +00:00
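The lookup described in the commit above (take the top-most `/etc/passwd`, respect whiteouts, then map a user name to its UID:GID) can be sketched roughly as below. This is a simplified Python illustration, not genpolicy's implementation: layers are modelled as plain dictionaries rather than tar archives, the whiteout is assumed to follow the OCI `.wh.` naming convention, opaque directory whiteouts are not handled, and `latest_passwd` and `resolve_user` are hypothetical names.

```python
def latest_passwd(layers):
    """Return the etc/passwd contents visible in the final image, or None.

    `layers` is ordered bottom to top; each layer maps a path to file text.
    A whiteout is modelled as an entry named "etc/.wh.passwd" (assumption).
    """
    content = None
    for layer in layers:
        if "etc/.wh.passwd" in layer:
            content = None  # this layer removes the copy from lower layers
        if "etc/passwd" in layer:
            content = layer["etc/passwd"]  # a newer copy replaces older ones
    return content


def resolve_user(passwd_text, user_name):
    """Look up user_name in passwd text and return (uid, gid), or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:UID:GID:GECOS:home:shell
        if len(fields) >= 4 and fields[0] == user_name:
            return int(fields[2]), int(fields[3])
    return None
```

For the `USER spark` example from the commit message, a layer containing the line `spark:x:185:185::/home/spark:/bin/sh` would resolve to the pair `(185, 185)`.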
Hernan Gatta	f9249b4476	genpolicy: add tar dependency Used to read /etc/passwd from tar files. Signed-off-by: Hernan Gatta <hernan.gatta@opaque.co>	2024-09-11 22:38:20 +00:00
Dan Mihai	eb7f747df1	genpolicy: enable create container UID verification Disabling the UID Policy rule was a workaround for #9928. Re-enable that rule here and add a new test/CI temporary workaround for this issue. This new test workaround will be removed after fixing #9928. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
Dan Mihai	71ede4ea3f	tests: k8s-policy-pod: use prometheus container Change quay.io/prometheus/busybox to quay.io/prometheus/prometheus in this test. The prometheus image will be helpful for testing the future fix for #9928 because it specifies user = "nobody". Also, change: sh -c "ls -l /" to: echo -n "readinessProbe with space characters" as the test readinessProbe command line. Both include a command line argument containing space characters, but "sh -c" behaves differently when using the prometheus container image (causes the readinessProbe to time out, etc.). Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-09-11 22:38:20 +00:00
GabyCT	614328f342	Merge pull request #10295 from GabyCT/topic/removeimgvar metrics: Remove unused remove img var in common script	2024-09-11 15:02:39 -07:00
GabyCT	095c5ed961	Merge pull request #10289 from GabyCT/topic/enablestresst tests: Enable stressng k8s stability test for Kata CoCo CI	2024-09-11 10:47:33 -07:00
Fabiano Fidêncio	97ecdabde9	Merge pull request #10294 from fidencio/topic/bring-ita-support Bump guest-components / trustee to a version that supports ITA	2024-09-11 19:45:48 +02:00
Gabriela Cervantes	fdaf12d16c	metrics: Remove unused remove img var in common script This PR removes the remove_img variable in the metrics common script as it is not being used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:45:18 +00:00
Gabriela Cervantes	04d1122a46	tests: Decrease iterations in soak test This PR decreases the number of iterations in the kubernetes soak test as this is already taking more than 2 hours for the kata coco ci stability. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:39:06 +00:00
Gabriela Cervantes	c48c6f974e	tests: Enable stressng k8s stability test for Kata CoCo CI This PR enables the stressng k8s stability test for Kata CoCo CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-11 17:38:13 +00:00
Alex Man	7e400f7bb2	agent: Fix CPU usage reporting for cgroup v2 in kata-agent kata-agent incorrectly reports CPU time for cgroup v2, causing 1000x underreporting. For cgroup v2, kata-agent reads the cpu.stat file, which reports the time consumed by the processes in the cgroup in µs. However, there was a bug in kata-agent where it returned this value in µs without converting it to ns. This commit adds the necessary µs to ns conversion for cgroup v2, aligning it with v1 behavior and kata-shim's expectations. This fixes #10278 Signed-off-by: Alex Man <alexman@stripe.com>	2024-09-11 10:29:03 -07:00
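The unit fix described in the commit above amounts to multiplying the microsecond value from `cpu.stat` by 1000. A minimal Python sketch, assuming the total is taken from the `usage_usec` field of the cgroup v2 `cpu.stat` file; `cpu_usage_ns` is a hypothetical name, and the agent's actual code is not reproduced here:

```python
def cpu_usage_ns(cpu_stat_text):
    """Return total CPU time in nanoseconds from cgroup v2 cpu.stat contents."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value) * 1000  # 1 microsecond = 1000 nanoseconds
    raise ValueError("usage_usec not found in cpu.stat")
```

Omitting the multiplication is what produces the 1000x underreporting the commit mentions, because the consumer expects nanoseconds as in the cgroup v1 path.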
Fabiano Fidêncio	1178fe20e9	tests: Adapt error parser for failed image decryption With an older version of image-rs, we were getting the following error: ``` Message: failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key no suitable key found for decrypting layer key: ``` However, with the version of image-rs we are bumping to, the error comes as: ``` Message: failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key Caused by: no suitable key found for decrypting layer key: keyprovider: failed to unwrap key by ttrpc ``` Due to this change, I'm splitting the check in two different ones. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 17:07:56 +02:00
Dan Mihai	66dda37877	Merge pull request #10271 from Sumynwa/sumsharma/agent_ctl_issue_9689_local agent-ctl: Refactor CopyFile Handler	2024-09-11 07:35:09 -07:00
Fabiano Fidêncio	f6cfc33314	Merge pull request #10292 from fidencio/topic/ci-tdx-adapt-how-we-get-the-host-ip ci: tdx: Adapt how we get the host IP	2024-09-11 14:42:22 +02:00
Fabiano Fidêncio	e2200f0690	versions: trustee: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). Proper ITA / ITTS support on Trustee was finished as part of: * `6f767fa15f` Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:39:35 +02:00
Fabiano Fidêncio	d3e3ee7755	versions: guest-components: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). As we've bumped guest-components on trustee, let's make sure we also bump image-rs to the commit that brings ITA support in: * https://github.com/confidential-containers/guest-components/commit/1db6c3a87665dde58d0efa56f4e4af5fc Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:36:56 +02:00
Fabiano Fidêncio	f94d80783d	agent: image-rs: Update to a version that supports ITA ITA stands for Intel Trust Authority, which is in the process of being renamed to ITTS (Intel Tiber Trust Services). As we've bumped guest-components on trustee, let's make sure we also bump image-rs to the commit that brings ITA support in: * `1db6c3a876` The reason we need to bump the dependency here is to avoid kbs_protocol mismatch between the version used by the agent and the trustee one. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 13:36:46 +02:00
Fabiano Fidêncio	3946aa7283	ci: tdx: Adapt how we get the host IP In the process of switching the TDX CI machine we've noticed that `hostname -i` on one of the machines returns a single IP address, while on another machine it returns a full list of IPs. As we're only interested in the first one, let's adapt the code to always return the first one. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-11 09:31:43 +02:00
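Selecting the first address from `hostname -i` output, whether the command prints one address or several, can be illustrated with the short Python sketch below. It is an assumption-laden illustration rather than the CI script itself; `first_ip` is a hypothetical name and the output is assumed to be whitespace-separated.

```python
def first_ip(hostname_output):
    """Return the first whitespace-separated address from `hostname -i` output."""
    addresses = hostname_output.split()
    if not addresses:
        raise ValueError("no address in hostname output")
    return addresses[0]
```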
Sumedh Alok Sharma	b4bbbf65c6	ci: Do not start CDH/attestation procs with kata-agent as local process. Since CDH/attestation related processes and its dependencies are not fully available, the setup fails to start kata-agent as local process. This fix removes these procs to prevent kata-agent from trying to start them. Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 11:53:59 +05:30
Sumedh Alok Sharma	8045a7a2ba	ci: Install policy document on host to run kata-agent as local process. The test setup starts kata-agent as a local process without the UVM. The agent policy initialization fails due to missing policy document at `/etc/kata-opa/default-policy.rego`. The fix - installs a relaxed `allow-all.rego` policy document - cleans up the install during exit Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 11:25:08 +05:30
Sumedh Alok Sharma	822f898433	ci: Install bats as dependencies Install bats as part of dependencies for running the tests. Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:57:15 +05:30
Sumedh Alok Sharma	2c774fb207	ci: Add tests for CopyFile api. This commit introduces test cases for testing CopyFile API using kata-agent-ctl with improved command semantics and handling. - copy a file to /run/kata-containers - copy symlink to /run/kata-containers - copy directory to /run/kata-containers - copy file to /tmp - copy large file to /run/kata-containers Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:54:01 +05:30
Sumedh Alok Sharma	2af1113426	agent-ctl: Refactor CopyFile handler In the existing implementation for the CopyFile subcommand, - cmd line argument list is too long, including various metadata information. - in case of a regular file, passing the actual data as bytes stream adds to the size and complexity of the input. - the copy request will fail when the file size exceeds that of the allowed ttrpc max data length limit of 4Mb. This change refactors the CopyFile handler and modifies the input to a known 'source' 'destination' syntax. Fixes #9708 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-11 10:54:01 +05:30
Alex Lyn	d0968032f7	Merge pull request #10276 from Apokleos/fix-runtime-cdi runtime: Fix runtime/cdi panic with assignment to entry in nil map	2024-09-11 09:00:11 +08:00
Alex Lyn	3f541aff4a	Merge pull request #10282 from teawater/dup runtime-rs: configuration-dragonball.toml.in: Remove duplication	2024-09-10 11:46:40 +08:00
Hui Zhu	dfea12bc53	runtime-rs: configuration-dragonball.toml.in: Remove duplication Remove duplicated description of enable_balloon_f_reporting from configuration-dragonball.toml.in. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-10 07:34:29 +08:00
David Esparza	6f8897249b	Merge pull request #10277 from GabyCT/topic/fixsk tests: Increase timeout to wait for soak stability test deployment	2024-09-09 14:07:10 -06:00
Gabriela Cervantes	5a52fe1a75	tests: Increase timeout to wait for soak stability test deployment This PR increases the timeout for waiting until the soak stability test deployment is ready, in order to avoid random failures reporting that the deployment is not ready yet. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-09 16:13:40 +00:00
Alex Lyn	1684c1962c	runtime: Fix runtime/cdi panic with assignment to entry in nil map It will panic when users do GPU vfio passthrough with cdi in runtime. The root cause is that CustomSpec.Annotations is nil when new element added. To address this issue, initialization is introduced when it's nil. Fixes #10266 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-09-09 20:15:10 +08:00
Alex Lyn	f31839af63	Merge pull request #10253 from teawater/enable_balloon_f_reporting Add support of dragonball virtio-balloon free page reporting	2024-09-09 17:37:52 +08:00
Fabiano Fidêncio	026a4d92a9	Merge pull request #10272 from fidencio/topic/add-tdx-mrconfigid-mrowner-mrownerconfig-support runtime: qemu: tdx: Add support for setting mrconfigid / mrowner / mrownerconfig	2024-09-08 14:11:30 +02:00
Fabiano Fidêncio	51ee4c381a	Merge pull request #10257 from fidencio/topic/kata-deploy-remove-unused-vars-for-cleanup kata-deploy: Remove kata-cleanup unneeded vars	2024-09-07 11:27:14 +02:00
Chengyu Zhu	3a37652d01	Merge pull request #10213 from ChengyuZhu6/device Refine device management for kata-agent	2024-09-07 12:02:32 +08:00
ChengyuZhu6	75816d17f1	agent: switch to new device subsystem Switch to new device subsystem to handle various devices in kata-agent. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	df55f37dfe	agent: Move unit tests about vfio device to vfio_device_handler Move unit tests about vfio device to vfio_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	41c2d81fd3	agent: Move unit tests about scsi device to scsi_device_handler Move unit tests about scsi device to scsi_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	f45129cb44	agent: Move unit tests about network device to network_device_handler Move unit tests about network device to network_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	52203db760	agent: Move unit tests about block device to block_device_handler Move unit tests about block device to block_device_handler. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	e1afb92a28	agent: Move common unit tests about device Move common unit tests about device to mod.rs Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:43 +08:00
ChengyuZhu6	25bd04c02a	agent: Use DeviceHandlerManager to handle various devices Use DeviceHandlerManager to handle various devices. Fixes: #10218 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:45:42 +08:00
ChengyuZhu6	5fc645c869	agent: Move network device code to network_device_handler Move network device code to network_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	07f104085a	agent: Move vfio device code to vfio_device_handler Move vfio device code to vfio_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0cb87767ae	agent: Move device code with virtio scsi driver to scsi_device_handler Move scsi device code to scsi_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0738d75a92	agent: Move device code with nvdimm driver to nvdimm_device_handler Move device code with nvdimm driver to nvdimm_device_handler, including nvdimm device and pmem device. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	bbf934161b	agent: Move virtio-block device handlers to block_device_handler Move virtio-block device handlers to block_device_handler to simplify the code. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	4e33665be8	kata-types: Move device driver constants to kata-types Move device driver constants and add DeviceHandlerManager type alias. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 09:40:30 +08:00
ChengyuZhu6	0b3ad2f830	kata-types: Replace StorageHandlerManager with type alias Removed the `StorageHandlerManager` struct and its associated implementations and introduced a type alias `StorageHandlerManager` for `HandlerManager` to simplify the code. The new type alias maintains the same functionality while reducing redundancy. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 07:53:31 +08:00
ChengyuZhu6	281f0d7f29	kata-types: Add HandlerManager to manage registered handlers Introduced `HandlerManager` struct to manage registered handlers, which will be used for storage and device management in kata-agent. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-07 07:51:48 +08:00
GabyCT	b05811587e	Merge pull request #10245 from ChengyuZhu6/handler-manager agent: Refactor storage handler registration	2024-09-06 09:45:39 -06:00
GabyCT	37ddb837c4	Merge pull request #10267 from GabyCT/topic/updatemlcomments metrics: Update openVINO and oneDNN tests references	2024-09-06 09:42:21 -06:00
Fabiano Fidêncio	65a4562050	runtime: qemu: tdx: Add `omitempty` to QuoteGenerationSocket I know right now we're always passing a value for that, but this doesn't really have to be set unless attestation is used. Thus, let's also omit it in case it's empty. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 15:05:55 +02:00
Fabiano Fidêncio	7818484120	runtime: qemu: tdx: Support mrconfigid / mrowner/ mrownerconfig This is a quick and simple pre-req for supporting initData, which will take advantage of the mrconfigid in the TDX case. While already adding mrconfigid, which is hardcoded empty right now, let's do the same for mrowner and mrownerconfig, and leave it prepared for future expansions. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 15:05:54 +02:00
Fabiano Fidêncio	8285957678	runtime: qemu: Rename prepareObjectWithTDXQgs to prepareTDXObject The reason we're relying on yet another function to do so is because the TDX object will be used in its qom / qapi json format. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-06 14:36:09 +02:00
Fabiano Fidêncio	29ce2205a1	Merge pull request #10268 from microsoft/saulparedes/pdb-support genpolicy: add support for PodDisruptionBudget yaml	2024-09-06 09:53:36 +02:00
Dan Mihai	1885478e2e	Merge pull request #10270 from Sumynwa/sumsharma/enable_agent_tests_in_ci ci: Enable kata agent API tests	2024-09-05 14:24:49 -07:00
Archana Choudhary	f2625b0014	genpolicy: add support for PodDisruptionBudget yaml Prevent panic for PDB specs Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-09-05 11:33:47 -07:00
Sumedh Alok Sharma	e1ac2f4416	ci: Enable kata agent api tests This commit enables running tests for kata agent apis. The 'api-tests' directory will contain bats test files for individual APIs. Fixes #10269 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-06 00:02:55 +05:30
GabyCT	4b257bcbb6	Merge pull request #10255 from Sumynwa/sumsharma/metrics_ci_kill_kata_components ci: send SIGKILL to kill kata components	2024-09-05 12:04:57 -06:00
Aurélien Bombo	cc9aeee81a	Merge pull request #10263 from Sumynwa/sumsharma/add_ci_workflow ci: Add workflow to run kata-agent api tests using kata-agent-ctl	2024-09-05 09:32:34 -07:00
Dan Mihai	7ab95b56f1	Merge pull request #10251 from microsoft/saulparedes/support_readonly_hostpath genpolicy: support readonly hostpath	2024-09-05 09:27:15 -07:00
GabyCT	deb6d12ff6	Merge pull request #10237 from GabyCT/topic/k8soakcoco tests: Enable k8s soak stability test for Kata CoCo CI	2024-09-05 09:56:48 -06:00
Gabriela Cervantes	fcc35dd3a7	metrics: Update openVINO and oneDNN tests references This PR updates the machine learning tests references or urls for the openVINO and oneDNN scripts as currently they are referring to a different performance benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-05 15:39:21 +00:00
GabyCT	bb5d8bbcb5	Merge pull request #10229 from GabyCT/topic/ufcv versions: Update firecracker version to 1.8.0	2024-09-05 09:19:36 -06:00
Fabiano Fidêncio	70491ff29f	Merge pull request #10244 from BbolroC/turn-on-kbs-qemu-coco-dev-s390x gha: Turn on KBS for qemu-coco-dev on s390x	2024-09-05 13:02:42 +02:00
Sumedh Alok Sharma	ad66f4dfc9	ci: Add workflow to run kata-agent api tests using kata-agent-ctl enable CI to add test cases for testing kata-agent APIs. This commit introduces: - a workflow to run tests - setup scripts to prepare the test environment Fixes #10262 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-05 14:38:29 +05:30
Saul Paredes	24c2d13fd3	genpolicy: support readonly emptyDir mount Set emptyDir access based on volume mount readOnly value Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-09-04 15:05:44 -07:00
Saul Paredes	36a4104753	genpolicy: support readonly hostpath Set hostpath access based on volume mount readOnly value Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-09-04 14:55:22 -07:00
Fabiano Fidêncio	7d048f5963	Merge pull request #10254 from fidencio/topic/remove-amd-specific-warning-from-non-amd-systems runtime: Don't error out about SNP cert path on non SNP platforms	2024-09-04 23:42:32 +02:00
Fabiano Fidêncio	d44d66ddf6	kata-deploy: Remove kata-cleanup unneeded vars As kata-cleanup will only call `reset_runtime()`, there's absolutely no need to export the other set of environment variables in its yaml file. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-04 19:09:02 +02:00
Steve Horsman	f66e8c41a1	Merge pull request #10250 from squarti/remote-machine-type-default runtime: fix bad default machine_type for remote hypervisor	2024-09-04 17:34:04 +01:00
Sumedh Alok Sharma	4025468e27	ci: send SIGKILL to kill kata components metrics tests sometimes fail with kata components still running. sending SIGKILL and waiting for the processes to reap. Fixes #8651 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2024-09-04 18:58:17 +05:30
Fabiano Fidêncio	b10256a7ca	runtime: Don't error out about SNP cert path on non SNP platforms This error is specific to SNP platforms, so let's make sure we only error this out when an SNP platform is used. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-04 11:54:52 +02:00
Hui Zhu	447a7feccf	runtime-rs: configuration-dragonball.toml.in: Add config for balloon Add enable_balloon_f_reporting config to configuration-dragonball.toml.in. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 17:25:38 +08:00
Hui Zhu	9c1b5238b3	kernel/configs: Add balloon and f_reporting to dragonball-experimental Add CONFIG_PAGE_REPORTING, CONFIG_BALLOON_COMPACTION and CONFIG_VIRTIO_BALLOON to dragonball-experimental configs to open dragonball function and free page reporting function. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 17:25:30 +08:00
Hui Zhu	ad9968ce2d	runtime-rs: Add enable_balloon_f_reporting for dragonball Under normal circumstances, the virtual machine only requests memory from the host and does not actively release it back to host when it is no longer needed, leading to a waste of memory resources. Free page reporting is a sub-feature of virtio-balloon. When this feature is enabled, the Linux guest kernel will send information about released pages to dragonball via virtio-balloon, and dragonball will then release these pages. This commit adds an option enable_balloon_f_reporting to runtime-rs. When this option is enabled, runtime-rs will insert a virtio-balloon device with the f_reporting option enabled during the Dragonball virtual machine startup. Signed-off-by: Hui Zhu <teawater@antgroup.com>	2024-09-04 16:38:13 +08:00
Fabiano Fidêncio	13517cf9c1	Merge pull request #10192 from fidencio/topic/helm-add-post-delete-job helm: Several fixes, including some reasonable re-work on kata-deploy.sh script	2024-09-04 09:34:57 +02:00
Paul Meyer	3be719c805	virtcontainers: allow specifying nydus-overlayfs binary by path ...or by using a binary with additional suffix. This allows having multiple versions of nydus-overlayfs installed on the host, telling nydus-snapshotter which one to use while still detecting Nydus is used. Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>	2024-09-04 08:29:40 +02:00
Chengyu Zhu	f0066568eb	Merge pull request #10233 from ChengyuZhu6/cdh-instance agent:cdh: Refactor CDHClient usage and initialization	2024-09-04 13:34:36 +08:00
Silenio Quarti	9e1388728e	runtime: fix bad default machine_type for remote hypervisor Fixes: #10249 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-09-03 20:53:19 -04:00
GabyCT	c2774b09dd	Merge pull request #10247 from GabyCT/topic/removereportm metrics: Remove metrics report for Kata Containers	2024-09-03 15:10:04 -06:00
Fabiano Fidêncio	bb9bcd886a	kata-deploy: Add reset_cri_runtime() This will help to avoid code duplication on what's needed on the helm and non-helm cases. The reason it's not been added as part of the commit which adds the post-delete hook is simply for helping the reviewer (as the diff would be less readable with this change). Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	a773797594	ci: Pass --debug to helm Just to make our lives a little bit easier. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	64ccb1645d	helm: Add a post-delete hook Instead of using a lifecycle.preStop hook, as done when we're using the helm chart, let's add a post-delete hook to take care of properly cleaning up the node when uninstalling kata-deploy. The reason why the lifecycle.preStop hook would never work in our case is simply because each helm chart operation follows the Kubernetes "declarative" approach, meaning that an operation won't wait for its previous operation to successfully finish before being called, leading to us trying to access content that's defined by our RBAC, in an operation that was started before our RBAC was deleted, but having the RBAC being deleted before the operation actually started. Unfortunately this hook brings in some code duplication, mainly related to the RBAC parts, but that's not new as the same happens with our daemonset. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-09-03 23:08:22 +02:00
Wainer dos Santos Moschetta	3b23d62635	tests/k8s: fix wait for pods on deploy-kata action On commit `51690bc157` we switched the installation from kubectl to helm and used its `--wait` expecting the execution would continue when all kata-deploy Pods were Ready. It turns out that there is a limitation on helm install that won't wait properly when the daemonset is made of a single replica and maxUnavailable=1. In order to fix that issue, let's revert the changes partially to keep using kubectl and waitForProcess to pause the execution while Pods aren't Running. Fixes #10168 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	40f8aae6db	Reapply "ci: make cleanup_kata_deploy really simple" This reverts commit `21f9f01e1d`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	cfe6e4ae71	Reapply "ci: Use helm to deploy kata-deploy" (partially) This reverts commit `36f4038a89`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
Fabiano Fidêncio	424347bf0e	Reapply "kata-deploy: Add Helm Chart" (partially) This reverts commit `b18c3dfce3`, as the patches for helm are coming as part of this series. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 23:08:22 +02:00
ChengyuZhu6	77521cc8d2	agent:cdh: introduce a function to check initialization of cdh client introduce a function to check initialization of cdh client. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:52:50 +08:00
ChengyuZhu6	07e0e843e8	agent:cdh: switch to the new method for initializing cdh client Decouple the cdh client from AgentService and refactor cdh client usage and initialization. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:51:55 +08:00
ChengyuZhu6	bc8156c3ae	agent:cdh: Refactor cdh client methods for better integration Move `unseal_env` and `secure_mount` functions on the global `CDH_CLIENT` instance to access the CDH client. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:51:54 +08:00
ChengyuZhu6	0ad35dc91b	agent:cdh: Initialize CDH client as a global asynchronous instance Introduced a global `CDH_CLIENT` instance to hold the cdh client and implemented `init_cdh_client` function to initialize the cdh client if not already set. Fixes: #10231 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-04 04:49:54 +08:00
Gabriela Cervantes	5b0ab7f17c	metrics: Remove metrics report for Kata Containers This PR removes the metrics report which is no longer being used in Kata Containers. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-03 16:11:07 +00:00
Hyounggyu Choi	1cefa48047	gha: Add necessary steps for KBS enablement The following steps are required for enabling KBS: - Set environment variables `KBS` and `KBS_INGRESS` - Uninstall and install `kbs-client` - Deploy KBS This commit adds the above steps to the existing workflow for `qemu-coco-dev`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-03 16:26:12 +02:00
Hyounggyu Choi	b0a912b8b4	tests: Enable KBS deployment for qemu-coco-dev on s390x To deploy KBS on s390x, the environment variable `IBM_SE_CREDS_DIR` must be exported, and the corresponding directory must be created. This commit enables KBS deployment for `qemu-coco-dev`, in addition to the existing `qemu-se` support on the platform. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-09-03 15:51:18 +02:00
Fabiano Fidêncio	057612f18f	Merge pull request #10238 from fidencio/topic/remove-stdio-test ci: Remove stdio tests	2024-09-03 14:50:46 +02:00
ChengyuZhu6	0d519162b5	agent:storage: Refactor storage handler registration - Added `driver_types` method to `StorageHandler` trait to return driver types managed by each handler. - Implemented driver_types method for all storage handlers. - Updated `STORAGE_HANDLERS` initialization to use `driver_types` for handler registration. Fixes: #10242 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-03 18:38:52 +08:00
ChengyuZhu6	e47eb0d7d4	kata-types:mount: support registering multiple IDs to a single handler - Updated the `add_handler` function in `StorageHandlerManager` to accept a slice of IDs (`&[&str]`) instead of a single ID (`&str`). This change allows a single handler to be registered for multiple storage device types. - Refactored calls to `add_handler` in `Storage` of kata-agent to use the new function, passing arrays of storage drivers instead of single driver. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-09-03 18:38:36 +08:00
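Registering one handler under several IDs, as the commit above describes for `add_handler`, can be pictured with the Python sketch below. It mirrors the idea only: the real type lives in the Rust `kata-types` crate, and the class layout, the `handler` lookup method, the duplicate check, and the ID strings used here are illustrative assumptions.

```python
class HandlerManager:
    """Toy registry mapping several IDs to one shared handler (illustrative)."""

    def __init__(self):
        self._handlers = {}

    def add_handler(self, ids, handler):
        """Register `handler` under every ID in `ids`.

        Rejecting already-registered IDs is an assumption of this sketch.
        """
        for id_ in ids:
            if id_ in self._handlers:
                raise KeyError(f"handler for {id_!r} already registered")
        for id_ in ids:
            self._handlers[id_] = handler

    def handler(self, id_):
        """Return the handler registered for `id_`, or None if absent."""
        return self._handlers.get(id_)
```

Checking all IDs before inserting any keeps a failed registration from leaving the registry half-updated.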
Fabiano Fidêncio	e8657c502d	Revert "CI: Add tests for stdio" This reverts commit `704da86e9b`, as the tests never became stable to run. This was discussed and agreed with the maintainer. Conflicts: .github/workflows/basic-ci-amd64.yaml tests/integration/stdio/gha-run.sh Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-03 11:52:30 +02:00
Greg Kurz	4698235e59	Merge pull request #10204 from fidencio/topic/kata-deploy-add-installation-prefix kata-deploy: helm: Add INSTALLATION_PREFIX	2024-09-03 09:26:51 +02:00
Fabiano Fidêncio	e1d3fb8c00	Merge pull request #10236 from fidencio/topic/bump-image-rs-to-properly-handle-gzip-whiteouts agent: Update image-rs to 02af65abc	2024-09-02 21:43:19 +02:00
Fabiano Fidêncio	0cb93ed1bb	kata-deploy: helm: Add INSTALLATION_PREFIX option This will allow users to properly set the INSTALLATION_PREFIX when deploying Kata Containers. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 20:25:22 +02:00
Gabriela Cervantes	c2aa288498	gha: Increase time to run Kata CoCo stability tests This PR increases the time to run the Kata CoCo stability tests as these tests are designed to run for more than 2 hours. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-02 16:40:47 +00:00
Gabriela Cervantes	825cb2d22e	tests: Enable k8s soak stability test for Kata CoCo CI This PR enables the k8s soak stability test to run on the weekly Kata CoCo stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-09-02 16:30:44 +00:00
Fabiano Fidêncio	1309c49c09	agent: Update image-rs to 02af65abc As this brings in proper support to handle gzip whiteouts. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 14:15:04 +02:00
Fabiano Fidêncio	7be77ebee5	kata-deploy: helm: Stop mounting /opt/kata It's simply easier if we just use /host/opt/kata instead in our scripts, which will greatly simplify the logic of adding an INSTALLATION_PREFIX later on. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 09:38:51 +02:00
Fabiano Fidêncio	6ce5e62c48	kata-deploy: Add a $dest_dir var As we build our binaries with the `/opt/kata` prefix, that's the value of $dest_dir. Later in this series it'll become handy, as we'll introduce a way to install the Kata Containers artefacts in a different location. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-09-02 09:36:33 +02:00
Fabiano Fidêncio	ef5a5ea26e	Merge pull request #10038 from sprt/move-free-runner-iii ci: Transition GARM tests to free runners, pt. III	2024-08-31 01:29:08 +02:00
Gabriela Cervantes	19d8f11345	versions: Update firecracker version to 1.8.0 This PR updates the firecracker version to 1.8.0 which includes the following changes: - Added ACPI support to Firecracker for x86_64 microVMs. Currently, we pass ACPI tables with information about the available vCPUs, interrupt controllers, VirtIO and legacy x86 devices to the guest. This allows booting kernels without MPTable support. Please see our kernel policy documentation for more information regarding relevant kernel configurations. - Added support for the Virtual Machine Generation Identifier (VMGenID) device on x86_64 platforms. VMGenID is a virtual device that allows VMMs to notify guests when they are resumed from a snapshot. Linux includes VMGenID support since version 5.18. It uses notifications from the device to reseed its internal CSPRNG. Please refer to snapshot support and random for clones documentation for more info on VMGenID. VMGenID state is part of the snapshot format of Firecracker. As a result, Firecracker snapshot version is now 2.0.0. - Changed T2CL template to pass through bit 27 and 28 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO and RFDS_CLEAR) since KVM considers that they can be passed through and T2CL isn't designed for secure snapshot migration between different processors. - Avoid setting kvm_immediate_exit to 1 if we are already handling an exit, or if the vCPU is stopped. This avoids a spurious KVM exit upon restoring snapshots. - Changed T2S template to set bit 27 of MSR_IA32_ARCH_CAPABILITIES (RFDS_NO) to 1 since it assumes that the fleet only consists of processors that are not affected by RFDS. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-30 20:49:29 +00:00
Aurélien Bombo	886b3047ac	Merge pull request #10222 from microsoft/danmihai1/log-level-false-positives agent: avoid policy.txt log without debug enabled	2024-08-30 10:09:04 -07:00
Alex Lyn	4fd4b02f2e	Merge pull request #10228 from GabyCT/topic/removeionednn metrics: Remove unused variable in oneDNN benchmark	2024-08-30 09:31:14 +08:00
Gabriela Cervantes	aa8635727d	metrics: Remove unused variable in oneDNN benchmark This PR removes an unused variable in oneDNN metrics benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-29 15:52:47 +00:00
Alex Lyn	8241423ba5	Merge pull request #10224 from amshinde/update-image-rs-xattr agent: image-rs: check xattrs for image unpacking	2024-08-29 09:33:22 +08:00
GabyCT	dd9f41547c	Merge pull request #10160 from microsoft/saulparedes/support_priority_class genpolicy: add priorityClassName as a field in PodSpec interface	2024-08-28 14:36:20 -06:00
GabyCT	394480e7ff	Merge pull request #10221 from GabyCT/topic/addopendmmread docs: Add oneDNN benchmark information to metrics README	2024-08-28 14:22:22 -06:00
GabyCT	83b031ca7a	Merge pull request #10214 from GabyCT/topic/ciweekly gha: Add GHA workflow to run Kata CoCo stability tests	2024-08-28 11:46:29 -06:00
Archana Shinde	c747852bce	agent: image-rs: check xattrs for image unpacking This commit includes a fix for pulling an image on platforms that do not support xattr. Some platforms/file-systems do not support xattrs, this would make the image pull fail because of failing to set xattr. This commit will check whether the target path supports xattr. If yes, the unpacking will maintain xattrs; if not, it will not set xattrs. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-28 00:02:46 -07:00
Archana Choudhary	ae2cdedba8	genpolicy: add priorityClassName as a field in PodSpec interface This allows generation of policy for pods specifying priority classes. Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-08-27 19:54:02 -07:00
Dan Mihai	aa8bdbde5a	agent: avoid policy.txt log without debug enabled slog's is_enabled() is documented as: - "best effort", and - Sometimes resulting in false positives. Use AGENT_CONFIG.log_level.as_usize() instead, to avoid those false positives. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-28 02:33:56 +00:00
Aurélien Bombo	de98e467b4	ci: Use `ubuntu-22.04` instead of `ubuntu-latest` 22.04 is the default today: `23da668261/README.md` Being more specific will avoid unexpected errors when Github updates the default. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:44:39 +00:00
Aurélien Bombo	ceab66b1ce	ci: Run `build-checks-depending-on-kvm` for free Also keeps the Rust installation step even though it's preinstalled, so that we use the version specified in versions.yaml. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:59 +00:00
Aurélien Bombo	b4ce84b9d2	ci: Move `run-runk` to free runner No change other than switching the runner - no dependency issue expected. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:33 +00:00
Aurélien Bombo	645aaa6f7f	ci: Move `run-monitor` to free runner No change other than switching the runner - no dependency issue expected. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-27 16:43:33 +00:00
Gabriela Cervantes	3affde5b28	docs: Add oneDNN benchmark information to metrics README This PR adds the oneDNN benchmark information to the machine learning metrics README. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-27 16:32:50 +00:00
Dan Mihai	9f6f5dac4b	Merge pull request #10037 from sprt/reinstate-mariner-host ci: reinstate Mariner host and guest kernel	2024-08-27 08:24:51 -07:00
Alex Lyn	f24983b3cf	Merge pull request #10210 from l8huang/cold-vf runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint	2024-08-27 15:23:55 +08:00
Alex Lyn	3a749cfb44	Merge pull request #10212 from squarti/remote-machine-type runtime: Allow machine_type in kata config for remote hypervisors	2024-08-27 14:05:36 +08:00
Aurélien Bombo	a3dba3e82b	ci: reinstate Mariner host GH-9592 addressed a bug in a previous version of the AKS Mariner host kernel that blocked the CH v39 upgrade. This bug has now been fixed so we undo that PR. Note we also specify a different OCI version for Mariner as it differs from Ubuntu's. Fixes: #9594 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-08-26 21:07:25 +00:00
Gabriela Cervantes	3a14b04621	gha: Fix entry for ci coco stability yaml This PR fixes the entry or use of the ci weekly GHA workflow to run properly the weekly k8s tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-26 17:14:35 +00:00
Gabriela Cervantes	95f6246858	gha: Add GHA workflow to run Kata CoCo stability tests This PR adds a GHA workflow to run Kata CoCo weekly stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-26 17:05:21 +00:00
Silenio Quarti	11ba8f05ca	runtime: Allow machine_type in kata config for remote hypervisors Fixes: #10211 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-26 10:17:40 -04:00
Lei Huang	70168a467d	runtime: check if cold_plug_vfio is enabled before create PhysicalEndpoint PhysicalEndpoint unbinds its VF interface and rebinds it as a VFIO device, then cold-plugs the VFIO device into the guest kernel. When `cold_plug_vfio` is set to "no-port", cold-plugging the VFIO device will fail. This change checks if `cold_plug_vfio` is enabled before creating PhysicalEndpoint to avoid unnecessary VFIO rebind operations. Fixes: #10162 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-08-23 15:42:17 -07:00
GabyCT	6b0272d6bf	Merge pull request #10193 from GabyCT/topic/k8ssoak stability: Add kubernetes parallel test	2024-08-23 15:51:01 -06:00
GabyCT	83177efb9b	Merge pull request #10201 from GabyCT/topic/readmeopenvino metrics: Add OpenVINO general information into README	2024-08-23 14:11:26 -06:00
Bo Chen	a0bd78b358	Merge pull request #10205 from likebreath/0819/upgrade_clh_v41.0 Upgrade to Cloud Hypervisor v41.0	2024-08-23 10:01:41 -07:00
Hyounggyu Choi	169b4490d2	Merge pull request #10209 from fidencio/topic/kata-manager-avoid-rate-pull-limit kata-manager: Avoid docker rate-limit	2024-08-23 12:52:14 +02:00
Fabiano Fidêncio	7f0289de60	kata-manager: Avoid docker rate-limit To do so, use a test image from quay.io instead of docker.io. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-23 11:56:09 +02:00
Fabiano Fidêncio	45f69373a6	Merge pull request #10199 from BbolroC/make-cdh-api-timeout-configurable agent/config: Make CDH_API_TIMEOUT configurable	2024-08-23 11:04:10 +02:00
Hyounggyu Choi	4cd83d2b98	Merge pull request #10202 from BbolroC/fix-k8s-tests-s390x tests: Fix k8s test issues on s390x	2024-08-23 09:51:11 +02:00
Fabiano Fidêncio	11bb9231c2	Merge pull request #10207 from amshinde/remove-image-check-cc Revert "tests: add image check before running coco tests"	2024-08-23 09:33:39 +02:00
Alex Lyn	44bf7ccb46	Merge pull request #10141 from soulfy/fix-delete-failed agent: kill child process when console socket closed	2024-08-23 14:00:53 +08:00
Archana Shinde	b0be03a93f	Revert "tests: add image check before running coco tests" This reverts commit `41b7577f08`. We were seeing a lot of issues in the TDX CI of the nature: "Error: failed to create containerd container: create instance 470: object with key "470" already exists: unknown" With the TDX CI, we moved to having the nydus snapshotter pre-installed. Essentially the `deploy-snapshotter` step was performed once before any actual CI runs. We were seeing failures related to the error message above. On reverting this change, we are no longer seeing errors related to "key exists" with the TDX CI passing now. The change reverted here is related to downloading incomplete images, but this seems to be messing up TDX CI. It is possible to pass --snapshotter to `ctr image check` but that does not seem to have any effect on the data set returned. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-22 18:05:42 -07:00
Bo Chen	254f8bca74	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v41.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #10203 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-08-22 11:05:54 -07:00
Bo Chen	e69535326d	versions: Upgrade to Cloud Hypervisor v41.0 Details of this release can be found in our roadmap project as iteration v41.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #10203 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-08-22 11:02:26 -07:00
Gabriela Cervantes	2fa8e85439	metrics: Add OpenVINO general information into README This PR adds the OpenVINO benchmark general information into the machine learning README metrics information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-22 16:08:06 +00:00
Hyounggyu Choi	274de8c6af	tests: Introduce wait_time to k8s_create_pod() In certain environments (e.g., those with lower performance), `k8s_create_pod()` may require additional wait time, especially when dealing with large images. Since `k8s_wait_pod_be_ready()` — which is called by `k8s_create_pod()` — already accepts `wait_time` as a second argument, it makes sense to introduce `wait_time` to `k8s_create_pod()` and propagate it to the callee. This commit adds `wait_time` to `k8s_create_pod()` as the 2nd (optional) argument. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 17:46:53 +02:00
Hyounggyu Choi	5d7397cc69	tests: Load confidential_kbs.sh in k8s-guest-pull-image.bats Some of the tests call set_metadata_annotation() for updating the kernel parameters. For `kata-qemu-se`, repack_secure_image() is called which is defined in `lib_se.sh` and sourced by `confidential_kbs.sh`. This commit ensures that the function call chain for the relevant `KATA_HYPERVISOR` is properly handled. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 17:33:38 +02:00
Fabiano Fidêncio	890fa26767	Merge pull request #10196 from fidencio/topic/ci-commit-message-take-reapply-into-consideration ci: commit-message-check: Take re-revert into consideration	2024-08-22 17:31:27 +02:00
Fabiano Fidêncio	2f6edc4b9b	Merge pull request #10194 from fidencio/topic/kata-deploy-re-work-logic kata-deploy: Rework the logic a little bit	2024-08-22 16:46:36 +02:00
Hyounggyu Choi	baa8af3f8e	doc: Update how-to-set-sandbox-config-kata.md This commit adds a row for `cdh_api_timeout` to the agent options in how-to-set-sandbox-config-kata.md. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:50:51 +02:00
Hyounggyu Choi	7d0aba1a24	runtime: Enable to get cdh_api_timeout from configuration file This commit allows `cdh_api_timeout` to be configured from the configuration file. The configuration is commented out with specifying a default value (50s) because the default value is configured in the agent. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:47:37 +02:00
Hyounggyu Choi	8615516823	agent: Add agent.cdh_api_timeout to README This commit adds an explanation for `cdh_api_timeout` to the README file. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 14:47:37 +02:00
Fabiano Fidêncio	a9a1345a31	kata-deploy: Print the action the script was invoked with This increases debuggability. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-22 14:32:33 +02:00
Fabiano Fidêncio	ab493b6028	kata-deploy: Move general logic to the correct actions Otherwise we may end up running into unexpected issues when calling the cleanup option, as the same checks would be done, and files could end up being copied again, overwriting the original content which was backed up by the install option. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-22 14:32:29 +02:00
Fabiano Fidêncio	6596012956	kata-deploy: Simplify check for runtime Let's write the runtime check in a shorter and simpler to read form. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-22 14:32:02 +02:00
Hyounggyu Choi	2512ddeab2	agent/cdh: Use AGENT_CONFIG.cdh_api_timeout for CDH_API_TIMEOUT This commit updates CDH_API_TIMEOUT to use AGENT_CONFIG.cdh_api_timeout and changes it from a `const` to `lazy_static` to accommodate runtime-determined values. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 10:09:16 +02:00
Hyounggyu Choi	6139e253a0	agent/config: Add cdh_api_timeout to AgentConfig To make the `cdh_api_timeout` variable configurable, it has been added to the `AgentConfig` structure. This change includes storing the variable as a `time::Duration` type and generalizing the existing `hotplug_timeout` code to handle both timeouts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-22 10:09:16 +02:00
GabyCT	3fd108b09a	Merge pull request #10198 from GabyCT/topic/remvaropenvino metrics: Remove unused variable in openvino script	2024-08-21 15:48:56 -06:00
Dan Mihai	8ccc8a8d0b	Merge pull request #9911 from microsoft/saulparedes/mounts genpolicy: deny UpdateEphemeralMountsRequest	2024-08-21 10:12:28 -07:00
Gabriela Cervantes	59e31baaee	metrics: Remove unused variable in openvino script This PR removes an unused variable in the openvino script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-21 16:05:55 +00:00
Greg Kurz	09a13da8ec	Merge pull request #10197 from beraldoleal/release-3.8 release: Bump VERSION to 3.8.0	2024-08-21 17:50:10 +02:00
Beraldo Leal	55bdb380fb	release: Bump VERSION to 3.8.0 Let's start the 3.8.0 release. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-08-21 10:24:07 -04:00
Gabriela Cervantes	27d5539954	stability: Add pod deployment yaml for soak test This PR adds the pod deployment yaml for soak test which is part of the stability k8s tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-21 14:23:22 +00:00
Fabiano Fidêncio	3fd021a9b3	ci: commit-message-check: Take re-revert into consideration `Reapply "` should be taken into consideration as well. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 14:19:16 +02:00
Fabiano Fidêncio	f071c8cada	Merge pull request #10191 from fidencio/topic/ci-temporarily-revert-helm-usage ci: Let's temporarily revert the helm charts usage in our CI	2024-08-21 10:52:23 +02:00
Dan Mihai	6654491cc3	genpolicy: deny UpdateEphemeralMountsRequest * genpolicy: deny UpdateEphemeralMountsRequest Deny UpdateEphemeralMountsRequest by default, because paths to critical Guest components can be redirected using such request. Signed-off-by: Dan Mihai <Daniel.Mihai@microsoft.com>	2024-08-20 18:28:17 -07:00
Gabriela Cervantes	c04a805215	stability: Add kubernetes parallel test This PR adds a kubernetes parallel test that will launch multiple replicas from a kubernetes deployment and we will iterate this multiple times to verify that we are able to do this using CoCo Kata. This test will be part of the CoCo Kata stability CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 23:24:22 +00:00
Fabiano Fidêncio	b18c3dfce3	Revert "kata-deploy: Add Helm Chart" (partially) This partially reverts commit `94b3348d3c`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
Fabiano Fidêncio	36f4038a89	Revert "ci: Use helm to deploy kata-deploy" (partially) This partially reverts commit `51690bc157`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
Fabiano Fidêncio	21f9f01e1d	Revert "ci: make cleanup_kata_deploy really simple" This reverts commit `1221ab73f9`, as there's more work needed in order to have this one done in a robust way, and we are taking the safer path of reverting for now, and adding it back as soon as the release is cut out. Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>	2024-08-21 00:09:11 +02:00
GabyCT	e0bff7ed14	Merge pull request #10177 from GabyCT/topic/cocoghas gha: Add k8s stability Kata CoCo GHA workflow	2024-08-20 15:12:29 -06:00
Gabriela Cervantes	ca3d778479	gha: Add Kata CoCo Stability workflow This PR adds the Kata CoCo Stability workflow that will setup the environment to run the k8s tests on a non-tee environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 16:34:33 +00:00
Gabriela Cervantes	3ebaa5d215	gha: Add Kata CoCo stability weekly yaml This PR adds the Kata CoCo stability weekly yaml that will trigger weekly the k8s stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-20 16:32:03 +00:00
Fabiano Fidêncio	aeb6f54979	Merge pull request #10180 from fidencio/topic/ci-ensure-the-key-was-created-on-kbs ci: Ensure the KBS resources are created	2024-08-20 09:07:56 +02:00
Fabiano Fidêncio	40d385d401	Merge pull request #10188 from wainersm/kbs_key tests/k8s: check and save kbs.key	2024-08-19 23:29:10 +02:00
Fabiano Fidêncio	c0d7222194	ci: Ensure the KBS resources are created Otherwise we may have tests failing due to the resource not being created yet. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-19 23:27:06 +02:00
Wainer dos Santos Moschetta	e014eee4e8	tests/k8s: check and save kbs.key The deploy-kbs.sh script generates the kbs.key that's used to install KBS. This same file is used later by kbs-client to authenticate. This ensures that the file was created, otherwise fail. Another problem solved here is that on bare-metal machines the key doesn't survive a reboot as it is created in a temporary directory (/tmp/trustee). So let's save the file to a non-temporary location. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-19 16:03:03 -03:00
Wainer Moschetta	6a982930e2	Merge pull request #10183 from fidencio/topic/kata-deploy-use-runtime_path kata-deploy: Stop symlinking into /usr/local/bin	2024-08-19 13:17:21 -03:00
Fabiano Fidêncio	42d48efcc2	Merge pull request #10181 from fidencio/topic/ci-fix-stdio-typo ci: stdio: Fix typo on getting the containerd version	2024-08-18 16:05:42 +02:00
Fabiano Fidêncio	e0ae398a2e	Merge pull request #10151 from squarti/rootdir2 runtime: Files are not synced between host and guest VMs	2024-08-18 12:32:52 +02:00
Fabiano Fidêncio	d03b72f19b	kata-deploy: Stop linking binaries to /usr/local/bin Neither CRI-O nor containerd requires that, and removing such symlinks makes everything less intrusive from our side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	c2393dc467	kata-deploy: Use shim's absolute path for crio's runtime_path This will allow us, in the future, not have to do symlinks here and there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	58623723b1	kata-deploy: Use runtime_path for containerd It's already being used with CRI-O, let's simplify what we do and also use this for containerd, which will allow us to do further cleanups in the coming patches. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-18 01:25:12 +02:00
Fabiano Fidêncio	e75c149dec	ci: stdio: Properly start running the test "gha-run.sh" requires a `run` argument in order to run the tests, which seems to be forgotten when the test was added. This PR needs to get merged before the test can successfully run. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 14:41:44 +02:00
Fabiano Fidêncio	dd2d9e5524	ci: stdio: Fix typo on getting the containerd version I assume the PR that introduced this was based on an older version of yq, and as the test couldn't run before it got merged we never noticed the error. However, this test has been failing for a reasonable amount of time, which makes me think that we either need a maintainer for it, or just remove it completely, but that's a discussion for another day. For now, let's make it, at least, run. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 14:06:24 +02:00
Fabiano Fidêncio	7113490cb1	Merge pull request #10179 from fidencio/topic/switch-nginx-image ci: k8s: Replace nginx alpine images	2024-08-17 13:07:31 +02:00
Fabiano Fidêncio	0831081399	ci: k8s: Replace nginx alpine images The previous ones are gone, so let's switch to our own multi-arch image for the tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-17 12:19:33 +02:00
Fabiano Fidêncio	a78d82f4f1	Merge pull request #10159 from squarti/main agent: Handle EINVAL error when umounting container rootfs	2024-08-16 22:07:50 +02:00
Dan Mihai	79c1d0a806	Merge pull request #10136 from microsoft/danmihai1/docker-image-volume2 genpolicy: add bind mounts for image volumes	2024-08-16 13:07:01 -07:00
Fabiano Fidêncio	28aa4314ba	Merge pull request #10175 from ChengyuZhu6/error_message runtime: Add specific error message for gRPC request timeouts	2024-08-16 22:06:49 +02:00
Fabiano Fidêncio	720edbe3fc	Merge pull request #10174 from ChengyuZhu6/install_script tools: install luks-encrypt-storage script by guest-components	2024-08-16 22:04:56 +02:00
Fabiano Fidêncio	7b5da45059	Merge pull request #10178 from fidencio/topic/revert-trustee-bump Revert "version: bump trustee version"	2024-08-16 21:48:30 +02:00
Gabriela Cervantes	6ea34f13e1	gha: Add k8s stability Kata CoCo GHA workflow This PR adds the k8s stability Kata CoCo GHA workflow to run weekly the k8s stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-16 16:14:15 +00:00
Fabiano Fidêncio	45f43e2a6a	Revert "version: bump trustee version" This reverts commit `d35320472c`. Although the commit in question does solve an issue related to the usage of busybox from docker.io, as it's reasonably easy to hit the rate limit, the commit also brings in functionalities that are causing issues in, at least, the TDX CI, such as: ```sh [2024-08-16T16:03:52Z INFO actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 401 259 "-" "attestation-agent-kbs-client/0.1.0" 0.065266 [2024-08-16T16:03:53Z INFO kbs::http::attest] Auth API called. [2024-08-16T16:03:53Z INFO actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000169 [2024-08-16T16:03:54Z INFO kbs::http::attest] Attest API called. [2024-08-16T16:03:54Z INFO verifier::tdx] Quote DCAP check succeeded. [2024-08-16T16:03:54Z INFO verifier::tdx] MRCONFIGID check succeeded. [2024-08-16T16:03:54Z INFO verifier::tdx] CCEL integrity check succeeded. [2024-08-16T16:03:54Z ERROR kbs::http::error] Attestation failed: Verifier evaluate failed: TDX Verifier: failed to parse AA Eventlog from evidence Caused by: at least one line should be included in AAEL ``` Let's revert this for now, and then once we get this one fixed on trustee side we'll update again. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-16 18:10:38 +02:00
Dan Mihai	c22ac4f72c	genpolicy: add bind mounts for image volumes Add bind mounts for volumes defined by docker container images, unless those mounts have been defined in the input K8s YAML file too. For example, quay.io/opstree/redis defines two mounts: /data /node-conf Before these changes, if these mounts were not defined in the YAML file too, the auto-generated policy did not allow this container image to start. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-16 15:11:05 +00:00
Fabiano Fidêncio	b203f715e5	Merge pull request #10170 from beraldoleal/deploy-reset-fix kata-deploy: fix kata-deploy reset	2024-08-16 16:51:14 +02:00
Fabiano Fidêncio	8d63723910	Merge pull request #10161 from microsoft/saulparedes/ignore_role_resource genpolicy: ignore Role resource	2024-08-16 16:50:16 +02:00
Fabiano Fidêncio	6c58ae5b95	Merge pull request #10171 from fidencio/topic/ci-treat-nydus-snapshotter-as-a-dep ci: nydus: Treat the snapshotter as a dependency	2024-08-16 16:39:48 +02:00
ChengyuZhu6	1eda6b7237	tests: update error message with guest pulling image timeout update error message with guest pulling image timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 20:26:33 +08:00
ChengyuZhu6	ca05aca548	runtime: Add specific error message for gRPC request timeouts Improved error handling to provide clearer feedback on request failures. For example: Improve createcontainer request timeout error message from "Error: failed to create containerd task: failed to create shim task:context deadline exceed" to "Error: failed to create containerd task: failed to create shim task: CreateContainerRequest timed out: context deadline exceed". Fixes: #10173 -- part II Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 20:24:48 +08:00
Beraldo Leal	b3a4cd1a06	Merge pull request #10172 from deagon/fix-typo osbuilder: fix typo in ubuntu rootfs depends	2024-08-16 08:01:59 -04:00
Beraldo Leal	b843b236e4	kata-deploy: improve kata-deploy script For the rare cases where containerd_conf_file does not exist, cp could fail and leave the pod in Error state. Let's make it a little bit more robust. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-08-16 07:52:38 -04:00
ChengyuZhu6	aa31a9d3c4	tools: install luks-encrypt-storage script by guest-components Install luks-encrypt-storage script by guest-components. So that we can maintain a single source and prevent synchronization issues. Fixes: #10173 -- part I Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-16 16:28:20 +08:00
Chengyu Zhu	ba3c484d12	Merge pull request #9999 from ChengyuZhu6/trusted-storage Trusted image storage	2024-08-16 15:39:50 +08:00
Fabiano Fidêncio	0f3eb2451e	Merge pull request #10169 from fidencio/topic/revert-reset_runtime-to-cleanup Revert "ci: add reset_runtime to cleanup"	2024-08-16 07:29:58 +02:00
Aurélien Bombo	e1775e4719	Merge pull request #10164 from BbolroC/make-exec_host-stable tests: Ensure exec_host() consistently captures command output	2024-08-15 21:43:32 -07:00
Guoqiang Ding	1d21ff9864	osbuilder: fix typo in ubuntu rootfs depends Remove the duplicate package "xz-utils". Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-08-16 11:33:55 +08:00
Silenio Quarti	5d815ffde1	runtime: Files are not synced between host and guest VMs This PR resolves the default kubelet root dir symbolic link and uses it as the absolute path for the fs watcher regexs Fixes: https://github.com/kata-containers/kata-containers/issues/9986 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-15 23:19:08 -04:00
Silenio Quarti	0dd16e6b25	agent: Handle EINVAL error when umounting container rootfs Container/Sandbox clean up should not fail if root FS is not mounted. This PR handles EINVAL errors when umount2 is called. Fixes: #10166 Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-08-15 19:41:46 -04:00
Fabiano Fidêncio	3733266a60	ci: nydus: Treat the snapshotter as a dependency Instead of deploying and removing the snapshotter on every single run, let's make sure the snapshotter is always deployed in the TDX case. We're doing this as an experiment, in order to see if we'll be able to reduce the failures we've been facing with the nydus snapshotter. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-15 22:44:30 +02:00
Hyounggyu Choi	ba3e5f6b4a	Revert "tests: Disable k8s file volume test" This reverts commit `e580e29246`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-15 21:10:39 +02:00
Hyounggyu Choi	758e650a28	tests: Ensure exec_host() consistently captures command output The `exec_host()` function often fails to capture the output of a given command because the node debugger pod is prematurely terminated. To address this issue, the function has been refactored to ensure consistent output capture by adjusting the `kubectl debug` process as follows: - Keep the node debugger pod running - Wait until the pod is fully ready - Execute the command using `kubectl exec` - Capture the output and terminate the pod This commit refactors `exec_host()` to implement the above steps, improving its reliability. Fixes: #10081 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-08-15 21:10:39 +02:00
Beraldo Leal	74662a0721	Merge pull request #10137 from hex2dec/fix-image-warning tools: Fix container image build warning	2024-08-15 14:45:41 -04:00
Dan Mihai	905c76bd47	Merge pull request #10153 from microsoft/saulparedes/support_cron_job genpolicy: Add support for cron jobs	2024-08-15 11:11:00 -07:00
Aurélien Bombo	0223eedda5	Merge pull request #10050 from burgerdev/request-hardening genpolicy: hardening some agent requests	2024-08-15 08:31:21 -07:00
Fabiano Fidêncio	1f6a8baaf1	Revert "ci: add reset_runtime to cleanup" This reverts commit `8d9bec2e01`, as it causes issues in the operator and kata-deploy itself, leaving the node NotReady. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-15 16:09:34 +02:00
ChengyuZhu6	5f4209e008	agent:README: add secure_image_storage_integrity to agent's README add secure_image_storage_integrity to agent's README. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	6ecb2b8870	tests: skip test trusted storage in qemu-coco-dev I can't set up loop device with `exec_host`, which the command is necessary for qemu-coco-dev. See issue #10133. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	51b9d20d55	tests: update error message in pulling image encrypted tests Update error message in pulling image encrypted to "failed to get decrypt key no suitable key found for decrypting layer key". Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:44 +08:00
ChengyuZhu6	b4d10e7655	version: update the version of coco-guest-components update the version of coco-guest-components. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 20:32:43 +08:00
Fupan Li	365df81d5e	Merge pull request #10148 from lifupan/main_sandboxapi runtime-rs: Add the wait_vm support for hypervisors	2024-08-15 17:08:38 +08:00
ChengyuZhu6	a9b436f788	agent:cdh: Introduces secure_mount API in cdh Introduces `secure_mount` API in the cdh. It includes: - Adding the `SecureMountServiceClient`. - Implementing the `secure_mount` function to handle secure mounting requests. - Updating the confidential_data_hub.proto file to define SecureMountRequest and SecureMountResponse messages and adding the SecureMountService service. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:23 +08:00
ChengyuZhu6	1528d543b2	agent:cdh: Rename sealed_secret API namespace to confidential_data_hub renames the sealed_secret.proto file to confidential_data_hub.proto and updates the corresponding API namespace from sealed_secret to confidential_data_hub. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:23 +08:00
ChengyuZhu6	37bd2406e0	docs: add content about how to pull large image Add content about how to pull large image in the guest with trust storage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
ChengyuZhu6	c5a973e68c	tests:k8s: add tests for guest pull with configured timeout add tests for guest pull with configured timeout: 1) failed case: Test we cannot pull a large image whose pull time exceeds a short createcontainer timeout (10s) inside the guest 2) successful case: Test we can pull a large image inside the guest with an increased createcontainer timeout (120s) Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
ChengyuZhu6	6c506cde86	tests:k8s: add tests for pull images in the guest using trusted storage add tests for pull images in the guest using trusted storage: 1) failed case: Test we cannot pull an image that exceeds the memory limit inside the guest 2) successful case: Test we can pull an image inside the guest using trusted ephemeral storage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-15 13:55:22 +08:00
GabyCT	ecfbc9515a	Merge pull request #10158 from GabyCT/topic/k8sstabil tests: Add kubernetes stability test	2024-08-14 14:44:49 -06:00
Saul Paredes	5ad47b8372	genpolicy: ignore Role resource Ignore Role resources because they don't need a Policy. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-14 12:57:06 -07:00
Gabriela Cervantes	d48ad94825	tests: Add kubernetes stability test This PR adds a k8s stability test that will be part of the CoCo Kata stability tests that will run weekly. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-14 15:30:49 +00:00
Fupan Li	cadcf5f92d	runtime-rs: Add the wait_vm support for hypervisors Add the wait_vm method for hypervisors. This is a prerequisite for sandbox api support. Fixes: #7043 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-08-14 12:01:34 +08:00
Fupan Li	506977b102	Merge pull request #10156 from GabyCT/topic/disablevolume tests: Disable k8s file volume test	2024-08-14 12:00:47 +08:00
GabyCT	b0b6a1baea	Merge pull request #10154 from GabyCT/topic/stressk8s tests: Add kubernetes stress-ng tests	2024-08-13 15:09:59 -06:00
Gabriela Cervantes	e580e29246	tests: Disable k8s file volume test This PR disables the k8s file volume test as we are having random failures in multiple GHA CIs mainly because the exec_host function sometimes does not work properly. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-13 20:50:18 +00:00
Saul Paredes	af598a232b	tests: add test for cron job support Add simple test for cron job support Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-13 10:47:42 -07:00
Saul Paredes	88451d26d0	genpolicy: add support for cron jobs Add support for cron jobs Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-13 10:47:42 -07:00
Gabriela Cervantes	bdca5ca145	tests: Add kubernetes stress-ng tests This PR adds kubernetes stress-ng tests as part of the stability testing for kata. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-13 16:23:52 +00:00
Fabiano Fidêncio	99730256a2	Merge pull request #10149 from fidencio/topic/kata-manager-relax-opt-check kata-manager: Only check files when tarball is not passed	2024-08-13 16:26:16 +02:00
Markus Rudy	bce5cb2ce5	genpolicy: harden CreateSandboxRequest checks Hooks are executed on the host, so we don't expect to run hooks and thus require that no hook paths are set. Additional Kernel modules expand the attack surface, so require that none are set. If a use case arises, modules should be allowlisted via settings. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-13 09:01:58 +02:00
Markus Rudy	aee23409da	genpolicy: harden CopyFileRequest checks CopyFile is invoked by the host's FileSystemShare.ShareFile function, which puts all files into directories with a common pattern. Copying files anywhere else is dangerous and must be prevented. Thus, we check that the target path prefix matches the expected directory pattern of ShareFile, and that this directory is not escaped by .. traversal. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-13 09:01:58 +02:00
soulfy	722b576eb3	agent: kill child process when console socket closed When using the debug console, the shell running in the child process may not exit in some scenarios, e.g. when pressing Ctrl-C on the host to terminate the kata-runtime process. This blocks the task handling the console connection while it waits for the child to exit. Signed-off-by: soulfy <liukai254@jd.com>	2024-08-13 10:18:03 +08:00
Steve Horsman	91084058ae	Merge pull request #10007 from wainersm/run_k8s_on_free_runners ci: Transition GARM tests to free runners, pt. II	2024-08-12 18:12:18 +01:00
Fabiano Fidêncio	5fe65e9fc2	kata-manager: Only check files when tarball is not passed Only do the checking in case the tarball was not explicitly passed by the user. We have no control of what's passed and we cannot expect that all the files are going to be under /opt. Fixes: #10147 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-12 13:54:24 +02:00
ChengyuZhu6	c3a0ab4b93	tests:k8s: Re-enable and refactor the tests with guest pull Currently, setting `io.containerd.cri.runtime-handler` annotation in the yaml is not necessary for pulling images in the guest. All TEE hypervisors are already running tests with guest-pulling enabled. Therefore, we can remove some duplicate tests and re-enable the guest-pull test for running different runtime pods at the same time. To support different containerd versions, I recommend continuing to set "io.containerd.cri.runtime-handler". Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	47be9c7c01	osbuilder:rootfs: install init_trusted_storage script Install init_trusted_storage script if MEASURED_ROOTFS is enabled. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: Anand Krishnamoorthi <anakrish@microsoft.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	df993b0f88	agent:rpc: initialize trusted storage device Initialize the trusted storage when the device is defined as "/dev/trusted_store", using a shell script as the first step. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2024-08-12 16:36:54 +08:00
ChengyuZhu6	94347e2537	agent:config: Support secure_storage_integrity option for trusted storage After enabling secure storage integrity for trusted storage, initialization takes more time, so the option is NOT enabled by default. This config allows users to enable it if they need stricter security. Fixes: #8142 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2024-08-12 16:36:54 +08:00
GabyCT	775f6bdc5c	Merge pull request #10142 from GabyCT/topic/updatestress tests: Update ubuntu image for stress Dockerfile	2024-08-09 16:11:35 -06:00
Gabriela Cervantes	5e5fc145cd	tests: Update ubuntu image for stress Dockerfile This PR updates the ubuntu image for stress Dockerfile. The main purpose is to have a more updated image compared with the one that is in libpod which has not been updated in a while. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-09 15:29:10 +00:00
Steve Horsman	e4c023a9fa	Merge pull request #10140 from stevenhorsman/kata-version-in-artefact-version ci: cache: Include kata version in artefact versions	2024-08-09 11:37:09 +01:00
Fabiano Fidêncio	44b08b84b0	Merge pull request #10113 from Freax13/fix/no-scsi-off qemu: don't emit scsi parameter	2024-08-08 16:23:36 +02:00
stevenhorsman	b6a3a3f8fe	ci: cache: Include kata version in artefact versions - At the moment we aren't factoring in the kata version on our caches, so it means that when we bump this just before release, we don't rebuild components that pull in the VERSION content, so the release build ends up with incorrect versions in its binaries Fixes: #10092 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-08-08 14:58:58 +01:00
GabyCT	584d7a265e	Merge pull request #10127 from GabyCT/topic/execimage tests:k8s: Update image in kubectl debug for the exec host function	2024-08-07 17:00:52 -06:00
Archana Shinde	1012449141	Merge pull request #10129 from hex2dec/qemu-aio-native tools: Support for building qemu with linux aio	2024-08-07 14:32:52 -07:00
Archana Shinde	a6a736eeaf	Merge pull request #10089 from amshinde/enable-nerdctl-clh ci: Enable nerdctl tests for clh	2024-08-07 12:13:00 -07:00
Wainer dos Santos Moschetta	374405aed1	workflows/run-k8s-tests-on-amd64: remove 'instance' from matrix The jobs are all executed on ubuntu-22.04 so it's invariant and can be removed from the matrix (this will shrink the jobs names). Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 16:00:39 -03:00
Wainer dos Santos Moschetta	d11ce129ac	workflows: merge run-k8s-tests-on-garm and run-k8s-tests-with-crio-on-garm Created the run-k8s-tests-on-amd64.yaml which is a merge of run-k8s-tests-on-garm.yaml and run-k8s-tests-with-crio-on-garm.yaml ps: renamed the job from 'run-k8s-tests' to 'run-k8s-tests-on-amd64' so it is easier to find on Github UI and be distinguished from s390x, ppc64le, etc... Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:50:43 -03:00
Wainer dos Santos Moschetta	ed0732c75d	workflows: migrate run-k8s-tests-with-crio-on-garm to free runners Switch to Github managed runners just like the run-k8s-tests-on-garm workflow. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta	3d053a70ab	workflows: migrate run-k8s-tests-on-garm to free runners Switched to Github managed runners. The instance_type parameter was removed and K8S_TEST_HOST_TYPE is set to "all", which combines the tests of "small" and "normal". This way it will reduce the number of jobs by half. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:42 -03:00
Wainer dos Santos Moschetta	dfb92e403e	tests/k8s: add "deploy-kata"/"cleanup" actions to gh-run.sh These new "kata-deploy" and "cleanup" actions are equivalent to "kata-deploy-garm" and "cleanup-garm", respectively, and should be used on the workflows being migrated from GARM to Github's managed runners. Eventually "kata-deploy-garm" and "cleanup-garm" won't be used anymore, and then we will be able to remove them. See: #9940 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-08-07 15:20:23 -03:00
Zhiwei Huang	7270a7ba48	tools: Fix container image build warning All commands within the Dockerfile should use the same casing (either upper or lower).[1] [1]: https://docs.docker.com/reference/build-checks/consistent-instruction-casing/ Signed-off-by: Zhiwei Huang <ai.william@outlook.com>	2024-08-07 15:49:01 +08:00
Dan Mihai	2da77c6979	Merge pull request #10068 from burgerdev/genpolicy-test genpolicy: add crate-scoped integration test	2024-08-06 16:10:46 -07:00
GabyCT	fb166956ab	Merge pull request #10132 from fidencio/topic/support-image-pull-with-nerdctl runtime: image-pull: Make it work with nerdctl	2024-08-06 15:33:40 -06:00
Gabriela Cervantes	d0ca43162d	tests:k8s: Update image in kubectl debug for the exec host function This PR updates the image that we are using in the kubectl debug command as part of the exec host function, as the current alpine image does not allow to create a temporary file for example and creates random kubernetes failures. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-06 21:13:46 +00:00
Fabiano Fidêncio	63802ecdd9	Merge pull request #9880 from zvonkok/helm-chart kata-deploy: Add Helm Chart	2024-08-06 22:55:31 +02:00
Archana Shinde	ba884aac13	ci: Enable nerdctl tests for clh A recent fix should resolve some of the issues seen earlier with clh with the go runtime. Enabling this test to check if the issue is still seen. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-08-06 10:41:42 -07:00
Fabiano Fidêncio	f33f2d09f7	runtime: image-pull: Make it work with nerdctl Our code for handling images being pulled inside the guest relies on a containerType ("sandbox" or "container") being set as part of the container annotations, which is done by the CRI Engine being used, and depending on the used CRI Engine we check for a specific annotation related to the image-name, which is then passed to the agent. However, when running kata-containers without kubernetes, specifically when using `nerdctl`, none of those annotations are set at all. One thing that we can do to allow folks to use `nerdctl`, however, is to take advantage of the `--label` flag, and document on our side that users must pass `io.kubernetes.cri.image-name=$image_name` as part of the label. By doing this, and changing our "fallback" so we can always look for such annotation, we ensure that nerdctl will work when using the nydus snapshotter, with kata-containers, to perform image pulling inside the pod sandbox / guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-06 17:07:45 +02:00
Zvonko Kaiser	8d9bec2e01	ci: add reset_runtime to cleanup Adding reset_cleanup to cleanup action so that it is done automatically without the need to run yet another DS just to reset the runtime. This is now part of the lifecycle hook when issuing kata-deploy.sh cleanup Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	1221ab73f9	ci: make cleanup_kata_deploy really simple Remove the unneeded logic for cleanup; the values are encapsulated in the deployed helm release Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	51690bc157	ci: Use helm to deploy kata-deploy Rather than modifying the kata-deploy scripts, let's use Helm and create a values.yaml that can be used to render the final templates Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zvonko Kaiser	94b3348d3c	kata-deploy: Add Helm Chart For easier handling of kata-deploy we can leverage a Helm chart to get rid of all the base and overlays for the various components Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-08-06 11:57:04 +02:00
Zhiwei Huang	d455883b46	tools: Support for building qemu with linux aio The kata containers hypervisor qemu configuration supports setting block_device_aio="native", but the kata static build of qemu does not add the linux aio feature. The libaio-dev library is a necessary dependency for building qemu with linux aio. Fixes: #10130 Signed-off-by: Zhiwei Huang <ai.william@outlook.com>	2024-08-06 14:30:45 +08:00
Markus Rudy	69535e5458	genpolicy: add crate-scoped integration test Provides a test runner that generates a policy and validates it with canned requests. The initial set of test cases is mostly for illustration and will be expanded incrementally. In order to enable both cross-compilation on Ubuntu test runners as well as native compilation on the Alpine tools builder, it is easiest to switch to the vendored openssl-src variant. This builds OpenSSL from source, which depends on Perl at build time. Adding the test to the Makefile makes it execute in CI, on a variety of architectures. Building on ppc64le requires a newer version of the libz-ng-sys crate. Fixes: #10061 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-05 11:52:01 +02:00
Markus Rudy	4d1416529d	genpolicy: fix clippy v1.78.0 warnings cargo clippy has two new warnings that need addressing: - assigning_clones These were fixed by clippy itself. - suspicious_open_options I added truncate(false) because we're opening the file for reading. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-08-05 11:48:30 +02:00
Fabiano Fidêncio	43dca8deb4	Merge pull request #10121 from microsoft/saulparedes/add_version_flag genpolicy: add --version flag	2024-08-03 21:22:10 +02:00
Fabiano Fidêncio	3b2173c87a	Merge pull request #10124 from fidencio/topic/ci-enable-encrypted-image-tests-for-tees ci: Enable encrypted image tests for TEEs	2024-08-03 11:39:51 +02:00
Fabiano Fidêncio	89f1581e54	ci: Enable encrypted image tests for TEEs After experimenting a little bit with those tests, they seem to be passing on all the available TEE machines. With this in mind, let's just enable them for those machines. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-03 09:27:32 +02:00
Fabiano Fidêncio	3b896cf3ef	Merge pull request #10125 from fidencio/topic/un-break-ci ci: Remove jobs that are not running	2024-08-03 09:27:04 +02:00
Fabiano Fidêncio	62a086937e	ci: Remove jobs that are not running When re-enabling those we'll need a smart way to do so, as this limit of 20 workflows referenced is just ... weird. However, for now, it's more important to add the jobs related to the new platforms than keep the ones that are actively disabled. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-03 09:24:05 +02:00
GabyCT	76af5a444b	Merge pull request #10075 from microsoft/saulparedes/hooks genpolicy: reject create custom hook settings	2024-08-02 15:36:34 -06:00
GabyCT	aadde2c25b	Merge pull request #10120 from kata-containers/fix_metrics_json_results_file Fix metrics json results file	2024-08-02 11:29:02 -06:00
Fabiano Fidêncio	b93a0642e0	Merge pull request #10123 from fidencio/topic/re-enable-arm-ci ci: re-enable arm CI	2024-08-02 17:48:35 +02:00
Dan Mihai	2628b34435	Merge pull request #10098 from microsoft/danmihai1/allow-failing agent: fix the AllowRequestsFailingPolicy functionality	2024-08-02 08:42:47 -07:00
GabyCT	8da5f7a72f	Merge pull request #10102 from ChengyuZhu6/fix-debug tests: Fix error with `kubectl debug`	2024-08-02 09:25:13 -06:00
Fabiano Fidêncio	551e0a6287	Merge pull request #10116 from GabyCT/topic/kbsdependencies tests: kbs: Add missing dependencies to install kbs cli	2024-08-02 14:22:28 +02:00
Fabiano Fidêncio	ed57ef0297	ci; aarch64: Enable builders as part of the CI As we have new runners added, let's enable the builders so we can prevent build failures happening after something gets merged. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 14:13:53 +02:00
Fabiano Fidêncio	388b5b0e58	Revert "ci: Temporarily remove arm64 builds" This reverts commit `e9710332e7`, as there are now 2 arm64-builders (to be expanded to 4 really soon). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 13:53:50 +02:00
Fabiano Fidêncio	08be9c3601	Revert "ci: Temporarily remove arm64 builds -- part II" This reverts commit `c5dad991ce`, as there are now 2 arm64-builders (to be expanded to 4 really soon). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-02 13:52:53 +02:00
Tom Dohrmann	322c80e7c8	qemu: don't emit scsi parameter This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it. Fixes: kata-containers#10112 Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>	2024-08-02 07:30:39 +02:00
Tom Dohrmann	b7999ac765	runtime-rs: don't emit scsi parameter for block devices This parameter has been deprecated for a long time and QEMU 9.1.0 finally removes it. Signed-off-by: Tom Dohrmann <erbse.13@gmx.de>	2024-08-02 07:30:23 +02:00
Fabiano Fidêncio	4183680bc3	Merge pull request #10107 from fidencio/topic/rotate-journal-logs-every-run tests: k8s: Rotate & cleanup journal for every run	2024-08-02 07:27:10 +02:00
Fabiano Fidêncio	302e02aed8	Merge pull request #10114 from fidencio/topic/kata-manager-configure-qemu-and-ovmf-for-tdx kata-manager: Ensure distro specific TDX config is set	2024-08-02 07:24:57 +02:00
Saul Paredes	194cc7ca81	genpolicy: add --version flag - Add --version flag to the genpolicy tool that prints the current version - Add version.rs.in template to store the version information - Update makefile to autogenerate version.rs from version.rs.in - Add license to Cargo.toml Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-08-01 17:18:17 -07:00
David Esparza	dcd0c0b269	metrics: Remove duplicated headers from results file. This PR removes duplicated entries (vcpus count, and available memory), from onednn and openvino results files. Fixes: #10119 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-08-01 18:11:06 -06:00
Dan Mihai	9e99329bef	genpolicy: reject create sandbox hooks Reject CreateSandboxRequest hooks, because these hooks may be used by an attacker. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-01 16:58:35 -07:00
ChengyuZhu6	2eac8fa452	tests: Fix error with `kubectl debug` The issue is similar to #10011. The root cause is that tty and stderr are set to true at same time in containerd: #10031. Fixes: #10081 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-02 07:32:30 +08:00
David Esparza	1e640ec3a6	metrics: fix parsing json results file. This PR encloses the search string for 'default_vcpus =' and 'default_memory =' with double quotes in order to parse the precise values, which are included in the kata configuration file. Fixes: #10118 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-08-01 17:05:03 -06:00
Dan Mihai	c2a55552b2	agent: fix the AllowRequestsFailingPolicy functionality 1. Use the new value of AllowRequestsFailingPolicy after setting up a new Policy. Before this change, the only way to enable AllowRequestsFailingPolicy was to change the default Policy file, built into the Guest rootfs image. 2. Ignore errors returned by regorus while evaluating Policy rules, if AllowRequestsFailingPolicy was enabled. For example, trying to evaluate the UpdateInterfaceRequest rules using a policy that didn't define any UpdateInterfaceRequest rules results in a "not found" error from regorus. Allow AllowRequestsFailingPolicy := true to bypass that error. 3. Add simple CI test for AllowRequestsFailingPolicy. These changes are restoring functionality that was broken recently by commit `df23eb09a6`. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-08-01 22:37:18 +00:00
Fabiano Fidêncio	66b0305eed	Merge pull request #10117 from fidencio/topic/temporarily-remove-arm-nightly-jobs-part-2 ci: Temporarily remove arm64 builds -- part II	2024-08-01 23:06:46 +02:00
GabyCT	20a88b6470	Merge pull request #10099 from GabyCT/topic/fixmemo metrics: Update memory tests to use grep -F	2024-08-01 13:48:36 -06:00
Fabiano Fidêncio	aef7da7bc9	tests: k8s: Rotate & cleanup journal for every run This will help to avoid huge logs, and allow us to debug issues in a better way. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 21:36:57 +02:00
Fabiano Fidêncio	c5dad991ce	ci: Temporarily remove arm64 builds -- part II Let's remove what we commented out, as publish manifest complains: ``` Created manifest list quay.io/kata-containers/kata-deploy-ci:kata-containers-latest ./tools/packaging/release/release.sh: line 146: --amend: command not found ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 20:43:28 +02:00
Fabiano Fidêncio	5ec11afc21	Merge pull request #10111 from fidencio/topic/temporarily-remove-arm-nightly-jobs ci: Temporarily remove arm64 builds	2024-08-01 19:50:07 +02:00
Gabriela Cervantes	7454908690	metrics: Update memory tests to use grep -F This PR updates the memory tests like fast footprint to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-01 17:20:57 +00:00
Gabriela Cervantes	d72cb8ccfc	tests: kbs: Add missing dependencies to install kbs cli This PR adds missing package dependencies to install kbs cli in a fresh new baremetal environment. This will avoid a failure when trying to run install-kbs-client. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-08-01 17:09:50 +00:00
Fabiano Fidêncio	bfd014871a	kata-manager: Ensure distro specific TDX config is set We've done something quite similar for kata-deploy, but I've noticed we forgot about the kata-manager counterpart. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 17:27:01 +02:00
Fabiano Fidêncio	e9710332e7	ci: Temporarily remove arm64 builds It's been a reasonable time that we're not able to even build arm64 artefacts. For now I am removing the builds as it doesn't make sense to keep running failing builds, and those can be re-enabled once we have arm64 machines plugged in that can be used for building the stuff, and maintainers for those machines. The `arm-jetson-xavier-nx-01` is also being removed from the runners. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-08-01 13:30:47 +02:00
Fabiano Fidêncio	c784fb6508	Merge pull request #10110 from ChengyuZhu6/bump-trustee version: bump trustee version	2024-08-01 07:34:38 +02:00
ChengyuZhu6	d35320472c	version: bump trustee version Bump trustee to the latest version to fix error with pulling busybox from dockerhub. Fixes: #10109 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-08-01 08:59:58 +08:00
Fupan Li	230aefc0da	Merge pull request #10070 from BbolroC/qemu-runtime-rs-k8s-s390x GHA: Run k8s e2e tests for qemu-runtime-rs on s390x	2024-07-31 18:41:11 +08:00
Chengyu Zhu	8e9f140ee0	Merge pull request #10080 from ChengyuZhu6/fix-coco-ci tests: add image check before running coco tests	2024-07-31 17:08:00 +08:00
Peng Tao	11e10647f9	Merge pull request #10104 from BbolroC/fix-zvsi-cleanup-s390x gha: Restore cleanup-zvsi for s390x	2024-07-31 16:23:26 +08:00
Chengyu Zhu	fc0f635098	Merge pull request #10101 from AdithyaKrishnan/main ci: Fix rate limit error by migrating busybox_image	2024-07-31 14:48:12 +08:00
ChengyuZhu6	2cfb32ac4d	version: bump nydus snapshotter to v0.13.14 bump nydus snapshotter to v0.13.14 to stabilize CIs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-31 14:47:33 +08:00
ChengyuZhu6	41b7577f08	tests: add image check before running coco tests Currently, there are some issues with pulling images in CI, such as : https://github.com/kata-containers/kata-containers/actions/runs/10109747602/job/27959198585 This issue is caused by switching between different snapshotters for the same image in some scenarios. To resolve it, we can check existing images to ensure all content is available locally before running tests. Fixes: #10029 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-31 14:47:33 +08:00
Hyounggyu Choi	e135d536c5	gha: Restore cleanup-zvsi for s390x In #10096, a cleanup step for kata-deploy is removed by mistake. This leads to a cleanup error in the following `Complete job` step. This commit restores the removed step to resolve the current CI failure on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-31 06:42:16 +02:00
Adithya Krishnan Kannan	fdf7036d5e	ci: Fix rate limit error by migrating busybox_image Changing the busybox_image from docker to quay to fix rate limit errors. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com>	2024-07-30 22:32:22 -05:00
Hyounggyu Choi	c8a160d14a	Merge pull request #10096 from BbolroC/remove-pre-post-action-s390x gha: Eradicate {pre,post}-action steps for s390x runners	2024-07-30 22:30:05 +02:00
Hyounggyu Choi	8d529b960a	gha: Eradicate {pre,post}-action steps for s390x runners As suggested in #9934, the following hooks have been introduced for s390x runners: - ACTIONS_RUNNER_HOOK_JOB_STARTED - ACTIONS_RUNNER_HOOK_JOB_COMPLETED These hooks will perfectly replace the existing {pre,post}-action scripts. This commit wipes out all GHA steps for s390x where the actions are triggered. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-30 17:10:19 +02:00
Wainer Moschetta	528745fc88	Merge pull request #10052 from nubificus/feat_fix_qemu_after_8070 runtime-rs: Fix QEMU backend for runtime-rs	2024-07-30 11:00:14 -03:00
Fupan Li	de22b3c4bf	Merge pull request #10024 from lifupan/main runtime-rs: enable dragonball hypervisor support initrd	2024-07-30 16:00:42 +08:00
Fupan Li	e3f0d2a751	runtime-rs: enable dragonball hypervisor support initrd Enable initrd support for the dragonball hypervisor. Fixes: #10023 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-30 14:50:24 +08:00
Fupan Li	4fbf9d67a5	Merge pull request #10043 from lifupan/fix_sandbox runtime-rs : fix the issue of stop sandbox	2024-07-29 09:22:26 +08:00
Fabiano Fidêncio	949ffd146a	Merge pull request #10083 from microsoft/danmihai1/policy-tests tests: k8s: minor policy tests clean-up	2024-07-28 11:04:24 +02:00
Dan Mihai	3e348e9768	tests: k8s: rename hard-coded policy test script Rename k8s-exec-rejected.bats to k8s-policy-hard-coded.bats, getting ready to test additional hard-coded policies using the same script. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 20:14:05 +00:00
Dan Mihai	7b691455c2	tests: k8s: hard-coded policy for any platform Users of AUTO_GENERATE_POLICY=yes: - Already tested auto-generated policy on any platform. - Will be able to test hard-coded policy too on any platform, after this change. CI continues to test hard-coded policies just on the platforms listed here, but testing those policies locally (outside of CI) on other platforms can be useful too. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 19:30:03 +00:00
Dan Mihai	83056457d6	tests: k8s-policy-pod: avoid word splitting Avoid potential word splitting when using an array of command args. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-26 18:55:52 +00:00
Dan Mihai	5546ce4031	Merge pull request #10069 from microsoft/danmihai1/exec-args genpolicy: validate each exec command line arg	2024-07-26 11:39:44 -07:00
Fabiano Fidêncio	b0b04bd2f3	Merge pull request #10078 from fidencio/topic/increase-rootfs-confidential-slash-run-to-50-percent tee: osbuilder: Set /run to use 50% of the image with systemd	2024-07-26 18:37:41 +02:00
Anastassios Nanos	d11657a581	runtime-rs: Remove unused env vars from build Since we can't find a homogeneous value for the resource/cgroup management of multiple hypervisors, and we have decoupled the env vars in the Makefile, we don't need the generic ones. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-26 14:03:50 +00:00
Anastassios Nanos	3f58ea9258	runtime-rs: Decouple Makefile env VARS To avoid overriding env vars when multiple hypervisors are available, we add per-hypervisor vars for static resource management and cgroups handling. We reflect that in the relevant config files as well. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-26 14:02:35 +00:00
Fabiano Fidêncio	5f146e10a1	osbuilder: Add logs for setting up systemd based stuff This helps us to debug any kind of changes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-26 14:22:45 +02:00
Alex Carter	4a8fb475be	tee: osbuilder: Set /run to use 50% of the image with systemd Let's ensure at least 50% of the memory is used for /run, as systemd by default forces it to be 10%, which is way too small even for very small workloads. This is only done for the rootfs-confidential image. Fixes: kata-containers#6775 Signed-off-by: Alex Carter <Alex.Carter@ibm.com> Signed-off-by: Wang, Arron <arron.wang@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.co Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-26 14:22:38 +02:00
Chengyu Zhu	2a9ed19512	Merge pull request #9988 from huoqifeng/annotation initdata: add initdata annotation in hypervisor config	2024-07-26 19:59:45 +08:00
Fupan Li	c51ba73199	container: fix the issue of send signal to process It's better to check the container's status before trying to send a signal to it, since there's no need to send a signal to a container that has already stopped. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-26 19:23:43 +08:00
Fupan Li	e156516bde	sandbox: fix the issue of stop sandbox Since stop sandbox can be called from multiple paths, it's better to set and check the sandbox's state. Fixes: #10042 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-26 19:23:34 +08:00
Qi Feng Huo	a113fc93c8	initdata: fix unit test code for initdata annotation Added ut code for initdata annotation Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-26 18:24:05 +08:00
Qi Feng Huo	8d61029676	initdata: add unit test code for initdata annotation Added ut code for initdata annotation Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-26 14:20:57 +08:00
Qi Feng Huo	b80057dfb5	initdata: Merge branch 'main' into annotation - Merge branch 'main' into feature branch annotation	2024-07-26 14:01:04 +08:00
Archana Shinde	d7637f93f9	Merge pull request #9899 from amshinde/multiple-networks-fix Fix issue while adding multiple networks with nerdctl	2024-07-25 11:56:27 -07:00
Dan Mihai	a37f10fc87	genpolicy: validate each exec command line arg Generate policy that validates each exec command line argument, instead of joining those args and validating the resulting string. Joining the args ignored the fact that some of the args might include space characters. The older format from genpolicy-settings.json was similar to: "ExecProcessRequest": { "commands": [ "sh -c cat /proc/self/status" ], "regex": [] }, That format will not be supported anymore. genpolicy will detect if its users are trying to use the older "commands" field and will exit with a relevant error message in that case. The new settings format is: "ExecProcessRequest": { "allowed_commands": [ [ "sh", "-c", "cat /proc/self/status" ] ], "regex": [] }, Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:57:17 +00:00
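The ambiguity described in commit a37f10fc87 can be illustrated with a minimal sketch. The helper names `joined_match` and `per_arg_match` are hypothetical and are not genpolicy's actual code; they only contrast comparing a space-joined string with comparing the argument list element by element.

```python
# Hypothetical helpers for illustration; not taken from genpolicy.

def joined_match(allowed_commands, args):
    """Old-style check: join the args with spaces and compare strings."""
    return " ".join(args) in allowed_commands


def per_arg_match(allowed_commands, args):
    """New-style check: compare the argument list element by element."""
    return list(args) in [list(cmd) for cmd in allowed_commands]


# Settings in the older and newer formats quoted in the commit message.
old_allowed = ["sh -c cat /proc/self/status"]
new_allowed = [["sh", "-c", "cat /proc/self/status"]]

# Two different argument vectors that join to the same string.
intended = ["sh", "-c", "cat /proc/self/status"]
different = ["sh", "-c", "cat", "/proc/self/status"]

joined_intended = joined_match(old_allowed, intended)
joined_different = joined_match(old_allowed, different)
per_arg_intended = per_arg_match(new_allowed, intended)
per_arg_different = per_arg_match(new_allowed, different)
```

With the joined comparison both argument vectors are accepted, because the boundary between arguments is lost; the per-argument comparison accepts only the intended one.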
Dan Mihai	0f11384ede	tests: k8s-policy-pod: exec_command clean-up Use "${exec_command[@]}" for calling both: - add_exec_to_policy_settings - kubectl exec Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:55:03 +00:00
Dan Mihai	95b78ecaa9	tests: k8s-exec: reuse sh_command variable Reuse sh_command variable instead of repeating "sh". Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-25 16:50:34 +00:00
Alex Lyn	abb0a2659a	Merge pull request #9944 from Apokleos/align-ocispec-rs Align kata oci spec with oci-spec-rs	2024-07-25 19:36:52 +08:00
Alex Lyn	bb2b60dcfc	oci: Delete the kata oci spec It's time to delete the kata oci spec implemented just for kata, as we have already aligned the OCI Spec with oci-spec-rs. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	b56313472b	agent: Align agent OCI spec with oci-spec-rs Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	882385858d	runtime-rs: Align oci spec in runtime-rs with oci-spec-rs This commit aligns the OCI Spec implementation in runtime-rs with the OCI Spec definitions and related operations provided by oci-spec-rs. Key changes as below: (1) Leveraged oci-spec-rs to align Kata Runtime OCI Spec with the official OCI Spec. (2) Introduced runtime-spec to separate OCI Spec definitions from Kata-specific State data structures. (3) Preserved the original code logic and implementation as much as possible. (4) Made minor code adjustments to adhere to Rust programming conventions; Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	bf813f85f2	runk: Align oci spec with oci-spec-rs Utilized oci-spec-rs to align OCI Spec structures and data representations in runk with the OCI Spec. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	b3eab5ffea	genpolicy: Align agent-ctl OCI Spec with oci-spec-rs Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	c500fd5761	agent-ctl: Align agent-ctl OCI Spec with oci-spec-rs This commit aligns the OCI Spec used within agent-ctl with the oci-spec-rs definition and operations. This enhancement ensures that agent-ctl adheres to the latest OCI standards and provides a more consistent and reliable experience for managing container images and configurations. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	faffee8909	libs: update Cargo config and lock file update Cargo.toml and Cargo.lock for adding runtime-spec Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:47:01 +08:00
Alex Lyn	8b5499204d	protocols: Reimplement OCI Spec to TTRPC Data Translation This commit transitions the data implementation for OCI Spec from kata-oci-spec to oci-spec-rs. While both libraries adhere to the OCI Spec standard, significant implementation details differ. To ensure data exchange through TTRPC services, this commit reimplements necessary data conversion logic. This conversion bridges the gap between oci-spec-rs data and TTRPC data formats, guaranteeing consistent and reliable data transfer across the system. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 17:46:07 +08:00
Anastassios Nanos	cda00ed176	runtime-rs: Add FC specific KERNELPARAMS To avoid overriding KERNELPARAMS for other hypervisors, add FC-specific KERNELPARAMS. Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>	2024-07-25 08:53:57 +00:00
Hyounggyu Choi	d8cac9f60b	GHA: Run k8s e2e tests for qemu-runtime-rs on s390x This commit adds a new CI job for qemu-runtime-rs to the existing zvsi Kubernetes test matrix. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-25 08:11:49 +02:00
Alex Lyn	4e003a2125	Merge pull request #10058 from Apokleos/enhance-vsock-connect runtime-rs: enhance debug info for agent connect.	2024-07-25 11:29:04 +08:00
Alex Lyn	36385a114d	runtime-rs: enhance debug info for agent connect. We need friendlier logs for debugging agent connection cases when kata pods fail. Fixes #10057 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-25 08:51:57 +08:00
Dan Mihai	c3adeda3cc	Merge pull request #10051 from microsoft/danmihai1/exec-variable-reuse tests: k8s: reuse policy exec variable	2024-07-24 14:58:40 -07:00
Aurélien Bombo	f08b594733	Merge pull request #9576 from microsoft/saulparedes/support_env_from genpolicy: Add support for envFrom	2024-07-24 13:39:54 -07:00
GabyCT	79edf2ca7d	Merge pull request #10054 from GabyCT/topic/docnydus docs: Update url links in kata nydus document	2024-07-24 14:08:44 -06:00
Archana Shinde	64d6293bb0	tests: Add nerdctl test for testing with multiple networks Add integration test that creates two bridge networks with nerdctl and verifies that Kata container is brought up while passing the networks created. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-24 10:45:56 -07:00
Archana Shinde	49fbae4fb1	agent: Wait for interface in update_interface For nerdctl and docker runtimes, network is hot-plugged instead of cold-plugged. While this change was made in the runtime, we did not have the agent waiting for the device to be ready. On some systems, the device hotplug could take some time causing the update_interface rpc call to fail as the interface is not available. Add a watcher for the network interface based on the pci-path of the network interface. Note, waiting on the device based on name is really not reliable especially in case multiple networks are hotplugged. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-24 10:45:56 -07:00
Dan Mihai	fecb70b85e	tests: k8s: reuse policy exec variable Share a single test script variable for both: - Allowing a command to be executed using Policy settings. - Executing that command using "kubectl exec". Fixes: #10014 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-24 17:42:04 +00:00
Fabiano Fidêncio	162a6b44f6	Merge pull request #10063 from ChengyuZhu6/fix-ci-timeout gha: Increase timeout to run CoCo tests	2024-07-24 15:14:35 +02:00
Pavel Mores	dd1e09bd9d	runtime-rs: add experimental support for memory hotunplugging to qemu-rs Hotunplugging memory is not guaranteed or even likely to work. Nevertheless I'd really like to have this code in for tests and observation. It shouldn't hurt, from experience so far. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Pavel Mores	3095b65ac3	runtime-rs: support hotplugging memory in QemuInner The bulk of this implementation is simple though tedious sanity checks, alignment computations and logging. Note that before any hotplugging, we query qemu directly for the current size of hotplugged memory. This ensures that any request to resize memory will be properly compared to the actual already available amount and only the necessary amount will be added. Note also that we borrow checked_next_multiple_of() from CH implementation. While this might look unclean it's just a rather temporary solution since an equivalent function will apparently be part of std soon, likely the upcoming 1.75. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
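The alignment computation mentioned in commit 3095b65ac3 can be sketched roughly as follows. This is an illustration in Python rather than the Rust used by runtime-rs; `memory_to_hotplug` and its parameters are hypothetical names, and applying the rounding to the missing amount (rather than to the total) is an assumption made for this example.

```python
def checked_next_multiple_of(value, multiple):
    """Return the smallest multiple of `multiple` that is >= `value`.

    Returns None when `multiple` is zero, mirroring the "checked" idea.
    """
    if multiple == 0:
        return None
    remainder = value % multiple
    if remainder == 0:
        return value
    return value + (multiple - remainder)


def memory_to_hotplug(requested_mb, current_mb, block_mb):
    """Hypothetical helper: amount still to add, rounded up to the block size.

    `current_mb` stands for the amount already available, which the commit
    says is queried from qemu before any hotplugging.
    """
    missing = requested_mb - current_mb
    if missing <= 0:
        # Nothing to add: the request is already satisfied.
        return 0
    return checked_next_multiple_of(missing, block_mb)


to_add = memory_to_hotplug(requested_mb=1000, current_mb=512, block_mb=128)
```

Here 488 MiB are missing, and rounding up to a 128 MiB block size gives 512 MiB to add.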
Pavel Mores	4a1c828bf8	runtime-rs: support hotplugging memory in Qmp The algorithm is rather simple - we query qemu for existing memory devices to figure out the index of the one we're about to add. Then we add a backend object and a corresponding frontend device. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Pavel Mores	0e0b146b87	runtime-rs: support storage & retrieval of guest memblock size in qemu-rs This will be used for ensuring that hotplugged memory block sizes are properly aligned. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-07-24 13:22:41 +02:00
Alex Lyn	efb7390357	kata-sys-utils: align OCI Spec with oci-spec-rs Align the OCI spec and fix warnings to make clippy happy. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-24 14:38:48 +08:00
Alex Lyn	012029063c	runtime-spec: Introduce runtime-spec for Container State As part of aligning the Kata OCI Spec with oci-spec-rs, the concept of "State" falls outside the scope of the OCI Spec itself. While we'll retain the existing code for State management for now, to improve code organization and clarity, we propose moving the State-related code from the oci/ dir to a dedicated directory named runtime-spec/. This separation will be completed in subsequent commits with the removal of the oci/ directory. Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-24 14:38:30 +08:00
Zvonko Kaiser	a388d2b8d4	Merge pull request #9919 from zvonkok/ubuntu-dockerfile gpu: rootfs ubuntu build expansion	2024-07-24 08:05:54 +02:00
ChengyuZhu6	2b44e9427c	gha: Increase timeout to run CoCo tests This PR increases the timeout for running the CoCo tests to avoid random failures. These failures occur when the action `Run tests` times out after 30 minutes, causing the CI to fail. Fixes: #10062 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-24 12:31:38 +08:00
GabyCT	b408cc1694	Merge pull request #10060 from GabyCT/topic/fgreptest metrics: Update launch times to use grep -F	2024-07-23 17:23:14 -06:00
Gabriela Cervantes	0e5489797d	docs: Update url links in kata nydus document This PR updates the url links in the kata nydus document. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-23 17:49:12 +00:00
Gabriela Cervantes	3d17a7038a	metrics: Update launch times to use grep -F This PR updates the metrics launch times to use grep -F instead of fgrep as this command has been deprecated. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-23 17:13:52 +00:00
Zvonko Kaiser	941577ab3b	gpu: rootfs ubuntu build expansion For the GPU build we need go/rust and some other helpers to build the rootfs. Always use versions.yaml for the correct and working Rust and golang version Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-23 14:31:35 +00:00
Steve Horsman	d69950e5c6	Merge pull request #10053 from stevenhorsman/release-env-var ci: cache: Pass through RELEASE env	2024-07-22 21:53:20 +01:00
Dan Mihai	f26d595e5d	Merge pull request #9910 from microsoft/saulparedes/set_policy_rego_via_env tools: Allow setting policy rego file via	2024-07-22 11:00:30 -07:00
stevenhorsman	66f6ec2919	ci: cache: Pass through RELEASE env In kata-deploy-binaries.sh we want to understand if we are running as part of a release, so we need to pass through the RELEASE env from the workflow, which I missed in https://github.com/kata-containers/kata-containers/pull/9550 Fixes: #9921 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-22 16:39:35 +01:00
Zvonko Kaiser	5765b6e062	Merge pull request #9920 from zvonkok/initrd-builer gpu: rootfs/initrd build init	2024-07-22 15:06:49 +02:00
Zvonko Kaiser	73bcb09232	Merge pull request #9968 from zvonkok/kernel-gpu-dragonball-6.1.x dragonball: kernel gpu dragonball 6.1.x	2024-07-22 13:03:14 +02:00
Zvonko Kaiser	3029e6e849	gpu: rootfs/initrd build init Initramfs expects /init, create symlink only if ${ROOTFS}/init does not exist Init may be provided by other packages, e.g. systemd or GPU initrd/rootfs Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-22 10:19:05 +00:00
Saul Paredes	b7a184a0d8	rootfs: Allow AGENT_POLICY_FILE to be an absolute path Don't set AGENT_POLICY_FILE as $script_dir may change Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-21 14:57:41 -07:00
Alex Lyn	67466aa27f	kata-types: do alignment of oci-spec for kata-types Fixes #9766 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-21 22:54:43 +08:00
Hyounggyu Choi	c774cd6bb0	Merge pull request #10031 from ChengyuZhu6/fix-log-contain-tdx tests: Fix missing log on TDX	2024-07-20 07:26:08 +02:00
ChengyuZhu6	6ea6e85f77	tests: Re-enable authenticated image tests on tdx Try to re-enable authenticated image tests on tdx. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-20 12:10:02 +08:00
ChengyuZhu6	3476fb481e	tests: Fix missing log on TDX Currently, we have found that `assert_logs_contain` does not work on TDX. We manually located the specific log, but it fails to get the log using `kubectl debug`. The error found in CI is: ``` warning: couldn't attach to pod/node-debugger-984fee00bd70.jf.intel.com-pdgsj, falling back to streaming logs: error stream protocol error: unknown error ``` Upon debugging the TDX CI machine, we found an error in containerd: ``` Attach container from runtime service failed" err="rpc error: code = InvalidArgument desc = tty and stderr cannot both be true" containerID="abc8c7a546c5fede4aae53a6ff2f4382ff35da331bfc5fd3843b0c8b231728bf" ``` We believe this is the root cause of the test failures in TDX CI. Therefore, we need to ensure that tty and stderr are not set to true at same time. Fixes: #10011 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Wang, Arron <arron.wang@intel.com>	2024-07-20 12:10:01 +08:00
Steve Horsman	7dd560f07f	Merge pull request #9620 from l8huang/kernel Add kernel config for NVIDIA DPU/ConnectX adapter	2024-07-19 23:16:51 +01:00
Dan Mihai	3127dbb3df	Merge pull request #10035 from microsoft/danmihai1/k8s-credentials-secrets tests: k8s-credentials-secrets: policy for second pod	2024-07-19 12:44:21 -07:00
Saul Paredes	2681fc7eb0	genpolicy: Add support for envFrom This change adds support for the `envFrom` field in the `Pod` resource Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-19 09:53:58 -07:00
GabyCT	be2d4719c2	Merge pull request #10040 from kata-containers/fix_blogbench_midvalues metrics: update avg reference values for blogbench.	2024-07-19 09:51:29 -06:00
Zvonko Kaiser	8eaa2f0dc8	dragonball: Add GPU support Build a GPU flavoured dragonball kernel Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-19 14:48:05 +00:00
Dan Mihai	44e443678d	Merge pull request #9835 from microsoft/saulparedes/test_policy_on_sev gha: enable autogenerated policy testing on SEV and SEV-SNP	2024-07-19 07:46:01 -07:00
Greg Kurz	dc97f3f540	Merge pull request #10045 from lifupan/cleanup_container runtime-rs: container: fix the issue of missing cleanup container	2024-07-19 16:36:04 +02:00
Alex Lyn	d0dc67bb96	Merge pull request #8597 from amshinde/vfio-hotplug-support Implement hotplug support for physical endpoints	2024-07-19 13:41:11 +08:00
Lei Huang	20f6979d8f	build: add kernel config for Nvidia DPU/ConnectX adapter With Nvidia DPU or ConnectX network adapter, VF can do VFIO passthrough to guest VM in `guest-kernel` mode. In the guest kernel, the adapter's driver is required to claim the VFIO device and create network interface. Signed-off-by: Lei Huang <leih@nvidia.com>	2024-07-18 22:29:16 -07:00
Fupan Li	8a2f7b7a8c	container: fix the issue of missing cleanup container When creating a container fails, the container should be cleaned up so that no device/resource is left behind. Fixes: #10044 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-07-19 11:02:55 +08:00
ms-mahuber	ddff762782	tools: Allow setting policy rego file via environment variable * Set policy file via env var * Add restrictive policy file to kata-opa folder * Change restrictive policy file name * Change relative default path location * Add license headers Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-18 15:05:45 -07:00
David Esparza	60f52a4b93	metrics: update avg reference values for blogbench. This PR updates the Blogbench reference values for read and write operations used in the CI check metrics job. This is due to the update to version 1.2 of Blogbench. Fixes: #10039 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-18 15:47:14 -06:00
Greg Kurz	fc4357f642	Merge pull request #10034 from BbolroC/hide-repack_secure_image-from-test tests: Call repack_secure_image() in set_metadata_annotation()	2024-07-18 23:03:41 +02:00
Aurélien Bombo	ab6f37aa52	Merge pull request #10022 from microsoft/danmihai1/probes-and-lifecycle genpolicy: container.exec_commands args validation	2024-07-18 12:21:31 -07:00
Steve Horsman	256ab50f1a	Merge pull request #9959 from sprt/fix-ci-cleanup ci: cleanup: Ignore nonexisting resources	2024-07-18 19:23:48 +01:00
David Esparza	1fdc5c1183	Merge pull request #10028 from amshinde/upgrade-blogbench-1.2 metric: Upgrade blogbench to 1.2	2024-07-18 11:30:17 -06:00
Hyounggyu Choi	a7e4d3b738	tests: Call repack_secure_image() in set_metadata_annotation() It is not good practice to call repack_secure_image() from a bats file because the test code might not consider cases where `qemu-se` is used as `KATA_HYPERVISOR`. This commit moves the function call to set_metadata_annotation() if a key includes `kernel_params` and `KATA_HYPERVISOR` is set to `qemu-se`, allowing developers to focus on the test scenario itself. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-18 18:09:45 +02:00
Dan Mihai	035a42baa4	tests: k8s-credentials-secrets: policy for second pod Add policy to pod-secret-env.yaml from k8s-credentials-secrets.bats. Policy was already auto-generated for the other pod used by the same test (pod-secret.yaml). pod-secret-env.yaml was inconsistent, because it was taking advantage of the "allow all" policy built into the Guest image. Sooner or later, CI Guests for CoCo will not get the "allow all" policy built in anymore and pod-secret-env.yaml would have stopped working then. Note that pod-secret-env.yaml continues to use an "allow all" policy after these changes. #10033 must be solved before a more restrictive policy will be generated for pod-secret-env.yaml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-18 15:03:57 +00:00
Hyounggyu Choi	d2ac01c862	Merge pull request #10032 from BbolroC/fix-image-authenticated-for-s390x tests: Rebuild secure boot image for guest-pull-image-authenticated for IBM SE	2024-07-18 17:00:18 +02:00
Hyounggyu Choi	6e7ee4bdab	tests: Rebuild secure image for guest-pull-image-authenticated on SE Since #9904 was merged, newly introduced tests for `k8s-guest-pull-image-authenticated.bats` have been failing on IBM SE (s390x). The agent fails to start because a kernel parameter cannot pass to the guest VM via annotation. To fix this, the boot image must be rebuilt with updated parameters. This commit adds the rebuilding step in create_pod_yaml_with_private_image() for `qemu-se`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-18 14:56:12 +02:00
Archana Shinde	1636c201f4	network: Implement network hotunplug for physical endpoints Similar to HotAttach, the HotDetach method signature for network endpoints needs to be changed as well to allow for the method to make use of device manager to manage the hot unplug of physical network devices. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:41 -07:00
Archana Shinde	c6390f2a2a	vfio: Introduce function to get vfio dev path This function will be later used to get the vfio dev path. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:41 -07:00
Archana Shinde	1e304e6307	network: Implement hotplug for physical endpoints Enable physical network interfaces to be hotplugged. For this, we need to change the signature of the HotAttach method to make use of Sandbox instead of Hypervisor. A similar approach was followed for the Attach method, but this change was overlooked for HotAttach. The signature change is required in order to make use of device manager and receiver for physical network endpoints. Fixes: #8405 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:40 -07:00
Archana Shinde	2fef4bc844	vfio: use driver_override field for device binding. The current implementation for device binding using driver bind/unbind and new_id fails in the scenario when the physical device is not bound to a driver before assigning it to vfio. There exists an updated mechanism to accomplish the same that does not have the same issue as above. The driver_override field for a device allows us to specify the driver for a device rather than relying on the bound driver to provide a positive match of the device. It also has other advantages referenced here: https://patchwork.kernel.org/project/linux-pci/patch/1396372540.476.160.camel@ul30vt.home/ So use the updated driver_override mechanism for binding/unbinding a physical device/virtual function to vfio-pci. Signed-off-by: liangxianlong <liang.xianlong@zte.com.cn> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 16:42:40 -07:00
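For readers unfamiliar with the driver_override mechanism named in commit 2fef4bc844, the usual sysfs sequence can be sketched as below. This is a generic illustration of the kernel's PCI sysfs interface, not the Go code from the Kata runtime; the function name `bind_to_vfio_pci` and the `sysfs_root` parameter (added so the sketch can be tried against a scratch directory) are assumptions.

```python
import os


def bind_to_vfio_pci(bdf, sysfs_root="/sys"):
    """Bind the PCI device `bdf` (e.g. "0000:00:01.0") to vfio-pci.

    Hypothetical helper: writes go to files under `sysfs_root`, which
    defaults to the real sysfs and requires root privileges there.
    """
    dev_dir = os.path.join(sysfs_root, "bus/pci/devices", bdf)

    # 1. Name the driver that should claim this device on the next probe.
    with open(os.path.join(dev_dir, "driver_override"), "w") as f:
        f.write("vfio-pci")

    # 2. If a driver currently owns the device, unbind it first.
    #    A device with no bound driver has no "driver" link, so this
    #    step is skipped (the case the old bind/unbind flow failed on).
    unbind_path = os.path.join(dev_dir, "driver", "unbind")
    if os.path.exists(unbind_path):
        with open(unbind_path, "w") as f:
            f.write(bdf)

    # 3. Ask the PCI core to re-probe; the override selects vfio-pci.
    with open(os.path.join(sysfs_root, "bus/pci/drivers_probe"), "w") as f:
        f.write(bdf)
```

Because the override is set on the device itself, the sequence works the same whether or not another driver was bound beforehand.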
GabyCT	6aff5f300a	Merge pull request #10021 from GabyCT/topic/fixarchdoc docs: Update devmapper docs	2024-07-17 14:56:40 -06:00
Saul Paredes	57d2ded3e2	gha: enable autogenerated policy testing on SEV-SNP Enable autogenerated policy testing on SEV-SNP Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-17 13:32:06 -07:00
Archana Shinde	30e5e88ff1	metric: Upgrade blogbench to 1.2 Move to blogbench 1.2 version from 1.1. This version includes an important fix for the read_score test which was reported to be broken in the previous version. It essentially fixes this issue here: https://github.com/jedisct1/Blogbench/issues/4 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-07-17 11:32:09 -07:00
Steve Horsman	e5d5284761	Merge pull request #10026 from wainersm/release_370 release: Bump VERSION to 3.7.0	2024-07-17 18:43:51 +01:00
Wainer dos Santos Moschetta	6f7ab31860	release: Bump VERSION to 3.7.0 In preparation for the 3.7.0 release, bumped the version in VERSION file. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-17 14:19:44 -03:00
Saul Paredes	b3cc8b200f	gha: enable autogenerated policy testing on SEV Enable autogenerated policy testing on SEV Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-17 09:55:13 -07:00
Dan Mihai	f31c1b121e	Merge pull request #9812 from microsoft/saulparedes/test_policy_on_tdx gha: enable policy testing on TDX	2024-07-17 08:47:44 -07:00
Dan Mihai	449103c7bf	Merge pull request #10020 from microsoft/danmihai1/pod-security-context tests: fix ps command in k8s-security-context	2024-07-17 08:12:57 -07:00
Fabiano Fidêncio	b7051890af	Merge pull request #9722 from zvonkok/busybox-build deploy: Add busybox target	2024-07-17 13:47:15 +02:00
Steve Horsman	5ce2c1010a	Merge pull request #9904 from stevenhorsman/registry-authentication Support for registry authentication in guest pull	2024-07-17 10:48:38 +01:00
Fupan Li	65f2bfb8c4	Merge pull request #9967 from zvonkok/kernel-dragonball-6.1.x dragonball: kernel dragonball 6.1.x	2024-07-17 14:38:06 +08:00
Dan Mihai	0e86a96157	tests: fix ps command in k8s-security-context 1. Use a container image that supports "ps --user 1000 -f". 2. Execute that command using: sh -c "ps --user 1000 -f" instead of passing additional arguments to sh: sh -c ps --user 1000 -f Fixes: #10019 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:33:31 +00:00
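The difference between the two command forms in commit 0e86a96157 comes down to how the argument vector is split. The sketch below uses Python's `shlex.split` only to show the resulting argv lists; it does not run any command, and the variable names are illustrative.

```python
import shlex

# Quoted form: the whole ps invocation is one argument to "sh -c".
quoted = shlex.split('sh -c "ps --user 1000 -f"')

# Unquoted form: "sh -c" receives only "ps" as its command string.
unquoted = shlex.split("sh -c ps --user 1000 -f")

# For "sh -c", the argument right after -c is the command string;
# anything after it becomes $0, $1, ... of that command string and is
# not passed to ps.
quoted_command_string = quoted[2]
unquoted_command_string = unquoted[2]
unquoted_positional_params = unquoted[3:]
```

In the unquoted form the shell runs plain `ps`, and `--user`, `1000` and `-f` end up as positional parameters of the shell instead of options to `ps`.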
Dan Mihai	9f4d1ffd43	genpolicy: container.exec_commands args validation Keep track of individual exec args instead of joining them in the policy text. Verifying each arg results in a more precise policy, because some of the args might include space characters. This improved validation applies to commands specified in K8s YAML files using: - livenessProbe - readinessProbe - startupProbe - lifecycle.postStart - lifecycle.preStop Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:19:23 +00:00
Dan Mihai	b23ea508d5	tests: k8s: container.exec_commands policy tests Add tests for genpolicy's handling of container.exec_commands. These are commands allowed by the policy and originating from these input K8s YAML fields: - livenessProbe - readinessProbe - startupProbe - lifecycle.postStart - lifecycle.preStop Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-17 01:19:00 +00:00
stevenhorsman	567b4d5788	test/k8s: Fix up node logging typo We had a typo in the attestation tests that we've copied around a lot and Wainer spotted it in the authenticated registry tests, so let's fix it up now Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	0015c8ef51	tests: Add guest-pull auth registry tests Add three new test cases for guest pull from an authenticated registry for the following scenarios: _Scenario: Creating a container from an authenticated image, with correct credentials via KBC works_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And a KBS set up to have the correct auth.json for registry quay.io/kata-containers/confidential-containers-auth embedded in the `"Credential"` section of `its resources file` When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image works and the pod can start _Scenario: Creating a container from an authenticated image, with incorrect credentials via KBC fails_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And An installed kata CC with the sample_kbs set up to have the auth.json for registry quay.io/kata-containers/confidential-containers-auth embedded in the `"Credential"` resource, but with a dummy user name and password When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image fails with a message that reflects that the authorisation failed _Scenario: Creating a container from an authenticated image, with no credentials fails_ Given An authenticated container registry quay.io/kata-containers/confidential-containers-auth And a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for [guest-pulling](https://github.com/containerd/nydus-snapshotter/blob/main/misc/snapshotter/config-coco-guest-pulling.toml) And An installed kata CC with no credentials section When I create a pod from the container image quay.io/kata-containers/confidential-containers-auth:test Then The pull image fails with a message that reflects that the authorisation failed Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	eb07f5ef5e	agent: doc: Fix ordering of options - Fix the config options to be back in alphabetical order to be easier to find Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	7cc81ce867	agent: image: Set image-rs auth config If the agent-config has a value for `image_registry_auth`, Then pass this to the image-rs client and enable auth mode too Fixes: #8122 Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
stevenhorsman	265322990a	agent: config: Add config option to provide auth for guest-pull Add optional config for agent.image_registry_auth, to specify the uri of credentials to be used when pulling images in the guest from an authenticated registry Fixes: #8122 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-16 21:39:31 -03:00
Steve Horsman	064b45a2fa	Merge pull request #10016 from wainersm/ibm-se-auth-reg workflows: setup environment to run auth registry tests on s390x	2024-07-16 22:24:39 +01:00
Gabriela Cervantes	d2866081d2	docs: Update devmapper docs This PR updates the devmapper docs by updating the url link for the current containerd devmapper information. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-16 21:07:51 +00:00
GabyCT	2206e2dd5c	Merge pull request #10013 from GabyCT/topic/updatecontdoc docs: Update cri installation guide url in containerd documentation	2024-07-16 14:32:59 -06:00
Wainer dos Santos Moschetta	66c600f8d8	gha: delint the s390x workflow Made run-k8s-tests-on-zvsi.yaml free of warnings by removing: SC2086:info:1:1: Double quote to prevent globbing and word splitting ... SC2086:info:2:1: Double quote to prevent globbing and word splitting ... Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-16 15:20:46 -03:00
Wainer dos Santos Moschetta	a98985fab8	gha: export user/password for auth registry tests on s390x Counterpart of commit `d8961cbd4a` for run-k8s-tests-on-zvsi workflow Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-07-16 15:18:40 -03:00
Saul Paredes	af49252c69	gha: enable policy testing on TDX Enable policy testing on TDX Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-15 14:09:49 -07:00
Saul Paredes	0b3d193730	genpolicy: Support cpath for mount sources Add setting to allow specifying the cpath for a mount source. cpath is the root path for most files used by a container. For example, the container rootfs and various files copied from the Host to the Guest when shared_fs=none are hosted under cpath. mount_source_cpath is the root of the paths used as storage mount sources. Depending on Kata settings, mount_source_cpath might have the same value as cpath - but on TDX for example these two paths are different: TDX uses "/run/kata-containers" as cpath, but "/run/kata-containers/shared/containers" as mount_source_cpath. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-15 14:09:49 -07:00
Gabriela Cervantes	e4045ff29a	docs: Update runtime v2 containerd url information This PR updates the runtime v2 containerd url information at containerd documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-15 20:36:17 +00:00
Dan Mihai	bcaf7fc3b4	Merge pull request #10008 from microsoft/danmihai1/runAsUser genpolicy: add support for runAsUser fields	2024-07-15 12:08:50 -07:00
Gabriela Cervantes	9f738f0d05	docs: Update cri installion guide url in containerd documentation This PR updates the cri installation guide url link in the containerd documentation guide as the previous url link does not exist. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-15 16:58:18 +00:00
Dan Mihai	648265d80e	Merge pull request #9998 from microsoft/danmihai1/GENPOLICY_PULL_METHOD tests: k8s: GENPOLICY_PULL_METHOD clean-up	2024-07-15 09:32:29 -07:00
Steve Horsman	02b9fd6e95	Merge pull request #9382 from Xynnn007/feat-encrypt-image Merge to main: supporting pull encrypted images	2024-07-15 15:58:42 +01:00
stevenhorsman	b060fb5b31	tests/k8s: Skip measured rootfs test The only kernel built for measured rootfs was the kernel-tdx-experimental, so this test only ran in the qemu-tdx job. In commit `6cbdba7` we switched all TEE configurations to use the same kernel-confidential, so measured rootfs is disabled for qemu-tdx too now. The VM still fails to boot (for a different reason), but the bug in assert_logs_contain, fixed in this PR, was masking the checks on the logs. We still have a few open issues related to measured rootfs and generating the root hash, so let's skip this test that doesn't work until they are looked at. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
stevenhorsman	2cf94ae717	tests: Add guest-pull encrypted image tests Add three new test cases for guest-pull of an encrypted image for the following scenarios: _Scenario: Pull encrypted image on guest with correct key works_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k that is configured as a resource in the KBS, so that image-rs on the guest can connect to it When I try and create a pod from i Then The pod is successfully created and runs _Scenario: Cannot pull encrypted image with no decryption key_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k, that is not configured in a KBS that image-rs on the guest can connect to When I try and create a pod from i Then The pod is not created with an error message that reflects why _Scenario: Cannot pull encrypted image with wrong decryption key_ Given I have a version of kata deployed with a guest image that has an agent with `guest_pull` feature enabled and nydus-snapshotter installed and configured for guest-pulling And A public encrypted container image i with a decryption key k and a different key k' that is set as a resource in a KBS, that image-rs on the guest can connect to When I try and create a pod from i Then The pod is not created with an error message that reflects why Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
Xynnn007	a56b15112a	agent: add ocicrypt config The ocicrypt config lets kata-agent connect to CDH to request the image decryption key. This value is specified by an env. We use the same workaround as the CCv0 branch. In future, we will consider better ways instead of writing files and setting envs inside the inner logic of kata-agent. Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-07-15 12:00:50 +01:00
Xynnn007	1072658219	agent: Enable kata-cc-rustls-tls in image-rs - Enable the kata-cc-rustls-tls feature in image-rs, so that it can get resources from the KBS in order to retrieve the registry credentials. - Also bump to the latest image-rs to pick up protobuf fixes - Add libprotobuf-dev dependency to the agent packaging as it is needed by the new image-rs feature - Add extra env in the agent make test as the new version of the anyhow crate has changed the backtrace capture, thus unit tests of kata-agent that compare a raised error with an expected one would fail. To fix this, we need only panics to have backtraces, thus set RUST_BACKTRACE=0 for tests, per the documentation at https://docs.rs/anyhow/latest/anyhow/ Fixes #9538 Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
stevenhorsman	3b72e9ffab	tests/k8s: Fix assert_logs_contain The pipe needs adding to the grep, otherwise the grep gets consumed as an argument to `print_node_journal` and run in the debug pod. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-15 12:00:50 +01:00
Hyounggyu Choi	83b3a681f4	Merge pull request #10010 from BbolroC/osbuilder-bump-fedora-to-40 osbuilder: Bump Fedora to 40	2024-07-15 13:00:28 +02:00
Greg Kurz	203d9e7803	Merge pull request #10000 from littlejawa/kata_deploy_add_storage_config_for_crio kata-deploy: add storage configuration for cri-o	2024-07-15 12:29:21 +02:00
Hyounggyu Choi	08d2f6bfe4	osbuilder: Bump Fedora to 40 As Fedora 38 has reached EOL, we are encountering 404 errors for s390x, such as: ``` Status code: 404 for https://dl.fedoraproject.org/pub/fedora-secondary/updates/38/Everything/s390x/repodata/repomd.xml ``` Let's bump the OS to the latest version. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-15 09:58:54 +02:00
Fupan Li	a7179be31d	Merge pull request #9534 from Tim-Zhang/fix-stdin-stuck Fix ctr exec stuck problem	2024-07-15 13:19:19 +08:00
Dan Mihai	dded329d26	tests: k8s: SecurityContext.runAsUser policy test Add test for auto-generating policy for a pod spec that includes the SecurityContext.runAsUser field. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:37:58 +00:00
Dan Mihai	7040fb8c50	tests: k8s-security-context auto-generated policy Auto-generate the policy in k8s-security-context.bats - previously blocked by lacking support for PodSecurityContext.runAsUser. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:23:54 +00:00
Dan Mihai	f087044ecb	genpolicy: add support for runAsUser Add ability to auto-generate policy for SecurityContext.runAsUser and PodSecurityContext.runAsUser. Fixes: #8879 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:10:43 +00:00
Dan Mihai	5282701b5b	genpolicy: add link to allow_user() active issue Improve comment to workaround in rules.rego, to explain better the reason for that workaround. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-13 01:05:58 +00:00
GabyCT	3c0171df3d	Merge pull request #10005 from GabyCT/topic/katadragonball common: Add share fs information for dragonball	2024-07-12 16:10:29 -06:00
Wainer Moschetta	646d7ea4fb	Merge pull request #9951 from BbolroC/enable-attestation-for-ibm-se tests: Enable attestation e2e tests for IBM SE	2024-07-11 16:02:59 -03:00
Hyounggyu Choi	ca80301b4b	Merge pull request #10003 from BbolroC/skip-pod-shared-volume-for-ibm-se k8s: Skip shared-volume relevant tests for IBM SE	2024-07-11 19:29:13 +02:00
Gabriela Cervantes	4477b4c9dc	common: Add share fs information for dragonball This PR adds the share fs information for dragonball using kata-ctl to avoid the failures in runk tests saying that shared_fs is an unbound variable. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-11 17:09:35 +00:00
Dan Mihai	09c5ca8032	tests: k8s: clarify the need to use containerd.sock Modify the permissions of containerd.sock just when genpolicy needs access to this socket, when testing GENPOLICY_PULL_METHOD=containerd. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:49:58 +00:00
Dan Mihai	c1247cc254	tests: k8s: explain the default containerd settings Explain why the containerd settings on the local machine get set to containerd's defaults when testing GENPOLICY_PULL_METHOD=containerd. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:49:39 +00:00
Dan Mihai	3b62eb4695	tests: k8s: add comment for GENPOLICY_PULL_METHOD Explain why there are two different methods for pulling container images in genpolicy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:40:01 +00:00
Dan Mihai	eaedd21277	tests: k8s: use oci-distribution as default value oci-distribution is the value used by run-k8s-tests-on-aks.yaml, so use the same value as default for GENPOLICY_PULL_METHOD in gha-run.sh. The value of GENPOLICY_PULL_METHOD is currently compared just with "containerd", but avoid possible future problems due to using a different default value in gha-run.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-11 16:40:01 +00:00
GabyCT	2056eda5f0	Merge pull request #9922 from GabyCT/topic/updateblogname metrics: Update container name in blogbench test	2024-07-11 10:05:35 -06:00
Hyounggyu Choi	32c3e55cde	k8s: Skip shared-volume relevant tests for IBM SE Currently, it is not viable to share a writable volume (e.g., emptyDir) between containers in a single pod for IBM SE. The following tests are relevant: - pod-shared-volume.bats - k8s-empty-dirs.bats (See: https://github.com/kata-containers/kata-containers/issues/10002) This commit skips the tests until the issue is resolved. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-11 14:09:19 +02:00
Julien Ropé	b83d4e1528	kata-deploy: add storage configuration for cri-o Make sure that the "skip_mount_home" flag is set in cri-o config. Fixes: #9878 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-07-11 10:11:30 +02:00
Qi Feng Huo	4d66ee1935	initdata: add initdata annotation in hypervisor config - Add Initdata annotation for hypervisor config, so that it can be passed when CreateVM Signed-off-by: Qi Feng Huo <huoqif@cn.ibm.com>	2024-07-11 10:56:18 +08:00
GabyCT	dac07239f5	Merge pull request #9974 from squarti/sharedfs runtime: Initialize SharedFS for remote hypervisor	2024-07-10 17:03:00 -06:00
GabyCT	3827b5f9f2	Merge pull request #9982 from ChengyuZhu6/fix-ci tests: Delete test scripts forcely	2024-07-10 17:00:41 -06:00
Wainer Moschetta	deb4627558	Merge pull request #9975 from niteeshkd/nd_snp_attestation gha: enable SNP attestation	2024-07-10 18:59:05 -03:00
GabyCT	c40b3b4ce7	Merge pull request #9992 from sprt/fix-nydus ci: fix run-nydus tests	2024-07-10 13:56:16 -06:00
David Esparza	be9385342e	Merge pull request #9990 from GabyCT/topic/tdxtimeout gha: Increase timeout to run CoCo TDX tests	2024-07-10 13:21:23 -06:00
Silenio Quarti	8260ce8d15	runtime: Initialize SharedFS for remote hypervisor Sets SharedFS config to NoSharedFS for remote hypervisor in order to start the file watcher which syncs files from the host to the guest VMs. Signed-off-by: Silenio Quarti <silenio_quarti@ca.ibm.com>	2024-07-10 14:31:25 -03:00
Aurélien Bombo	25e0e2fb35	ci: fix run-nydus tests GH-9973 introduced: * New function get_kata_memory_and_vcpus() in tests/metrics/lib/common.bash. * A call to get_kata_memory_and_vcpus() from extract_kata_env(), which is defined in tests/common.bash. Because the nydus test only sources tests/common.bash, it can't find get_kata_memory_and_vcpus() and errors out. We fix this by moving the get_kata_memory_and_vcpus() call from tests/common.bash to tests/metrics/lib/json.bash so that it doesn't impact the nydus test. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-10 17:19:08 +00:00
Gabriela Cervantes	b6b8524ab7	gha: Increase timeout to run CoCo TDX tests This PR increases the timeout to run the CoCo TDX tests in order to avoid the random failures on TDX saying that The action 'Run tests' has timed out after 30 minutes and making the GHA job fail. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-10 16:06:07 +00:00
Niteesh Dubey	e8a3f8571e	docs: update for SNP attestation This updates how-to document for SNP attestation. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-10 15:06:55 +00:00
Niteesh Dubey	ff04154fdb	gha: enable SNP attestation This removes the code to skip the SNP attestation. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-10 15:06:55 +00:00
Hyounggyu Choi	d94b285189	tests: Enable k8s-confidential-attestation.bats for s390x For running a KBS with `se-verifier` in service, specific credentials need to be configured. (See https://github.com/confidential-containers/trustee/tree/main/attestation-service/verifier/src/se for details.) This commit introduces two procedures to support IBM SE attestation: - Prepare required files and directory structure - Set necessary environment variables for KBS deployment - Repackage a secure image once the KBS service address is determined These changes enable `k8s-confidential-attestation.bats` for s390x. Fixes: #9933 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	5d0f74cd70	local-build: Extract build_secure_image() as a separate library Currently, all functions in `build_se_image.sh` are dedicated to publishing a payload image. However, `build_secure_image()` is now also used for repackaging a secure image when a kernel parameter is reconfigured. This reconfiguration is necessary because the KBS service address is determined after the initial secure image build. This commit extracts `build_secure_image()` from `build_se_image.sh` and creates a separate library, which can be loaded by bats-core. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	bf2f0ea2ca	tests: Change a location for creating key.bin The current KBS deployment creates a file `key.bin` assuming that `kustomization.yaml` is located in `overlays/`. However, this does not hold true when the kustomize config is enabled for multiple architectures. In such cases, the configuration file should be located in `overlays/$(uname -m)`. This commit changes the location for file creation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	4025ef7193	versions: Bump trustee to multi-arch deployment for KBS As part of the enablement for s390x, KBS should support multi-arch deployment. This commit updates the version of coco-trustee to a commit where the support is implemented. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Hyounggyu Choi	856a1f72c6	packaging: Set ATTESTER to se-attester for guest components on s390x This commit allows the guest-components builder to only build se-attester on s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-10 16:18:37 +02:00
Xuewei Niu	7f71eac6de	Merge pull request #9868 from l8huang/dan runtime: implement DAN in Go kata-runtime	2024-07-10 19:09:46 +08:00
Alex Lyn	dafff26f01	Merge pull request #9814 from Apokleos/bugfix-pcipath runtime-rs: bugfix for root bus slot allocation	2024-07-10 16:19:06 +08:00
Steve Horsman	aa487307e8	Merge pull request #9962 from GabyCT/topic/removecif scripts: Eliminate CI variable as it is not longer used	2024-07-10 09:02:33 +01:00
Steve Horsman	78bbc51ff0	Merge pull request #9806 from niteeshkd/nd_snp_certs runtime: pass certificates to get extended attestation report for SNP coco	2024-07-10 08:57:45 +01:00
Steve Horsman	29413021e5	Merge pull request #9981 from stevenhorsman/run-k8s-tests-on-zvsi-inherit-secrets gha: make run-k8s-tests-on-zvsi inherit secrets	2024-07-10 08:49:11 +01:00
Lei Huang	171d298dea	runtime: implement DAN in Go kata-runtime The DAN feature has already been implemented in kata-runtime-rs, and this commit brings the same capability to the Go kata-runtime. Fixes: #9758 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-07-10 00:22:30 -07:00
ChengyuZhu6	489afffd8c	tests:gha: delete namespace before resetting namespace Delete the kata-containers-k8s-tests namespace before resetting the namespace to ensure that no deployments or services are restarting and creating pods in the default namespace. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Wang, Arron <arron.wang@intel.com>	2024-07-10 12:08:28 +08:00
ChengyuZhu6	e874c8fa2e	tests: Delete test scripts forcely Delete test scripts forcely in `Delete kata-deploy` step before deleting all kata pods. Fixes: #9980 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-10 12:08:28 +08:00
Alex Lyn	806e959b01	runtime-rs: bugfix for device slot allocation failed in dragonball In dragonball VFIO device passthrough scenarios, the first passthrough device is allocated slot 0, which is occupied by the root device. This causes an error like the one below: ``` ... 6: failed to add VFIO passthrough device: NoResource\n 7: no resource available for VFIO device"): unknown ... ``` To address this problem, we adopt another method with no pre-allocated guest device id and just let dragonball auto-allocate the guest device id and return it to the runtime. With this idea, add_device will return the value Result<DeviceType>, and the change is applied to the related code. Fixes #9813 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-10 10:59:57 +08:00
Alex Lyn	27947cbb0b	dragonball: make add vfio device return guest device id Fixes #9813 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-07-10 10:59:51 +08:00
Alex Lyn	fa4af09658	Merge pull request #9985 from GabyCT/topic/fixcrites cri-containerd: Remove use_devmapper variable for cri-containerd tests	2024-07-10 10:13:27 +08:00
Alex Lyn	e4997760f1	Merge pull request #9987 from kata-containers/remove_double_process_check_from_memory_usage_test metrics: Remove duplicate check of processes from memory test.	2024-07-10 10:12:18 +08:00
David Esparza	09f523c815	Merge pull request #9973 from kata-containers/add_memory_and_vcpus_info_to_results Add memory and vcpus info to metrics results	2024-07-09 18:05:07 -06:00
David Esparza	e77d44614b	metrics: Remove duplicate check of processes from memory test. This PR removes the common_init function call from the memory usage script to eliminate duplicate checking that is also done from the init_env function. It also eliminates duplication of nested conditionals. Fixes: #9984 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 12:34:51 -06:00
Gabriela Cervantes	7061272b4e	kernel: bump kata config version This PR bumps the kata config version as the kernel scripts were modified. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	de848c1458	packaging: Remove CI variable from build kernel script This PR removes the CI variable from the build kernel script; the variable is no longer supported, as it was part of the Jenkins environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	28601b51d2	tools: Remove CI variable in kata deploy in docker script This PR removes the CI variable from the kata deploy in docker script; the variable was supported in the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	f2b8c6619d	makefile: Remove CI variable from local build makefile This PR removes the CI variable from the local build makefile, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Gabriela Cervantes	4161fa3792	tools: Remove CI variable in test images script for osbuilder This PR removes the CI variable from the test images script for osbuilder, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 20:04:24 +02:00
Greg Kurz	7506d1ec29	tools: Remove CI variable in test config osbuilder script This PR removes the CI variable from the test config osbuilder script; the variable was supported in the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> [greg: squash all fixes into a single patch] Signed-off-by: Greg Kurz <groug@kaod.org>	2024-07-09 20:03:08 +02:00
Niteesh Dubey	647dad2a00	gha: skip SNP attestation test Skip the SNP attestation test for now. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 17:16:07 +00:00
Niteesh Dubey	e7b4e5e386	gha: add SNP attestation test This tests the attestation of SNP guest. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 17:14:26 +00:00
Gabriela Cervantes	1a1e62b968	cri-containerd: Remove use_devmapper variable for cri-containerd tests This PR removes the use_devmapper variable, which was part of the Jenkins environment flags and is no longer supported or available for the cri-containerd tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-09 17:09:55 +00:00
GabyCT	eb0bc5007c	Merge pull request #9976 from sprt/fix-cri-containerd tests: cri-containerd: Ensure Docker isn't present	2024-07-09 11:02:20 -06:00
David Esparza	04df85a44f	metrics: Add num_vcpus and free_mem to metrics results template. This PR retrieves the free memory and the vcpus count from a kata container and includes them in the json results file of any metric. Additionally this PR parses the requested vcpus quantity and the requested amount of memory from the kata configuration file and includes this pair of values in the json results file of any metric. Finally, the file system defined in the kata configuration file is included in the results template. Fixes: #9972 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
David Esparza	a554541495	metrics: Improvement to the description of certain functions. This PR rephrases the description and usage of certain functions such as: - set_kata_configuration_performance - set_kata_config_file - get_current_kata_config_file - check_if_root - check_ctr_images Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-07-09 10:29:29 -06:00
stevenhorsman	c7cf26fa32	gha: make run-k8s-tests-on-zvsi inherit secrets run-k8s-tests-on-zvsi runs the coco tests and we've added new secrets to provide credentials for the authenticated image testing, so we need to let the zvsi job inherit these from the caller workflow like the rest of the coco tests Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-09 15:29:48 +01:00
Hyounggyu Choi	37b907dfbc	Merge pull request #9859 from BbolroC/set-ocispec-for-vfio-ap tests: Extend vfio-ap hotplug test to use a zcrypttest tool	2024-07-09 14:03:45 +02:00
Steve Horsman	ff498c55d1	Merge pull request #9719 from fitzthum/sealed-secret Support Confidential Sealed Secrets (as env vars)	2024-07-09 09:43:51 +01:00
Niteesh Dubey	529660fafb	runtime: pass certificates for SNP coco This will be used to get extended attestation report. Fixes: #9805 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-07-09 03:46:00 +00:00
Tim Zhang	704da86e9b	CI: Add tests for stdio Add tests for stdio Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-07-09 11:44:40 +08:00
Tim Zhang	8801554889	runtime-rs: Fix ctr exec stuck problem Fixes: #9532 Instead of calling agent.close_stdin in close_io, we call agent.write_stdin with 0 len data when the stdin pipe ends. Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-07-09 11:44:36 +08:00
Tobin Feldman-Fitzthum	1c2d69ded7	tests: add test for sealed env secrets The sealed secret test depends on the KBS to provide the unsealed value of a vault secret. This secret is provisioned to an environment variable. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-07-08 17:41:20 -05:00
Linda Yu	b4d61f887b	agent: unittest for sealed secret as env in kata To test unsealing secrets stored in environment variables, we create a simple test server that takes the place of the CDH. We start this server and then use it to unseal a test secret. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-07-08 17:32:45 -05:00
Linda Yu	6003608fe6	agent: support sealed secret as env in kata When sealed-secret is enabled, the Kata Agent intercepts environment variables containing sealed secrets and uses the CDH to unseal the value. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-07-08 17:31:33 -05:00
Gabriela Cervantes	cf2d5ff4c1	scrips: Fix indentation in QAT run script This PR fixes the indentation of the QAT run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:23:50 +00:00
Gabriela Cervantes	d53eb61856	QAT: Remove CI variable from QAT run script This PR removes the CI variable from the QAT run script; the variable was used in the Jenkins environment and is no longer used. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:16:00 +00:00
Gabriela Cervantes	8a79b1449e	tests: Remove CI variable in tracing test This PR removes the CI variable and the instructions related to it, as it was part of the Jenkins environment, which is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:12:41 +00:00
Gabriela Cervantes	9d44abb406	tests: Remove CI variable in test agent shutdown This PR removes the CI variable and the instructions related to it; the variable was used in the Jenkins environment and is no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:10:24 +00:00
Gabriela Cervantes	f2ed8dc568	docs: Remove CI variable from Intel QAT documentation This PR updates the Intel QAT documentation by removing the CI variable, which is no longer supported, as it was part of the Jenkins CI environment. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:05:47 +00:00
Gabriela Cervantes	ff06ef0bbc	scripts: Eliminate CI variable as it is not longer used This PR removes the CI variable, which is no longer used or valid in the Kata Containers repository. The CI variable was used when we relied on Jenkins and script setups, which are no longer supported. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-08 20:00:30 +00:00
GabyCT	cb0fb91bdd	Merge pull request #9966 from GabyCT/topic/fixstability tests: Use variable already defined in metrics common script for stability tests	2024-07-08 13:55:55 -06:00
Aurélien Bombo	e9d6179b28	tests: cri-containerd: Ensure Docker isn't present Following #9960 that transitioned this test to a free runner, we need to ensure Docker isn't installed on the system as that will conflict with the installation of Podman. Example error: https://github.com/kata-containers/kata-containers/actions/runs/9818218975/job/27177785716 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-08 18:50:57 +00:00
Steve Horsman	e8836fafaa	Merge pull request #9828 from stevenhorsman/image-rs-bump-bad84c7 Image rs bump to latest main	2024-07-08 17:07:59 +01:00
Fabiano Fidêncio	67ba0ad0ad	Merge pull request #9971 from GabyCT/topic/fixnerdctldep gha: Fix pip installation for nerdctl GHA	2024-07-06 21:37:55 +02:00
Gabriela Cervantes	724b2c612c	gha: Fix pip installation for nerdctl GHA This PR fixes the pip installation for nerdctl by removing a flag that is no longer supported, avoiding the failure "no such option: --break-system-packages". Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-05 17:31:52 +00:00
stevenhorsman	1d6c1d1621	test: Add journal logging for debug - Due to the error we hit with pulling the agnhost image used in the liveness-probe tests, we want to leave the console printing to help with debug when we next try to bump the image-rs version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 10:25:28 +01:00
stevenhorsman	d511820974	agent: Bump image-rs - Bump the commit of image-rs we are pulling in to 413295415 Note: This is the last commit before a change to whiteout handling was introduced that led to the error `failed to unpack: convert whiteout` when pulling the agnhost:2.21 image Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 10:25:28 +01:00
Fabiano Fidêncio	543c90f145	Merge pull request #9695 from ChengyuZhu6/fix-init Fix issues on CI about guest-pull	2024-07-05 11:21:08 +02:00
ChengyuZhu6	65dc12d791	tests: Re-enable k8s-kill-all-process-in-container.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	2ea521db5e	tests:tdx: Re-enable k8s-liveness-probes.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	93453c37d6	tests: Re-enable k8s-sysctls.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	6c5e053dd5	tests: Re-enable k8s-shared-volume.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	85979021b3	tests: Re-enable k8s-file-volume.bats This test was fixed by previous patches in this PR: kata-containers#9695 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	e71c7ab932	agent/image: Remove functions about merging container spec for guest pull Let me explain why: In our previous approach, we implemented guest pull by passing PullImageRequest to the guest. However, this method resulted in the loss of specifications essential for running the container, such as commands specified in YAML, during the CreateContainer stage. To address this, it is necessary to integrate the OCI specifications and process information from the image’s configuration with the container in guest pull. The snapshotter method is not affected by this issue. Nevertheless, a problem arises when two containers in the same pod attempt to pull the same image, like InitContainer. This is because the image service searches for the existing configuration, which resides in the guest. The configuration, associated with <image name, cid>, is stored in the directory /run/kata-containers/<cid>. Consequently, when the InitContainer finishes its task and terminates, the directory ceases to exist. As a result, during the creation of the application container, the OCI spec and process information cannot be merged due to the absence of the expected configuration file. Fixes: kata-containers#9665 Fixes: kata-containers#9666 Fixes: kata-containers#9667 Fixes: kata-containers#9668 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
ChengyuZhu6	c9d1a758cd	agent/image: Reuse the mountpoint in image-rs Currently, the image is pulled by image-rs in the guest and mounted at `/run/kata-containers/image/cid/rootfs`. Finally, the agent rebinds `/run/kata-containers/image/cid/rootfs` to `/run/kata-containers/cid/rootfs` in CreateContainer. However, this process requires specific cleanup steps for these mount points. To simplify, we reuse the mount point `/run/kata-containers/cid/rootfs` and allow image-rs to directly mount the image there, eliminating the need for rebinding. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-07-05 08:10:04 +08:00
stevenhorsman	05cd1cc7a0	agent: Add CreateContainer support for pre-pulled bundle - Add a check in setup_bundle to see if the bundle already exists and if it does then skip the setup. This commit is cherry-picked from `44ed3ab80e`. The reason that k8s-kill-all-process-in-container.bats failed is that deletion of the directory `/root/kata-containers/cid/rootfs` failed while removing the container, because it was mounted twice (once in image-rs and once in set_bundle) and only unmounted once when removing the container. Fixes: #9664 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Dave Hay <david_hay@uk.ibm.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-05 08:10:00 +08:00
Zvonko Kaiser	7990d3a154	dragonball: Update kata config version Mandatory update Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:24:16 +00:00
Zvonko Kaiser	cfbca4fe0d	dragonball: Update versions Use the latest guest kernel that we use for all other VMMs Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:24:16 +00:00
Zvonko Kaiser	26446d1edb	dragonball: Update patches After v5.14 there is no cpu_hotplug_begin function; it is now cpus_write_lock. Likewise, cpu_hotplug_done is now cpus_write_unlock. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:23:24 +00:00
Zvonko Kaiser	ad574b7e10	dragonball: Add patches for 6.1.x Ported the 5.10 patches to 6.1.x Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-07-04 17:06:39 +00:00
Gabriela Cervantes	757f37d956	stability: General improvements for soak parallel test This PR has better variable definitions as well as the use of a variable which is already defined in the metrics common script for soak parallel test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:32:46 +00:00
Gabriela Cervantes	6d56abbdad	stability: General improvements to agent stability test This PR is for better variable definitions as well as the use of the CTR_EXE variable which is already defined in the metrics common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:24:27 +00:00
Gabriela Cervantes	3e6c32c3c8	tests: Use variable already defined in stability tests This PR uses the CTR_EXE which is already defined in the metrics common script to have uniformity across the multiple stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-04 16:21:24 +00:00
Steve Horsman	ddb8a94677	Merge pull request #9960 from sprt/fix-garm ci: Transition GARM tests to free runners, pt. I	2024-07-04 09:04:58 +01:00
Biao Lu	6c1a2f01f8	protocols: add support for sealed_secret service To unseal a secret, the Kata agent will contact the CDH using ttRPC. Add the proto that describes the sealed secret service and messages that will be used. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com> Signed-off-by: Biao Lu <biao.lu@intel.com>	2024-07-04 01:03:41 -05:00
Fabiano Fidêncio	49696bbdf2	Merge pull request #9943 from AdithyaKrishnan/nydus-cleanup-timeout tests: Fixes TEE timeout issue	2024-07-03 22:57:17 +02:00
Anastassios Nanos	db75b5f3c4	Merge pull request #8070 from nubificus/feat_add-fc-runtime-rs runtime-rs: firecracker hypervisor backend	2024-07-03 22:29:30 +03:00
Adithya Krishnan Kannan	9250858c3e	tests: Stop trying to patch finalize We have not seen instances of the nydus snapshotter hanging on its deletion that we must patch its finalize. Let's just drop this line for now. Signed-Off-By: Adithya Krishnan Kannan <AdithyaKrishnan.Kannan@amd.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-07-03 12:19:26 -05:00
Dan Mihai	ada53744ea	Merge pull request #9907 from microsoft/saulparedes/allow_empty_env_vars genpolicy: allow some empty env vars	2024-07-03 08:07:23 -07:00
Aurélien Bombo	f18e35014f	ci: Move `run-nerdctl-tests` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:58:11 +00:00
Aurélien Bombo	c0919d6f45	ci: Move `run-docker-tests` to free runner Removed the Docker installation step as that's preinstalled in free runners. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:59 +00:00
Aurélien Bombo	743a765525	ci: Move `run-runk` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:48 +00:00
Aurélien Bombo	09cce86cc7	ci: Move `run-nydus` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:42 +00:00
Aurélien Bombo	9e1b6064dc	ci: Move `run-containerd-stability` to free runner Removes the Docker installation step as that's preinstalled on the free runner: https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2204-Readme.md#tools Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:37 +00:00
Aurélien Bombo	6a0e403acf	ci: Move `run-cri-containerd` to free runner See #9940. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-03 14:57:29 +00:00
George Pyrros	2d19f3fbd7	runtime-rs: firecracker hypervisor backend Add a basic runtime-rs `Hypervisor` trait implementation for AWS Firecracker - Add basic hypervisor operations (setup / start / stop / add_device) - Implement AWS Firecracker API on a separate file `fc_api.rs` - Add support for running jailed (include all sandbox-related content) - Add initial device support (limited as hotplug is not supported) - Add separate config for runtime-rs (FC) Notes: - devmapper is the only snapshotter supported - to account for no sharefs support, we copy files in the sandbox (as in the GO runtime) - nerdctl spawn is broken (TODO: #7703) Fixes: #5268 Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk> Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk> Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk> Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>	2024-07-03 08:30:30 +00:00
GabyCT	e3e3873857	Merge pull request #9954 from GabyCT/topic/sysbenchci metrics: Remove variable in sysbench that is not being used	2024-07-02 16:58:46 -06:00
Aurélien Bombo	eda5d2c623	ci: cleanup: Run every 24 hours instead of 6 hours Resources don't fail to get deleted as often to need to run every 6 hours. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-02 22:27:58 +00:00
Aurélien Bombo	f20924db24	ci: cleanup: Ignore nonexisting resources Some resource names seem to be lingering in Azure limbo but do not map to any actual resources, so we ignore those. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-07-02 22:23:54 +00:00
GabyCT	0590aab3e6	Merge pull request #9952 from GabyCT/topic/unitjenkins docs: Remove jenkins reference from unit testing presentation	2024-07-02 15:34:25 -06:00
Aurélien Bombo	33d08a8417	Merge pull request #9825 from microsoft/mahuber/main osbuilder: allow rootfs builds w/o git or version file deps	2024-07-02 09:38:13 -07:00
Steve Horsman	078a1147a6	Merge pull request #9909 from kata-containers/sprt/gha-cleanup-pt2 ci: Add scheduled job to cleanup resources, pt. II	2024-07-02 17:12:03 +01:00
Gabriela Cervantes	b7da1291ea	metrics: Remove variable in sysbench that is not being used This PR removes the CI_JOB variable, which was previously used but is no longer supported, from the metrics sysbench test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-02 15:29:50 +00:00
Wainer Moschetta	ec695f67e1	Merge pull request #9577 from microsoft/saulparedes/topology genpolicy: add topologySpreadConstraints support	2024-07-02 11:24:26 -03:00
Fabiano Fidêncio	ef3f6515cf	Merge pull request #9941 from sprt/temp-disable-test ci: Temporarily disable kata-deploy and GARM tests	2024-07-02 14:13:46 +02:00
Amulya Meka	dd12089e0d	Merge pull request #9914 from Amulyam24/qemu-fix kata-deploy: fix qemu static build on ppc64le	2024-07-02 10:45:03 +05:30
Saul Paredes	f3f3caa80a	genpolicy: update sample Update pod-one-container.yaml sample Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-07-01 13:49:08 -07:00
Dan Mihai	75aee526a9	genpolicy: add topologySpreadConstraints support Allow genpolicy to process Pod YAML files including topologySpreadConstraints. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-07-01 13:32:49 -07:00
Gabriela Cervantes	c270df7a9c	docs: Remove jenkins reference from unit testing presentation This PR removes the jenkins reference from unit testing presentation as this is no longer supported on the kata containers project. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-01 20:26:35 +00:00
GabyCT	e94490232e	Merge pull request #9949 from cmaf/tests-fix-openvino-help tests: Update help section in openvino test	2024-07-01 13:31:51 -06:00
Gabriela Cervantes	e3318a04f7	metrics: Update container name in blogbench test This PR updates the container name to use a random name instead of a hard-coded name. This PR is a general improvement to avoid random failures, especially when we are running on baremetal environments. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-07-01 19:28:16 +00:00
Fabiano Fidêncio	05848d0c34	Merge pull request #9930 from likebreath/0627/clh_v40.0 Upgrade to Cloud Hypervisor v40.0	2024-07-01 20:04:47 +02:00
Steve Horsman	4fd820abd2	Merge pull request #9947 from stevenhorsman/fix-cleanups-workflow-secret gha: ci: Remove incorrect secrets line	2024-07-01 16:30:37 +01:00
Chelsea Mafrica	0b83c8549a	tests: Update help section in openvino test Test reports that it is a onednn test when it is openvino; update description. Fixes: #9948 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-07-01 14:24:50 +00:00
Hyounggyu Choi	795c5dc0ff	tests: Extend vfio-ap hotplug test to use zcrypttest This commit extends the vfio-ap hotplug test to include the use of `zcrypttest`. A newly introduced test by the tool consists of several test rounds as follows: - ioctl_test - simple_test - simple_one_thread_test - simple_multi_threads_test - multi_thread_stress_test - hang_after_offline_online_test A writable root filesystem is required for testing because the reference count needs to be reset after each test round. The current containerd kata containers support does not include `--privileged_without_host_devices`, which is necessary to configure a writable filesystem along with `--privileged`. (Please check out https://github.com/kata-containers/kata-containers/issues/9791 for details) So `crictl` is chosen to extend the test. The commit also includes the removal of old commands previously used for the tests repository but no longer in use. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:41:59 +02:00
Hyounggyu Choi	5bda197e9d	tests: Add zcrypttest tool to test image Dockerfile This commit copies an internal testing tool `zcrypttest` to the test image. A base image is changed to `ubuntu:22.04` due to a library dependency issue. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:40:49 +02:00
Hyounggyu Choi	99690ab202	runtime: Instantiate/pass vfio-ap device to ociSpec This commit adds the missing step of passing an attached vfio-ap device to a container via ociSpec. It instantiates and passes a vfio-ap device (e.g. a Z crypto device). A device at `/dev/z90crypt` covers all use cases at the time of writing. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-07-01 11:40:49 +02:00
Amulyam24	259ec408b5	kata-deploy: fix qemu static build for v8.2.1 on ppc64le Do not install the packages librados-dev and librbd-dev as they are not needed for building static qemu. Add machine option cap-ail-mode-3=off while creating the VM to qemu cmdline. Fixes: #9893 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-07-01 14:56:43 +05:30
stevenhorsman	16130e473c	gha: ci: Remove incorrect secrets line The CI is failing with: ``` Invalid workflow file: .github/workflows/cleanup-resources.yaml#L10 The workflow is not valid. .github/workflows/cleanup-resources.yaml (Line: 10, Col: 5): Unexpected value 'secrets' ``` I think this is because `secrets: inherit` is only applicable when re-using a workflow, not for a standalone job like we have here. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-07-01 09:32:58 +01:00
Hyounggyu Choi	f0187ff969	Merge pull request #9932 from BbolroC/drop-ci-install-go CI: Eliminate dependency on tests repo	2024-07-01 08:24:28 +02:00
Hyounggyu Choi	f2bfc306a2	Merge pull request #9936 from BbolroC/use-quay-lpine-bash-curl CI: Use multi-arch image for alpine-bash-curl	2024-07-01 08:02:01 +02:00
Manuel Huber	4b2e725d03	rootfs: Install Rust only when necessary For docker-based builds only install Rust when necessary. Further, execute the detect Rust version check only when intending to install Rust. As of today, this is the case when we intend to build the agent during rootfs build. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-06-28 22:19:46 +00:00
Aurélien Bombo	c605fff4c1	ci: Temporarily disable kata-deploy and GARM tests Per the decision taken in the 6/27 AC meeting, this PR temporarily disables kata-deploy and GARM tests until we secure further Azure CI funding. In the meantime, I'll transition the GARM tests to free runners and reenable them to regain that coverage without affecting spending (see #9940). If it turns out the free runners are too slow, we'll switch back to GARM. After funding is secured, we'll reenable the kata-deploy tests (see #9939). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-28 20:23:07 +00:00
Hyounggyu Choi	dd23beeb05	CI: Eliminating dependency on clone_tests_repo() As part of archiving the tests repo, we are eliminating the dependency on `clone_tests_repo()`. The scripts using the function are as follows: - `ci/install_rust.sh` - `ci/setup.sh` - `ci/lib.sh` This commit removes or replaces the files, and makes an adjustment accordingly. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 14:52:02 +02:00
Hyounggyu Choi	f2c5f18952	CI: Use multi-arch image for alpine-bash-curl A multi-arch image for `alpine-bash-curl` has been pushed to and available at `quay.io/kata-containers`. This commit switches the test image to `quay.io/kata-containers/alpine-bash-curl`. Fixes: #9935 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 12:01:53 +02:00
Hyounggyu Choi	0e20f60534	CI: Drop unused scripts The following scripts are not used by the repository any more: - ci/install_go.sh - ci/run.sh - ci/install_vc.sh Additionally, they rely on the tests repo, which is soon to be archived. This commit drops the unused scripts. Fixes: #8507 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-28 07:55:21 +02:00
Archana Shinde	82a1892d34	agent: Add additional info while returning errors for update_interface This should provide additional context for errors while updating network interface. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-06-27 12:56:53 -07:00
Archana Shinde	2127288437	agent: Bring interface down before renaming it. In case we are dealing with multiple interfaces and there exists a network interface with a conflicting name, we temporarily rename it to avoid name conflicts. Before doing this, we need to bring the interface down. Failure to do so results in netlink returning Resource busy errors. The resource needs to be down for subsequent operation when the name is swapped back as well. This solves the issue of passing multiple networks in case of nerdctl as: nerdctl run --rm --net foo --net bar docker.io/library/busybox:latest ip a Fixes: #9900 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-06-27 12:56:53 -07:00
Zvonko Kaiser	a32b21bd32	Merge pull request #9918 from zvonkok/build-error rootfs: Fix spurious error	2024-06-27 19:46:51 +02:00
Bo Chen	25e3cab028	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v40.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #9929 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-27 09:59:00 -07:00
Bo Chen	ad92d73e43	versions: Upgrade to Cloud Hypervisor v40.0 Details of this release can be found in our roadmap project as iteration v40.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #9929 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-27 09:40:13 -07:00
Alex Lyn	d66c214ae7	Merge pull request #9849 from markyangcc/main runtime: fix missing of VhostUserDeviceReconnect parameter assignment	2024-06-27 21:48:37 +08:00
Wainer Moschetta	afc1c1a782	Merge pull request #9896 from fitzthum/bump-gc-090 versions: bump coco guest components and trustee	2024-06-27 09:46:06 -03:00
Zvonko Kaiser	29bb9de864	Merge pull request #9923 from BbolroC/increase-interval-max-tries-kubectl tests: Increase interval and max_tries for kubectl_retry	2024-06-27 09:49:24 +02:00
Hyounggyu Choi	4ec355fb78	tests: Increase interval and max_tries for kubectl_retry Observed instability in the API server after deploying kata-deploy caused test failures. (see: https://github.com/kata-containers/kata-containers/actions/runs/9681494440/job/26743286861) Specifically, `kubectl_retry logs` failed before the API server could respond properly. This commit increases the interval and max_tries for kubectl_retry(), allowing sufficient time to handle this situation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-27 08:39:22 +02:00
Aurélien Bombo	2c89828749	ci: Add scheduled job to cleanup resources, pt. II Follow-up to #9898 and final PR of this set. This implements the actual deletion logic. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-26 17:36:47 +00:00
Zvonko Kaiser	893fd2b59c	Merge pull request #9916 from zvonkok/config-fix gpu: Missing separator	2024-06-26 14:46:47 +02:00
Greg Kurz	fe7ef878d2	Merge pull request #9913 from gkurz/update-kata-ctl-deps kata-ctl: Update Cargo.lock	2024-06-26 14:31:03 +02:00
Zvonko Kaiser	30ec78b19a	rootfs: Fix spurious error In some DMZ'ed or CI systems the repos are not up to date and multistrap fails to find the ubuntu-keyring package. Update the repos to fix this; Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-26 11:10:58 +00:00
Zvonko Kaiser	e0aa54301f	gpu: Missing separator Add the correct separator for replacement Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-26 10:40:35 +00:00
Greg Kurz	ac33a389c0	Merge pull request #9879 from pmores/remove-dependency-on-containerd-bundle-dir-tree runtime-rs: remove attempt to access sandbox bundle from container bu…	2024-06-26 10:57:50 +02:00
Greg Kurz	db7b2f7aaa	kata-ctl: Update Cargo.lock A previous change missed to refresh Cargo.lock. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-26 08:27:52 +02:00
Tobin Feldman-Fitzthum	dd8605917b	versions: bump coco guest components and trustee Pick up the changes from the newest version of guest-components and trustee. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-06-25 23:56:18 +00:00
GabyCT	81d23a1865	Merge pull request #9897 from GabyCT/topic/montime tests: Increase timeout to crictl calls on kata monitor tests	2024-06-25 17:27:15 -06:00
Gabriela Cervantes	a8432880f8	tests: Increase timeout to crictl calls on kata monitor tests This PR increases the timeout for crictl calls in the kata monitor tests to avoid hitting issues every now and then and to avoid random failures. This PR is very similar to PR #7640. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-25 22:32:47 +00:00
Wainer Moschetta	c4fb6fbda2	Merge pull request #9887 from ldoktor/ci-kata-runtime ci.ocp: Ensure we smoke-test with the right runtime class	2024-06-25 15:27:27 -03:00
Fabiano Fidêncio	fb44edc22f	Merge pull request #9906 from stevenhorsman/TEE-sample-kbs-policy-guards tests: attestation: Restrict sample policy use	2024-06-25 20:27:13 +02:00
Steve Horsman	c9df743dab	Merge pull request #9898 from sprt/gha-cleanup-job ci: Add scheduled job to cleanup resources, pt. I	2024-06-25 19:11:30 +01:00
Saul Paredes	ce19419d72	genpolicy: allow some empty env vars Updated genpolicy settings to allow 2 empty environment variables that may be forgotten to specify (AZURE_CLIENT_ID and AZURE_TENANT_ID) Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-25 10:53:05 -07:00
Aurélien Bombo	0582a9c75b	Merge pull request #9864 from 3u13r/feat/genpolicy/layers-cache-file-path genpolicy: allow specifying layer cache file	2024-06-25 10:42:22 -07:00
Aurélien Bombo	d60b548d61	ci: Add scheduled job to cleanup resources This is the first part of adding a job to clean up potentially dangling Azure resources. This will be based on Jeremi's tool from https://github.com/jepio/kata-azure-automation. At first, we'll only clean up AKS clusters, as this is what has been causing us problems lately, but this could very well be extended to cleaning up entire resource groups, which is why I left the different names pretty generic (i.e. "resources" instead of "clusters"). Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-06-25 16:33:03 +00:00
stevenhorsman	7610b34426	tests: attestation: Restrict sample policy use - We only want to enable the sample verifier in the KBS for non-TEE tests, so prevent an edge case where the TEE platform isn't set up correctly and we might fall back to the sample and get false positives. To prevent this we add guards around the sample policy enablement and only run it for non confidential hardware Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-25 16:59:40 +01:00
Steve Horsman	d574d37c4b	Merge pull request #9903 from stevenhorsman/authenticated-regsitry-workflow-secrets workflow: coco: Add auth registry secret	2024-06-25 16:40:46 +01:00
stevenhorsman	d8961cbd4a	workflow: coco: Add auth registry secret - Add the `AUTHENTICATED_IMAGE_USER` and `AUTHENTICATED_IMAGE_PASSWORD` repository secrets as env vars to the coco tests, so we can use them to pull images from an authenticated registry for testing Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-25 11:11:02 +01:00
Alex Lyn	2c5b3a5c20	Merge pull request #9830 from gaohuatao-1/ght/count-rs runtime-rs: fix the bug of func count_files	2024-06-25 15:00:46 +08:00
GabyCT	27d75f93e2	Merge pull request #9872 from GabyCT/topic/varmemin metrics: Improve variable definition in memory inside containers script	2024-06-24 15:30:05 -06:00
Aurélien Bombo	b0cdf4eb0d	Merge pull request #9579 from microsoft/saulparedes/add_seccomp_support genpolicy: ignore SeccompProfile in PodSpec	2024-06-24 08:58:01 -07:00
Wainer Moschetta	bcdc4fde10	Merge pull request #9857 from wainersm/disable_failing_jobs-part2 CI: disable jobs that failed >= 50% on nightly CI recently - part 2	2024-06-24 10:11:05 -03:00
Leonard Cohnen	6a3ed38140	genpolicy: allow specifying layer cache file Add --layers-cache-file-path flag to allow the user to specify where the cache file for the container layers is saved. This allows e.g. to have one cache file independent of the user's working directory. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-06-24 14:53:27 +02:00
Fabiano Fidêncio	3adf9e250f	Merge pull request #9875 from zvonkok/gha-no-sudo-arm64 ci: gha no sudo arm64	2024-06-21 15:28:54 +02:00
Wainer Moschetta	f7e0d6313b	Merge pull request #9865 from wainersm/qemu-coco-dev_updates runtime: updates to qemu-coco-dev configuration	2024-06-21 10:14:30 -03:00
Fabiano Fidêncio	2d552800f2	Merge pull request #9876 from zvonkok/gha-no-sudo-s390x ci: remove sudo from s390x build	2024-06-21 15:00:31 +02:00
Saul Paredes	44afb4aa5f	genpolicy: ignore SeccompProfile in PodSpec Ignore SeccompProfile in PodSpec Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-20 09:42:17 -07:00
Dan Mihai	7aeaf2502a	Merge pull request #9856 from microsoft/danmihai1/new-policy-rules genpolicy: reject untested CreateContainer field values	2024-06-20 09:34:53 -07:00
GabyCT	9320c2e484	Merge pull request #9845 from GabyCT/topic/fixartifacts gha: Do not fail when collecting artifacts	2024-06-20 10:15:53 -06:00
Hyounggyu Choi	959a277dc5	Merge pull request #9886 from BbolroC/kernel-config-uv-uapi-s390x kernel: Add CONFIG_S390_UV_UAPI for s390x	2024-06-20 16:05:15 +02:00
Steve Horsman	d5b4da7331	Merge pull request #9881 from stevenhorsman/remote-hypervisor-policy runtime: Support policy in remote hypervisor	2024-06-20 14:01:29 +01:00
Hyounggyu Choi	9cb12dfa88	kernel: Add CONFIG_S390_UV_UAPI for s390x While enabling the attestation for IBM SE, it was observed that a kernel config `CONFIG_S390_UV_UAPI` is missing. This config is required to present an ultravisor in the guest VM. This commit adds the missing config. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-20 13:15:33 +02:00
Lukáš Doktor	b08c019003	ci.ocp: Ensure we smoke-test with the right runtime class we do encourage people to set the KATA_RUNTIME, but it is only used by the webhook. Let's define it in the main `test.sh` and use it in the smoke test to ensure the user-defined runtime is smoke-tested rather than hard-coded kata-qemu one. Related to: #9804 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-06-20 11:15:02 +02:00
Fabiano Fidêncio	0f2a4d202e	Merge pull request #9884 from fidencio/topic/re-enable-tdx-ci ci: tdx: Re-enable TDX CI	2024-06-20 06:39:06 +02:00
GabyCT	02075f73e9	Merge pull request #9874 from GabyCT/topic/fixvarnerdctl tests: nerdctl: Fix variables names and remove network	2024-06-19 13:43:25 -06:00
Fabiano Fidêncio	2bab0f31d7	ci: tdx: Re-enable TDX CI Now, using vanilla kubernetes, let's re-enable the TDX CI and hope it becomes more stable than it used to be. The cleanup-snapshotter is now taking ~4 minutes, and that matches with the other platforms, mainly considering there's a sum of 210 seconds sleep in the process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-19 20:08:28 +02:00
Greg Kurz	81972f6ffc	Merge pull request #9149 from ryansavino/upgrade-to-qemu-8.2.1 qemu: upgrade to 8.2.4	2024-06-19 19:10:02 +02:00
stevenhorsman	779754dcf6	runtime: Support policy in remote hypervisor Move the `sandbox.agent.setPolicy` call out of the remoteHypervisor if block, so we can use the policy implementation on peer pods Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-19 16:43:53 +01:00
Fabiano Fidêncio	f9862e054c	Merge pull request #9882 from fidencio/topic/ci-tdx-use-vanilla-k8s ci: tdx: Use vanilla k8s instead of k3s	2024-06-19 17:33:00 +02:00
Pavel Mores	6a4919eeb9	runtime-rs: fix misleading log message get_vmm_master_tid() currently returns an error with the message "cannot get qemu pid (though it seems running)" when it finds a valid QemuInner::qemu_process instance but fails to extract the PID out of it. This condition however in fact means that a qemu child process was running (otherwise QemuInner::qemu_process would be None) but isn't anymore (id() returns None). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:15:24 +02:00
Pavel Mores	af5492e773	runtime-rs: made Qemu::stop_vm() idempotent Since Hypervisor::stop_vm() is called from the WaitProcess request handling which appears to be per-container, it can be called multiple times during kata pod shutdown. Currently the function errors out on any subsequent call after the initial one since there's no VM to stop anymore. This commit makes the function tolerate that condition. While it seems conceivable that sandbox shouldn't be stopped by WaitProcess handling, and the right fix would then have to happen elsewhere, this commit at least makes qemu driver's behaviour consistent with other hypervisor drivers in runtime-rs. We also slightly improve the error message in case there's no QemuInner::qemu_process instance. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:15:24 +02:00
Pavel Mores	5fbbff9e5e	runtime-rs: remove attempt to access sandbox bundle from container bundle Since no objections were raised in the linked issue (#9847) this commit removes the attempt to derive sandbox bundle path from container bundle path. As described in more detail in the linked issue, this is container runtime specific and doesn't seem to serve any purpose. As for implementation, we hoist the only part of get_shim_info_from_sandbox() that's still useful (getting the socket address) directly into the caller and remove the function altogether. Fixes #9847 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-19 17:09:15 +02:00
Fabiano Fidêncio	7127178acc	ci: tdx: Use vanilla k8s instead of k3s We've noticed a bunch of issues related to deploying and deleting the nydus-snapshotter. As we don't see the same issues on other machines using vanilla kubernetes, let's avoid using k3s for now and follow the flow. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-19 16:56:15 +02:00
Zvonko Kaiser	beab17f765	Merge pull request #9877 from zvonkok/gha-no-sudo-ppc64 ci: gha no sudo ppc64	2024-06-19 14:02:05 +02:00
Zvonko Kaiser	d783ddaf03	ci: Remove not needed chown for ppc64 Now that all artifacts are owned by $USER no extra step needed to adjust ownership Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:56:45 +00:00
Zvonko Kaiser	5bc37e39d5	ci: remove sudo from ppc64 build We can now do the same for ppc64 that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:55:45 +00:00
Zvonko Kaiser	c341234c0b	ci: remove sudo from s390x build We can now do the same for s390x that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:53:33 +00:00
Zvonko Kaiser	3beb460a97	ci: Remove not needed chown for arm64 Now that all artifacts are owned by $USER no extra step needed to adjust ownership Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:48:00 +00:00
Zvonko Kaiser	445b389b16	ci: remove sudo from arm64 build We can now do the same for arm64 that we did for amd64 and remove the sudo cp. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-19 07:46:51 +00:00
Gabriela Cervantes	6ec7971f7a	tests: nerdctl: Fix variables names and remove network This PR fixes the variable names for the network that was created and removes the networks that were created for the tests, to ensure a clean environment when running all the tests and to avoid failures, especially on bare-metal environments where the network already exists. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 23:00:49 +00:00
Dan Mihai	4df66568cf	genpolicy: reject untested CreateContainer field values Reject CreateContainerRequest field values that are not tested by Kata CI and that might impact the confidentiality of CoCo Guests. This change uses a "better safe than sorry" approach to untested fields. It is very possible that in the future we'll encounter reasonable use cases that will either: - Show that some of these fields are benign and don't have to be verified by Policy, or - Show that Policy should verify legitimate values of these fields These are the new CreateContainerRequest Policy rules: count(input.shared_mounts) == 0 is_null(input.string_user) i_oci := input.OCI is_null(i_oci.Hooks) is_null(i_oci.Linux.Seccomp) is_null(i_oci.Solaris) is_null(i_oci.Windows) i_linux := i_oci.Linux count(i_linux.GIDMappings) == 0 count(i_linux.MountLabel) == 0 count(i_linux.Resources.Devices) == 0 count(i_linux.RootfsPropagation) == 0 count(i_linux.UIDMappings) == 0 is_null(i_linux.IntelRdt) is_null(i_linux.Resources.BlockIO) is_null(i_linux.Resources.Network) is_null(i_linux.Resources.Pids) is_null(i_linux.Seccomp) i_linux.Sysctl == {} i_process := i_oci.Process count(i_process.SelinuxLabel) == 0 count(i_process.User.Username) == 0 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-18 18:09:31 +00:00
Wainer Moschetta	cf372f41bf	Merge pull request #9869 from fidencio/topic/disable-tdx-ci ci: tdx: Disable TDX CI	2024-06-18 14:47:38 -03:00
Gabriela Cervantes	671d9af456	metrics: Improve variable definition in memory inside containers script This PR improves the variable definition in memory inside the container script for metrics. This change declares and assigns the variables separately to avoid masking return values. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 16:56:12 +00:00
Gabriela Cervantes	eeb467bdc2	gha: Do not fail when collecting artifacts This PR will avoid the failures when collecting artifacts for the gha. This will ensure that we collect and archive system's data for the purpose of debugging. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-18 16:05:23 +00:00
Zvonko Kaiser	b1909e940e	deploy: Add busybox target For a minimal initrd/image build we may want to leverage busybox. This is part number two of the NVIDIA initrd/image build Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-18 15:31:00 +00:00
Wainer Moschetta	36093e86e0	Merge pull request #9863 from wainersm/kata-deploy_yq kata-deploy: always copy ci/install_yq.sh	2024-06-18 10:05:41 -03:00
Fabiano Fidêncio	587f4d45de	ci: tdx: Disable TDX CI TDX CI has been having some issues with the Nydus snapshotter cleanup, which has been getting stuck for hours every now and then. With this in mind, let's disable the TDX CI, so we avoid it blocking the progress of Kata Containers project, and we re-enable it as soon as we have it solved on Intel's side. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-18 10:30:40 +02:00
markyangcc	a28bf266f9	runtime: fix missing of VhostUserDeviceReconnect parameter assignment Commit 'ca02c9f5124e' implements the vhost-user-blk reconnection functionality. However, it missed assigning VhostUserDeviceReconnect when creating the new QEMU HypervisorConfig, resulting in VhostUserDeviceReconnect always being set to the default value 0. The real change is this line; most of the other changes are caused by go format: return vc.HypervisorConfig{ // ... VhostUserDeviceReconnect: h.VhostUserDeviceReconnect, }, nil Fixes: #9848 Signed-off-by: markyangcc <mmdou3@163.com>	2024-06-18 12:15:10 +08:00
Alex Lyn	388cd7dde4	Merge pull request #9772 from pmores/add-base-qmp-framework runtime-rs: add base qmp framework	2024-06-18 09:53:28 +08:00
Alex Lyn	275c498dc9	Merge pull request #9834 from lifupan/main sandbox: fix the issue of failed to get the vmm master tid	2024-06-18 08:57:21 +08:00
Alex Lyn	d3fb6bfd35	Merge pull request #9860 from stevenhorsman/tokio-vulnerability-bump Tokio vulnerability bump	2024-06-18 08:35:34 +08:00
Wainer dos Santos Moschetta	bdbee78517	runtime: allow default_{vcpus,memory} annotations to qemu-coco-dev This is a counterpart of commit `abf52420a4` for the qemu-coco-dev configuration. By allowing default_vcpu and default_memory annotations users can fine-tune the VM based on the size of the container image to avoid issues related with pulling large images in the guest. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 18:59:52 -03:00
Wainer dos Santos Moschetta	baa8d9d99c	runtime: set shared_fs=none to qemu-coco-dev configuration Just like the TEE configurations (sev, snp, tdx) we want to have the qemu-coco-dev using shared_fs=none. Fixes: #9676 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 18:42:46 -03:00
Wainer Moschetta	b8d7a8c546	Merge pull request #9862 from BbolroC/improve-kubectl-retry tests: Use selector rather than pod name for kubectl logs/describe	2024-06-17 18:33:24 -03:00
Hyounggyu Choi	6b065f5609	tests: Use selector rather than pod name for kubectl logs/describe The following error was observed during the deployment of nydus snapshotter: ``` Error from server (NotFound): the server could not find the requested resource ( pods/log nydus-snapshotter-5v82v) 'kubectl logs nydus-snapshotter-5v82v -n nydus-system' failed after 3 tries Error: Process completed with exit code 1. ``` This error can occur when a pod is re-created by a daemonset during the retry interval. This commit addresses the issue by using `--selector` rather than the pod name for `kubectl logs/describe`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-17 22:27:50 +02:00
Wainer Moschetta	7df221a8f9	Merge pull request #9833 from wainersm/qemu-rs_tests tests/k8s: run for qemu-runtime-rs on AKS	2024-06-17 16:59:46 -03:00
Zvonko Kaiser	5f11c0f144	Merge pull request #9861 from zvonkok/release-3.6.0 release: Bump VERSIONS file to 3.6.0	2024-06-17 20:35:29 +02:00
Wainer Moschetta	b6a28bd932	Merge pull request #9786 from microsoft/saulparedes/add_back_insecure_registry_pull genpolicy: add back support for insecure	2024-06-17 15:21:25 -03:00
Wainer Moschetta	68415dabcd	Merge pull request #9815 from msanft/fix/genpolicy/flag-name genpolicy: fix settings path flag name	2024-06-17 15:13:25 -03:00
Wainer dos Santos Moschetta	08eaa60b59	CI: disable all run-kata-deploy-tests-on-garm jobs The following jobs have failed more than 50% on nightly CI. run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, k0s) run-kata-deploy-tests-on-garm / run-kata-deploy-tests (clh, rke2) run-kata-deploy-tests-on-garm / run-kata-deploy-tests (qemu, k0s) Instead of removing only those jobs, let's skip the kata-deploy-tests on GARM completely so we can try to fix all the issues (or maybe drop the jobs altogether). Issue: #9854 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 14:39:38 -03:00
Steve Horsman	4a41cee534	Merge pull request #9838 from zvonkok/gha-no-sudo CI: remove sudo from GHA	2024-06-17 16:23:39 +01:00
Wainer dos Santos Moschetta	e517167825	kata-deploy: always copy ci/install_yq.sh To build the build-kata-deploy image, ci/install_yq.sh should be copied to tools/packaging/kata-deploy/local-build/dockerbuild, as this script will install yq within the image. Currently, if tools/packaging/kata-deploy/local-build/dockerbuild/install_yq.sh exists then make won't copy it again. This can raise problems as, for example, the current update of yq version (commit `c99ba42d`) in ci/install_yq.sh won't force the rebuild of the build-kata-deploy image. Note: this isn't a problem on a fresh dev or CI environment. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-17 12:18:22 -03:00
Zvonko Kaiser	618121a654	release: Bump VERSIONS file to 3.6.0 Let's bump the VERSIONS file and start preparing for a new release of the project. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-17 12:06:46 +00:00
stevenhorsman	53659f1ede	libs: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	35f6be97df	runtime-rs: Update tokio dependency - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 If possible it would be good to add the many runtime-rs crates into the runtime-rs workspace and provide a centralised version to avoid the updates in many places. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	3bb1a67d80	agent-ctl: Update rustjail dependencies - Run `cargo update -p rustjail` to pick up rustjail's bump of tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	d2d35d2dcc	runk: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	adda401a8c	genpolicy: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:03:01 +01:00
stevenhorsman	b7928f465e	agent: Update tokio dependencies - Bump tokio to 1.38.0 to fix the security vulnerability https://rustsec.org/advisories/RUSTSEC-2024-0019 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-17 13:02:47 +01:00
Zvonko Kaiser	5c2f3f34a8	CI: remove sudo from GHA Now that all artifacts are owned by $USER we can start to remove sudo from our GHA Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-17 11:06:56 +00:00
Steve Horsman	cce735a09e	Merge pull request #9840 from stevenhorsman/bump-agent-rust-1.75.0 versions: Bump rust toolchain	2024-06-17 11:28:07 +01:00
Fupan Li	b218c4bc10	Merge pull request #9836 from lifupan/main_fix sandbox: fix the issue of double initial_size_manager config	2024-06-17 09:15:51 +08:00
Fabiano Fidêncio	9b5dd854db	Merge pull request #9726 from GabyCT/topic/unodeport tests: kbs: Use nodeport deployment from upstream trustee	2024-06-16 22:31:27 +02:00
Wainer dos Santos Moschetta	d4f664b73b	CI: disable run-kata-monitor-tests / run-monitor (containerd, lts) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: #9853 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:27:04 -03:00
Wainer dos Santos Moschetta	cbf0b7ca7b	CI: disable run-basic-amd64-tests / run-nerdctl-tests (clh) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: #9852 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:17:26 -03:00
Wainer dos Santos Moschetta	562820449e	CI: disable run-basic-amd64-tests / run-vfio (qemu) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. The clh variation was disabled on commit `5f5274e699` so this change will actually result in all the VFIO jobs being disabled. Instead of deleting the entire entry from this workflow yaml (or commenting out the entry), I preferred to use `if: false` which will make the jobs appear on the UI as skipped. Issue: 9851 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-14 16:09:59 -03:00
GabyCT	4800e242a4	Merge pull request #9832 from GabyCT/topic/fixsets tests: setup: Improve setup script for kubernetes tests	2024-06-14 11:14:05 -06:00
Bo Chen	a68aeca356	Merge pull request #9575 from likebreath/0430/clh_v39.0 versions: Upgrade to Cloud Hypervisor v39.0	2024-06-14 09:10:19 -07:00
stevenhorsman	e23b929ba0	versions: Bump rust toolchain - Bump the rust version used to build the agent to 1.75.0 as agreed on in the AC meeting Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
stevenhorsman	3fb176970f	dragonball: Fix device manager warning - Fix the lint error: ``` error: you seem to use `.enumerate()` and immediately discard the index --> src/device_manager/mod.rs:427:33 \| 427 \| for (_index, device) in self.virtio_devices.iter().enumerate() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` by removing the unnecessary enumerate Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
stevenhorsman	1ea2671f2f	dragonball: Fix lint with rust 1.75.0 The ci failed with: ``` error: use of `or_insert_with` to construct default value --> src/address_space_manager.rs:650:14 \| 650 \| .or_insert_with(NumaNode::new); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try: `or_default()` \| ``` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-14 16:45:16 +01:00
Steve Horsman	ab8a9882c1	Merge pull request #9818 from EmmEff/fix-spelling runtime: fix minor spelling issues	2024-06-14 13:12:56 +01:00
Steve Horsman	99bf95f773	Merge pull request #9827 from littlejawa/fix_panic_on_metrics_gathering runtime: avoid panic on metrics gathering	2024-06-14 11:12:43 +01:00
Steve Horsman	3eba4211f3	Merge pull request #9843 from microsoft/danmihai1/install_yq ci: fix the expected yq version string	2024-06-14 10:26:21 +01:00
Pavel Mores	380f8ad03f	runtime-rs: add base vCPU hotplugging support We take advantage of the Inner pattern to enable QemuInner::resize_vcpu() take `&mut self` which we need to call non-const functions on Qmp. This runs on Intel architecture but will need to be verified and ported (if necessary) to other architectures in the future. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Pavel Mores	8231c6c4a3	runtime-rs: instantiate Qmp as (optional) member of QemuInner The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make it accessible from QemuInner. This is fine for now however if the constant needs to be accessed from additional places in the future we could consider moving it to somewhere more visible. The Debug impl for Qmp is empty since first, we don't actually want it, it's only forced by Hypervisor trait bounds, and second, it doesn't have anything to display anyway. If Qmp gets any members in the future that can be meaningfully displayed they should be handled by Qmp's Debug::fmt(). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Pavel Mores	6fdb262dca	runtime-rs: add Qmp object to encapsulate QMP functionality The constructor handles QMP connection initialisation, too, so there can be non-functional Qmp instance. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-06-14 10:13:32 +02:00
Manuel Huber	62fd84dfd8	build: allow rootfs builds w/o git or VERSION file deps We set the VERSION variable consistently across Makefiles to 'unknown' if the file is empty or not present. We also use git commands consistently for calculating the COMMIT, COMMIT_NO variables, not erroring out when building outside of a git repository. In create_summary_file we also account for a missing/empty VERSION file. This makes e.g. the UVM build process in an environment where we build outside of git with a minimal/reduced set of files smoother. Signed-off-by: Manuel Huber <mahuber@microsoft.com>	2024-06-13 22:46:52 +00:00
Dan Mihai	824287d64a	Merge pull request #9844 from microsoft/danmihai1/k8s-policy-pvc tests: fix yq command line in k8s-policy-pvc	2024-06-13 15:07:15 -07:00
Wainer dos Santos Moschetta	73ab5942fb	tests/k8s: run for qemu-runtime-rs on AKS The following tests are disabled because they fail (alike with dragonball): - k8s-cpu-ns.bats - k8s-number-cpus.bats - k8s-sandbox-vcpus-allocation.bats Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-13 16:20:59 -03:00
Mike Frisch	c2f61b0fe3	runtime: spelling fixes Minor spelling fixes in runtime log messages. Signed-off-by: Mike Frisch <mikef17@gmail.com>	2024-06-13 12:11:34 -04:00
Dan Mihai	56f9e23710	tests: fix yq command line in k8s-policy-pvc Fix the collision between: - https://github.com/kata-containers/kata-containers/pull/9377 - https://github.com/kata-containers/kata-containers/pull/9706 One enabled a newer yq command line format and the other used the older format. Both passed CI because they were not tested together. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-13 16:06:15 +00:00
Dan Mihai	23e99e264c	ci: fix the expected yq version string I get: ~/gopath/bin/yq --version yq (https://github.com/mikefarah/yq/) version v4.40.7 Also add support for set -o xtrace to install_yq.sh. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-13 15:52:26 +00:00
Ryan Savino	0430794952	qemu: upgrade to 8.2.4 There is a known issue in qemu 7.2.0 that causes kernel-hashes to fail the verification of the launch binaries for the SEV legacy use case. Upgraded to qemu 8.2.4. new available features disabled. Fixes: #9148 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-13 10:19:42 -05:00
Greg Kurz	b85b1c1058	Merge pull request #9790 from gkurz/kill-some-dead-runtime-code Kill some dead runtime code	2024-06-13 15:45:51 +02:00
gaohuatao	4cb4e44234	runtime-rs: fix the bug of func count_files When the total number of files observed is greater than limit, return -1 directly. runtime has fixed this bug, it should be ported to runtime-rs. Fixes:#9829 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2024-06-13 16:02:33 +08:00
Fupan Li	cd68ef372f	sandbox: fix the issue of double initial_size_manager config It shouldn't call the initial_size_manager's setup_config in the load_config since it had been called in the sandbox's try_init function. Fixes: #9778 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-06-13 15:44:51 +08:00
Fupan Li	61687992f4	sandbox: fix the issue of failed to get the vmm master tid For kata container, the container's pid is meaningless to containerd/crio since the container's pid belongs to the VM, and containerd/crio couldn't use it. Thus we just return any tid of kata shim or hypervisor. But since the hypervisor had been stopped before deleting the container, and it wouldn't get the hypervisor's tid for some supported hypervisor, we'd better return the kata shim's pid instead of hypervisor's tid. Fixes: #9777 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-06-13 10:27:04 +08:00
Fabiano Fidêncio	56423cbbfe	Merge pull request #9706 from burgerdev/burgerdev/genpolicy-devices genpolicy: add support for devices	2024-06-12 23:03:41 +02:00
Wainer Moschetta	d971e5ae68	Merge pull request #9537 from wainersm/kata-deploy-crio kata-deploy: configuring CRI-O for guest-pull image pulling	2024-06-12 17:27:00 -03:00
Gabriela Cervantes	c36c300fd6	tests: kbs: Use nodeport deployment from upstream trustee This PR uses the nodeport deployment from upstream trustee. To ensure our deployment is as close to upstream trustee replace the custom nodeport handling and replace it with nodeport kustomized flavour from the trustee project. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-12 20:01:59 +00:00
Gabriela Cervantes	0066aebd84	tests: setup: Improve setup script for kubernetes tests This PR makes general improvements like definition of variables and the use of them to improve the general setup script for kubernetes tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-12 19:39:54 +00:00
GabyCT	461b6e7c93	Merge pull request #9821 from GabyCT/topic/fixts metrics: Use function definition to have uniformity	2024-06-12 10:04:28 -06:00
Fabiano Fidêncio	3a0247ed43	Merge pull request #9819 from stevenhorsman/config-envvar-precedence agent: config: Ensure envs take precedence	2024-06-12 11:26:02 +02:00
Julien Ropé	9c86eb1d35	runtime: avoid panic on metrics gathering While running with a remote hypervisor, whenever kata-monitor tries to access metrics from the shim, the shim does a "panic" and no metric can be gathered. The function GetVirtioFsPid() is called on metrics gathering, and had a call to "panic()". Since there is no virtiofs process for remote hypervisor, the right implementation is to return nil. The caller expects that, and will skip metrics gathering for virtiofs. Fixes: #9826 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-06-12 10:02:44 +02:00
Xuewei Niu	92cc5e0adb	Merge pull request #9781 from gaohuatao-1/ght/shm	2024-06-12 12:39:28 +08:00
Moritz Sanft	84903c898c	genpolicy: fix settings path flag name This corrects the warning to point to the `-j` flag, which is the correct flag for the JSON settings file. Previously, the warning was confusing, as it pointed to the `-p` flag, which specifies the path for the Rego ruleset. Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>	2024-06-11 21:17:18 +02:00
Greg Kurz	1acf8d0c35	govmm: Drop QEMU's `NoShutdown` knob Code is not used. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Greg Kurz	cb5b548ad7	govmm: Drop QEMU's `Daemonize` knob Code isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Greg Kurz	33eaf69d5f	virtcontainers: Drop QEMU's `Daemonize` knob QEMU isn't started as daemon anymore and this won't change (see #5736 for details). Drop the related code. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-06-11 19:55:54 +02:00
Wainer Moschetta	f66a5b6287	Merge pull request #9807 from wainersm/qemu-rs_kata-deploy kata-deploy: add qemu-runtime-rs runtimeClass	2024-06-11 14:50:01 -03:00
Dan Mihai	d47f40210a	Merge pull request #9808 from microsoft/saulparedes/oci_from_settings genpolicy: load OCI version from settings	2024-06-11 10:42:04 -07:00
Gabriela Cervantes	a96ff49060	metrics: Use function definition to have uniformity This PR uses the function definition to have uniformity across all the launch times script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-11 17:36:08 +00:00
Saul Paredes	3e9d6c11a1	genpolicy: add back support for insecure registries Adding back changes from `77540503f9`. Fixes: #9008 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-11 09:42:23 -07:00
Bo Chen	2398442c58	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v39.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8694, #9574 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-11 09:42:17 -07:00
Bo Chen	7a82894502	versions: Upgrade to Cloud Hypervisor v39.0 This patch upgrades Cloud Hypervisor to v39.0 from v36.0, which contains fixes of several security advisories from dependencies. Details can be found from #9574. Fixes: #8694, #9574 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-06-11 09:42:16 -07:00
Wainer dos Santos Moschetta	be9990144a	workflow: run kata-deploy tests to qemu-runtime-rs on AKS Start testing the ability of kata-deploy to install and configure the qemu-runtime-rs runtimeClass. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-11 12:58:47 -03:00
Wainer dos Santos Moschetta	4f398cc969	kata-deploy: add qemu-runtime-rs runtimeClass Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass which ties to qemu hypervisor implementation in rust for the runtime-rs. Fixes: #9804 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-11 12:58:47 -03:00
stevenhorsman	40e02b34cb	agent: config: Ensure envs take precedence - Update the config parsing logic so that when reading from the agent-config.toml file any envs are still processed - Add unit tests to formalise that the envs take precedence over values from the command line and the config file Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-06-11 16:31:10 +01:00
Steve Horsman	59ff40f054	Merge pull request #9811 from mkulke/mkulke/use-kebabcase-for-enum-values-in-config-file-parsing agent: convert enum vals to kebab-case in cfg file	2024-06-11 14:49:30 +01:00
gaohuatao	638e9acf89	runtime: fix the bug of func countFiles When the total number of files observed is greater than limit, return (-1, err). When the returned err is not nil, the func countFiles should return -1. Fixes:#9780 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2024-06-11 18:17:18 +08:00
Alex Lyn	1c8db85d54	Merge pull request #9784 from Apokleos/bufix-testcases kata-types: fix bug in kata-types several test cases	2024-06-11 10:01:45 +08:00
Saul Paredes	6a84562c16	genpolicy: load OCI version from settings Load OCI version from genpolicy-settings.json and validate it in rules.rego Fixes: #9593 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-06-10 15:30:39 -07:00
GabyCT	0c5849b68b	Merge pull request #9809 from microsoft/danmihai1/yq-breaking-change tests: k8s: use newer yq command line format	2024-06-10 16:29:59 -06:00
Wainer Moschetta	ade69e44f9	Merge pull request #9785 from BbolroC/kubectl-retry CI: Introduce retry mechanism for kubectl in gha-run.sh	2024-06-10 18:33:34 -03:00
Magnus Kulke	abc704a720	agent: convert enum vals to kebab-case in cfg file fixes #9810 Add an annotation to the enum values in the agent config that will deserialize them using a kebab-case conversion, aligning the behaviour to parsing of params specified via kernel cmdline. drive-by fix: add config override for guest_component_procs variable Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-06-10 21:55:05 +02:00
Dan Mihai	32198620a9	tests: k8s: use newer yq command line format Fix the recent collision between: - https://github.com/kata-containers/kata-containers/pull/9377 - https://github.com/kata-containers/kata-containers/pull/9725 One enabled a newer yq command line format and the other used the older format. Both passed CI because they were not tested together. Fixes: #9789 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-06-10 18:48:25 +00:00
Dan Mihai	079a0a017c	Merge pull request #9557 from portersrc/ci-debug-output-nydus-pod CI: describe pod on k8s-create-pod wait failure	2024-06-10 08:17:54 -07:00
Ryan Savino	84280115f6	Merge pull request #9151 from niteeshkd/nd_snp_kernel_hashes runtime: enable kernel-hashes for SNP confidential container	2024-06-07 18:19:51 -05:00
GabyCT	03bcc167a4	Merge pull request #9779 from GabyCT/topic/fixcoscript tests: Fix indentation in common script	2024-06-07 15:37:10 -06:00
Wainer Moschetta	7a28535277	Merge pull request #9800 from fidencio/topic/ci-tdx-re-enable-some-of-the-tests ci: tdx: Re-enable a bunch of volume related tests	2024-06-07 16:17:19 -03:00
Hyounggyu Choi	8ff128dda8	CI: Introduce retry mechanism for kubectl in gha-run.sh Frequent errors have been observed during k8s e2e tests: - The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port? - Error from server (ServiceUnavailable): the server is currently unable to handle the request - Error from server (NotFound): the server could not find the requested resource These errors can be resolved by retrying the kubectl command. This commit introduces a wrapper function in common.sh that runs kubectl up to 3 times with a 5-second interval. Initially, this change only covers gha-run.sh for Kubernetes. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-07 18:24:19 +02:00
Fabiano Fidêncio	81c221c1b4	ci: k8s: tdx: Re-enable volume tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:13:36 +02:00
Fabiano Fidêncio	9db9d35198	ci: k8s: tdx: Re-enable projected-volume tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:12:36 +02:00
Fabiano Fidêncio	f6a6cba8ca	ci: k8s: tdx: Re-enable nested-configmap-secret tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:12:06 +02:00
Fabiano Fidêncio	957d0cccf6	ci: k8s: tdx: Re-enable inotify tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:10:39 +02:00
Fabiano Fidêncio	fc6f662ae0	ci: k8s: tdx: Re-enable credentials-secrets tests It seems I was very loose on disabling some of the tests, and the issues I faced could be related to other instabilities in the CI. Let's re-enable this one, following what was done for the SEV, SNP, and coco-qemu-dev. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 18:08:29 +02:00
Fabiano Fidêncio	5741c6d3e6	Merge pull request #9768 from fidencio/topic/ci-tdx-enable-cdh-test ci: kbs: Enable CDH tests for TDX	2024-06-07 17:59:12 +02:00
Greg Kurz	afeb98d73f	Merge pull request #9782 from ldoktor/ci-centos-9 ci.ocp: Switch base to centos-9	2024-06-07 13:15:02 +02:00
Fabiano Fidêncio	fde457589e	ci: kbs: tdx: Enable basic attestation tests Let's stop skipping the CDH tests for TDX, as now we should have an environment where it can run and should pass. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 12:18:50 +02:00
Fabiano Fidêncio	cac525059e	ci: kbs: tdx: Use the hostname ip instead of localhost for the PCCS We must ensure we use the host ip to connect to the PCCS running on the host side, instead of using localhost (which has a different meaning from inside the KBS pod). The reason we're using `hostname -i` instead of the helper functions is that the helper functions need the coco-kbs deployed for them to work, and what we do is before the deployment. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-06-07 12:18:07 +02:00
Alex Lyn	27685c91e5	kata-types: fix bug in kata-types several test cases (1) As mis-use of cap.set causing previous Caps lost which causing assert! failed, just replacing cap.set with cap.add. (2) It will return error if there's no such name setting when do update_config_by_annotation { ... if config.runtime.name.is_empty() { return Err(io::Error::new( io::ErrorKind::InvalidData, "Runtime name is missing in the configuration", )); } ... } Fixes #9783 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-06-07 09:16:23 +08:00
David Esparza	822c641b58	Merge pull request #9760 from amshinde/kata-manager-link-runc kata-manager: Add symlinks for runc and slirp4netns	2024-06-06 12:55:57 -06:00
Lukáš Doktor	699376c535	ci.ocp: Switch base to centos-9 Centos8 is EOL and repos are not available anymore. Centos9 contains the same packages and should do well as a base for testing. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-06-06 09:03:17 +02:00
Chris Porter	4172ccb3a0	CI: describe pod on k8s-create-pod wait failure This is generally useful debug output on test failures, and specifically this has been useful for nydus-related issues recently. Signed-off-by: Chris Porter <porter@ibm.com>	2024-06-05 12:37:53 -04:00
Gabriela Cervantes	264c7e9473	tests: Fix indentation in common script This PR fixes the indentation in common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-05 15:52:40 +00:00
Niteesh Dubey	1dbf5208ac	versions: Upgrade ovmf This is required to support SEV-SNP confidential container with kernel-hashes. Since this ovmf is the latest stable version, it is good to upgrade for tdx and vanilla builds too. Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-06-05 15:02:02 +00:00
Niteesh Dubey	62d3d7c58f	runtime: enable kernel-hashes for SNP confidential container This is required to provide the hashes of kernel, initrd and cmdline needed during the attestation of the coco. Fixes: #9150 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-06-05 15:02:02 +00:00
Steve Horsman	b30d085271	Merge pull request #9702 from ildikov/blog-submission-guide docs: Adding blog submission guidelines	2024-06-05 09:03:19 +01:00
Amulya Meka	b323afeda9	Merge pull request #9214 from Amulyam24/oras kata-deploy: install oras using release artefacts on ppc64le	2024-06-05 11:40:55 +05:30
Fabiano Fidêncio	138ef2c55f	Merge pull request #9678 from AdithyaKrishnan/main TEEs: Skip a few CI tests for SEV/SNP	2024-06-04 23:42:51 +02:00
GabyCT	ba30f0804a	Merge pull request #9770 from GabyCT/topic/fixvad tests: Use variable definition for better uniformity	2024-06-04 15:23:34 -06:00
Wainer dos Santos Moschetta	af4f9afb71	kata-deploy: add PULL_TYPE handler for CRI-O A new PULL_TYPE environment variable is recognized by the kata-deploy's install script to allow it to configure CRIO-O for guest-pull image pulling type. The tests/integration/kubernetes/gha-run.sh change allows for testing it: ``` export PULL_TYPE=guest-pull cd tests/integration/kubernetes ./gha-run.sh deploy-k8s ``` Fixes #9474 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-06-04 14:02:01 -03:00
GabyCT	6c2e8bed77	Merge pull request #9725 from 3u13r/feat/genpolicy/filter-by-runtime genpolicy: add ability to filter for runtimeClassName	2024-06-04 10:06:14 -06:00
Hyounggyu Choi	869f89c338	Merge pull request #9773 from BbolroC/use-qemu-coco-dev-s390x GHA: Use qemu-coco-dev for k8s nydus test on s390x	2024-06-04 17:49:38 +02:00
Gabriela Cervantes	cafba23f3e	tests: Use variable definition for better uniformity This PR replaces the name to use a variable that is already defined to have a better uniformity across the general script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-06-04 15:49:27 +00:00
Wainer Moschetta	2b8cdd9ff2	Merge pull request #9765 from wainersm/disable_failing_jobs CI: disable jobs that failed > 50% on nightly CI recently - part 1	2024-06-04 12:05:36 -03:00
Hyounggyu Choi	246ee83768	GHA: Use qemu-coco-dev for k8s nydus test on s390x In line with the changes for x86_64, the k8s nydus test for s390x should also use `qemu-coco-dev` for `KATA_HYPERVISOR`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-04 15:49:23 +02:00
Hyounggyu Choi	3aff6c5bd8	CI: Retry fetching node_start_time when it is empty It was observed that the `node_start_time` value is sometimes empty, leading to a test failure. This commit retries fetching the value up to 3 times. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-06-04 15:41:15 +02:00
Zvonko Kaiser	647560539f	Merge pull request #9769 from zvonkok/initrd-image-no-sudo ci: remove sudo and make sure artifacts is owned by user	2024-06-04 07:16:51 +02:00
Wainer Moschetta	b5561074c3	Merge pull request #9377 from beraldoleal/yqbump deps: bumping yq to v4.40.7	2024-06-03 14:34:58 -03:00
Ildiko Vancsa	5e03bec26b	docs: Adding blog submission guidelines The Kata blog was recently moved to the project's website. The content of the blog is stored together with the rest of the website source on GitHub. This patch adds a short guide that describes how to submit a new blog post as a PR, to appear on the project's website. Signed-off-by: Ildiko Vancsa <ildiko.vancsa@gmail.com>	2024-06-03 08:58:05 -07:00
GabyCT	6c7affbd85	Merge pull request #9741 from GabyCT/topic/staticcheck tests: Fix indentation in static checks script	2024-06-03 09:43:23 -06:00
Zvonko Kaiser	a48c084e13	ci: remove sudo and make sure image is owned by user The image build needs special handling since we're doing a lot of privileged operations. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-06-03 15:29:06 +00:00
Fabiano Fidêncio	34d45f0868	Merge pull request #9749 from mkulke/mkulke/configure-guest-components-spawning CoCo: introduce config for guest-components procs	2024-06-03 15:50:36 +02:00
Ryan Savino	72dc823059	tests: k8s: sev: snp: skip "setting sysctl" test This test fails when using `shared_fs=none` with the nydus snapshotter. Issue tracked here: #9666 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:17 -05:00
Ryan Savino	3f3be54893	tests: k8s: sev: snp: skip initContainers shared vol test This test is failing due to the initContainers not being properly handled with the guest image pulling. Issue tracked here: #9668 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:17 -05:00
Ryan Savino	35dfb730ce	tests: k8s: sev: snp: skip "kill all processes in container" test This test fails when using `shared_fs=none` with the nydus snapshotter. Issue tracked here: #9664 Skipping for now. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	62cc1dec4c	tests: replace docker debug alpine image with ghcr docker alpine latest image is rate limited. Need to use ghcr.io image. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
ChengyuZhu6	1820b02993	tests: replace busybox from docker with quay in guest pull To prevent download failures caused by high traffic to the Docker image, opt for quay.io/prometheus/busybox:latest over docker.io/library/busybox:latest . Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	6c646dc96d	tests: k8s: sev: snp: add runtime annotation for sev and snp sev and snp cases added to the KATA_HYPERVISOR switch. Signed-off-by: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	6db08ed620	runtime: sev: snp: Use shared_fs=none Disabling 9p for SEV and SNP TEEs. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:16 -05:00
Ryan Savino	668959408d	tests: ensure kata_deploy cleanup even if namespace deletion fails the test cluster namespace deletion failing causes kata_deploy to not get cleaned up. Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-06-03 01:14:15 -05:00
Wainer dos Santos Moschetta	c9f93fc507	github: add actionlint configuration file Added configuration file with rules to exclude some self-hosted runners from the linter warnings. Related-with: #9646 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:46:09 -03:00
Wainer dos Santos Moschetta	5f5274e699	CI: disable run-basic-amd64-tests / run-vfio (clh) job The job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: 9764 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:34:45 -03:00
Wainer dos Santos Moschetta	9154ce9051	CI: disable run-basic-amd64-tests / run-tracing jobs These jobs have failed more than 50% on nightly CI. Remove them from the list of execution until we have a fix. Issue: 9763 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:26:58 -03:00
Wainer dos Santos Moschetta	ac4d48ad17	CI: disable run-kata-monitor-tests / run-monitor (qemu, containerd) job This job has failed more than 50% on nightly CI. Remove it from the list of execution until we have a fix. Issue: 9761 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-31 19:21:21 -03:00
Archana Shinde	7a3e13fae8	kata-manager: Add symlinks for runc and slirp4netns For nerdctl install, add symlinks for runc and slirp4netns in the binary install path. runc link comes in handy for running runc containers with nerdctl for quick tests. slirp4netns allows for running containers with user mode networking useful in case of rootless containers. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-31 13:53:42 -07:00
Markus Rudy	13310587ed	genpolicy: check requested devices CreateContainerRequest objects can specify devices to be created inside the guest VM. This change ensures that requested devices have a corresponding entry in the PodSpec. Devices that are added to the pod dynamically, for example via the Device Plugin architecture, can be allowlisted globally by adding their definition to the settings file. Fixes: #9651 Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-05-31 22:05:49 +02:00
Wainer Moschetta	f093c4c190	Merge pull request #9754 from wainersm/qemu_coco_dev-enable_policy_tests tests/k8s: enable policy tests for qemu-coco-dev	2024-05-31 15:09:25 -03:00
Markus Rudy	ea578f0a80	genpolicy: add support for VolumeDevices This adds structs and fields required to parse PodSpecs with VolumeDevices and PVCs with non-default VolumeModes. Signed-off-by: Markus Rudy <mr@edgeless.systems>	2024-05-31 19:34:14 +02:00
Beraldo Leal	d3a5eb299a	tools: bumping kernel config version Let's make ci happy. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	53b8158a81	tests: adding debug and skip to kata-deploy If a test is failing during setup, it makes little sense to run the suite. Let's skip and add some debug messages. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	9171821d57	tests: add debug message to check return code Let's add this message to make sure sh is starting properly. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	f91fbef184	tests: increase time after sh execution Increased sleep duration to ensure the shell process starts. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	ba5d2e54c2	tests: remove object separation mark from eof End of file should not end with --- mark. This will confuse tools like yq and kubectl that might think this is another object. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	3e8b4806b8	tests: increase debug messages for kata-deploy When the timeout happens we can't tell much information about the nodes. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	c99ba42d62	deps: bumping yq to v4.40.7 Since yq frequently updates, let's upgrade to a version from February to bypass potential issues with versions 4.41-4.43 for now. We can always upgrade to the newest version if necessary. Fixes #9354 Depends-on:github.com/kata-containers/tests#5818 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Beraldo Leal	4f6732595d	ci: skip go version check golang.mk is not ready to deal with non GOPATH installs. This is breaking test on s390x. Since previous steps here are installing go and yq our way, we could skip this additional check. A full refactor to golang.mk would be needed to work with different paths. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-05-31 13:28:34 -04:00
Greg Kurz	7886ed6670	Merge pull request #9751 from wainersm/k8s_print_logs_on_fail tests/k8s: print logs on fail only (k8s-confidential-attestation.bats)	2024-05-31 14:47:27 +02:00
Fabiano Fidêncio	44df674232	Merge pull request #9757 from fidencio/topic/ci-tdx-skip-empty-dir-tests ci: k8s: Skip empty dir tests also for TDX	2024-05-31 13:18:35 +02:00
Magnus Kulke	9f04dc4c8b	agent: introduce config for coco attestation procs fixes #9748 A configuration option `guest_component_procs` has been introduced that indicates which guest component processes are supposed to be spawned by the agent. The default behaviour remains that all of those processes are actively spawned by the agent. At the moment this is based on presence of binaries in the rootfs and the guest_component_api_rest option. The new option is incremental: none -> attestation-agent -> confidential-data-hub -> api-server-rest e.g. api-server-rest implies attestation-agent and confidential-data-hub the `none` option has been removed from guest_component_api_rest, since this is addressed by the introduced option. To not change expected behaviour for non-coco guests we will still only attempt to spawn the processes if the requested attestation binaries are present on the rootfs, and issue a warning in those cases. Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>	2024-05-31 12:15:41 +02:00
Amulyam24	eadcb868f4	kata-deploy: install oras using release artefacts on ppc64le We are currently building Oras from source on ppc64le. Now that they officially release the artefacts for power, consume them to install Oras. Fixes: #9213 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-05-31 14:16:14 +05:30
Zvonko Kaiser	0321a3adcc	Merge pull request #8944 from zvonkok/update-threat-model threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions	2024-05-31 10:38:27 +02:00
Fabiano Fidêncio	03a7cf4b02	ci: k8s: Skip empty dir tests also for TDX Wainer noticed this is failing for the coco-qemu-dev case, and decided to skip it, notifying me that he didn't fully understand why it was not failing on TDX. Turns out, though, this is also failing on TDX, and we need to skip it there as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-31 09:59:46 +02:00
Fabiano Fidêncio	72a71ff2bf	Merge pull request #9737 from zvonkok/kata-deploy-no-sudo ci: kata-deploy no sudo	2024-05-31 09:55:24 +02:00
Zvonko Kaiser	dd89d35b75	Merge pull request #9747 from zvonkok/remove-git-config ci: Remove all git config safe.directory	2024-05-31 07:25:28 +02:00
Leonard Cohnen	1d1690e2a4	genpolicy: add ability to filter for runtimeClassName Add the CLI flag --runtime-class-names, which is used during policy generation. For resources that can define a runtimeClassName (e.g., Pods, Deployments, ReplicaSets,...) the value must have any of the --runtime-class-names as prefix, otherwise the resource is ignored. This allows to run genpolicy on larger yaml files defining many different resources and only generating a policy for resources which will be deployed in a confidential context. Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-05-31 03:17:02 +02:00
Wainer dos Santos Moschetta	3333f8ddfd	tests/k8s: enable policy tests for qemu-coco-dev So qemu-coco-dev is on par with the TEE configurations. Fixes: #9753 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 21:51:15 -03:00
Wainer Moschetta	83fa813700	Merge pull request #9694 from wainersm/qemu_coco_dev-k8s-guest-pull tests: enable guest-pull on all k8s tests for the qemu-coco-dev configuration	2024-05-30 21:48:11 -03:00
Wainer dos Santos Moschetta	55ae98eb28	tests/k8s: print logs on fail only (k8s-confidential-attestation.bats) Use the variable BATS_TEST_COMPLETED which is defined by the bats framework when the test finishes. `BATS_TEST_COMPLETED=` (empty) means the test failed, so the node syslogs will be printed only at that condition. Fixes: #9750 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 17:19:33 -03:00
Wainer Moschetta	66e3b88694	Merge pull request #9746 from wainersm/nydus_snapshotter_pin ci: pin the nydus-snapshotter image version	2024-05-30 16:49:10 -03:00
Wainer dos Santos Moschetta	3e18fe7805	tests/k8s: skip file volume tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9667 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 14:50:59 -03:00
Zvonko Kaiser	063db516f2	ci: Remove all git config safe.directory Now with the sudo less build we should be good to remove those hacks. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 15:12:28 +00:00
Zvonko Kaiser	d8889684f0	ci: kata-deploy no sudo Build/push/manage artifacts without sudo Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 15:07:27 +00:00
Wainer dos Santos Moschetta	5faf9ca344	ci: pin the nydus-snapshotter image version It's cloning the nydus-snapshotter repo from the version specified in versions.yaml, however, the deployment files are set to pull in the latest version of the snapshotter image. With this version we are pinning the image version too. This is a temporary fix as it should be better worked out at nydus-snapshotter project side. Fixes: #9742 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-30 11:21:16 -03:00
Greg Kurz	b3cb19b6a7	Merge pull request #9639 from emanuellima1/rng-impl runtime-rs: Add RNG to QEMU cmdline	2024-05-30 12:00:11 +02:00
Zvonko Kaiser	7cc0ebe75e	Merge pull request #9743 from zvonkok/tools-fix ci: Fix tools builder images	2024-05-30 11:53:34 +02:00
Zvonko Kaiser	02a7f8c852	ci: Fix tools builder images We weren't considering changes of the tools script dir; adding a fourth hash to accommodate this Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-30 08:10:42 +00:00
Fabiano Fidêncio	97806dbdaa	Merge pull request #9732 from zvonkok/shim-v2-no-sudo ci: shim-v2 no sudo	2024-05-30 07:01:04 +02:00
Wainer dos Santos Moschetta	37894923c1	tests/k8s: skip empty dir volumes tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	79a8b31ec5	tests/k8s: skip shared volume tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9668 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	aa1a37081e	tests/k8s: skip sysctls tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9666 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	0e81ced9f1	tests/k8s: skip kill-all-process tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Issue: #9664 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	18896efa3c	tests/k8s: skip seccomp tests for qemu-coco-dev This test fails with qemu-coco-dev configuration and guest-pull image pull. Unlike other tests that I've seen failing on this scenario, k8s-seccomp.bats fails after a couple of consecutive executions, so it's that kind of failure that happens once in a while. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	b62ad71c43	tests/k8s: add runtime handler annotation for qemu-coco-dev This will enable the k8s tests to leverage guest pulling when PULL_TYPE=guest-pull for qemu-coco-dev runtimeclass. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
Wainer dos Santos Moschetta	089c7ad84a	tests/k8s: add runtime handler annotation only for guest-pull The runtime handler annotation is required for Kubernetes <= 1.28 and guest-pull pull type. So leverage $PULL_TYPE (which is exported by CI jobs) to conditionally apply the annotation. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-29 18:37:24 -03:00
GabyCT	0eddfdc74f	Merge pull request #9731 from zvonkok/pause-no-sudo ci: pause-image no sudo	2024-05-29 11:48:41 -06:00
Zvonko Kaiser	7354c427f9	Merge pull request #9734 from zvonkok/virtiofsd-no-sudo ci: virtiofsd no sudo	2024-05-29 19:31:25 +02:00
GabyCT	3c91aa0475	Merge pull request #9739 from zvonkok/initramfs-no-sudo ci: initramfs no sudo	2024-05-29 11:28:59 -06:00
Hyounggyu Choi	40d2306f95	Merge pull request #9729 from zvonkok/agent-no-sudo-build ci: build agent without sudo	2024-05-29 19:27:56 +02:00
GabyCT	03be220482	Merge pull request #9730 from zvonkok/kernel-no-sudo ci: kernel no sudo	2024-05-29 10:23:31 -06:00
GabyCT	a32058913a	Merge pull request #9679 from amshinde/kata-manager-install-cni kata-manager: Copy cni files under /opt/cni	2024-05-29 10:20:34 -06:00
GabyCT	a5808a556d	Merge pull request #9733 from zvonkok/tools-no-sudo ci: tools no sudo	2024-05-29 10:19:17 -06:00
GabyCT	e94b09839d	Merge pull request #9736 from zvonkok/qemu-no-sudo ci: qemu no sudo	2024-05-29 10:18:34 -06:00
GabyCT	6d58fce4a9	Merge pull request #9677 from GabyCT/topic/memoryusags metrics: Improve variable definition in memory usage script	2024-05-29 10:16:56 -06:00
Emanuel Lima	138d985c64	runtime-rs: Add RNG to QEMU cmdline It creates this line, as the Golang runtime does: -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-05-29 13:11:00 -03:00
Hyounggyu Choi	6ba2461404	Merge pull request #9728 from zvonkok/coco-guest-comp-no-sudo ci: guest-components without sudo	2024-05-29 17:55:43 +02:00
Gabriela Cervantes	09c3e08f6a	tests: Fix indentation in static checks script This PR fixes the indentation in the static checks script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-29 15:43:44 +00:00
Xuewei Niu	c297a7891c	Merge pull request #9723 from zvonkok/hotunplug-fix vfio: Fix hot-unplug	2024-05-29 22:02:05 +08:00
Zvonko Kaiser	25c784c568	ci: shim-v2 no sudo Build shim-v2 without sudo docker this is not needed. This is part 6 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-29 09:24:54 +00:00
Zvonko Kaiser	84a9773cec	ci: initramfs no sudo Build initramfs without sudo docker this is not needed. This is part 10 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-29 09:20:39 +00:00
Zvonko Kaiser	7dc47c8150	ci: qemu no sudo Build qemu without sudo docker this is not needed. This is part 9 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 16:12:06 +00:00
Zvonko Kaiser	4a455bf24a	ci: virtiofsd no sudo build virtiofsd without sudo docker this is not needed. This is part 8 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 14:19:58 +00:00
Wainer Moschetta	9896f69827	Merge pull request #9414 from ldoktor/ci-bisection ci.ocp: Document openshift pipeline and manual bisection	2024-05-28 11:17:09 -03:00
Zvonko Kaiser	dd04d26cb0	ci: tools no sudo Build tools without sudo docker this is not needed. This is part 7 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 13:57:20 +00:00
Zvonko Kaiser	6c9c0306ac	ci: pause-image no sudo Build pause-image without sudo docker this is not needed. This is part 5 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 11:31:59 +00:00
Hyounggyu Choi	e8c06301d7	Merge pull request #9727 from zvonkok/ovmf-no-sudo ci: ovmf without sudo	2024-05-28 13:29:00 +02:00
Zvonko Kaiser	c95ae5a502	ci: kernel no sudo Build kernel without sudo docker this is not needed. This is part 4 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 11:19:08 +00:00
Zvonko Kaiser	8fab5dd584	ci: build agent without sudo Build agent without sudo docker this is not needed. This is part 3 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 09:55:32 +00:00
Zvonko Kaiser	1e4cbc4fcd	ci: guest-components without sudo Build guest-components without sudo docker this is not needed. This is part 2 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 09:03:14 +00:00
Zvonko Kaiser	b76938b922	ci: ovmf without sudo Build ovmf without sudo docker this is not needed. This is part 1 of N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 08:25:27 +00:00
Zvonko Kaiser	c6c20ac253	docs: Format the threat-model to 80 chars Truncate long lines to reasonable 80 characters Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 07:39:26 +00:00
Zvonko Kaiser	d4832b3b74	vfio: Fix hot-unplug We need to remove the device from the tracking map, a container restart will increment the bus index and we will get out of root-ports and crash the machine. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-28 07:37:30 +00:00
Zvonko Kaiser	a7931115a0	Merge pull request #8861 from zvonkok/config-pcie-root-switch-port gpu: reintroduce pcie_root_port and add pcie_switch_port	2024-05-27 13:17:57 +02:00
Fabiano Fidêncio	3276bb52b6	Merge pull request #9721 from fidencio/topic/ci-kata-deploy-improvements-and-fixes kata-deploy / kata-cleanup / ci: Fixes and improvements to kata-deploy / kata-cleanup and its usage in the CI	2024-05-27 12:29:40 +02:00
Zvonko Kaiser	4c93bb2d61	qemu: Add CDI device handling for any container type We need special handling for pod_sandbox, pod_container and single_container how and when to inject CDI devices Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-27 10:13:01 +00:00
Zvonko Kaiser	c7b41361b2	gpu: reintroduce pcie_root_port and add pcie_switch_port In Kubernetes we still do not have proper VM sizing at sandbox creation level. This KEP tries to mitigate that: kubernetes/enhancements#4113 but this can take some time until Kube and containerd or other runtimes have those changes rolled out. Before we used a static config of VFIO ports, and we introduced CDI support which needs a patched containerd. We want to eliminate the patched containerd in the GPU case as well. Fixes: #8860 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-27 10:13:01 +00:00
Fupan Li	6f6a164451	Merge pull request #9268 from zvonkok/kata-agent-createcontainer kata-agent: CreateContainer Hook	2024-05-27 16:36:22 +08:00
Fabiano Fidêncio	e81e8a4527	tests: kata-deploy: Adjust timeout 10 minutes is waay too long. Let's give it 4 minutes only. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 06:23:00 +02:00
Fabiano Fidêncio	fba5793c0d	tests: kata-deploy: Run the tests from "${repo_root_dir}" Let's see if it helps with issues like: ``` error: must build at directory: not a valid directory: evalsymlink failure on '"/home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/../../..//tools/packaging/kata-deploy/kata-cleanup/overlays/k0s"' : lstat /home/runner/actions-runner/_work/kata-containers/kata-containers/tests/functional/kata-deploy/": no such file or directory ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 06:23:00 +02:00
Fabiano Fidêncio	8a8a7ea0e5	tests: kata-deploy: Show more logs in the setup() This will also help us to better understand possible failures with the CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	47d9589e9b	tests: kata-deploy: Show output of passing tests This will help us to debug failures and compare passing and failures outputs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	dbd0d4a090	gha: Only do preventive cleanups for baremetal This takes a few minutes that could be saved, so let's avoid doing this on all the platforms, but simply do this when it's needed (the baremetal use case). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	ee2ef0641c	tests: k8s: Allow passing "all" to run all the tests Currently only "baremetal" runs all the tests, but we could easily run "all" locally or using the github provided runners, even when not using a "baremetal" system. The reason I'd like to have a differentiation between "all" and "baremetal" is because "baremetal" may require some cleanup, which "all" can simply skip if testing against a fresh created VM. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	556227cb51	tests: Add the possibility to deploy k0s / rke2 For now we've only exposed the option to deploy kata-deploy for k3s and vanilla kubernetes when using containerd. However, I do need to also deploy k0s and rke2 for an internal CI, and having those exposed here do not hurt, and allow us to easily expand the CI at any time in the future. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	e3c2f0b0f1	kata-cleanup: Add k0s kustomization k0s was added to kata-deploy, but its kata-cleanup counterpart was never added. Let's fix it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Fabiano Fidêncio	f15d40f8fb	kata-deploy: Fix k0s deployment k0s deployment has been broken since we moved to using `tomlq` in our scripts. The reason is that before using `tomlq` our script would, involuntarily, end up creating the file. Now, in order to fix the situation, we need to explicitly create the file and let `tomlq` add the needed content. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-27 05:05:06 +02:00
Alex Lyn	713c929a64	Merge pull request #9656 from pmores/document-qemu-rs-conventions runtime-rs: document architecture & implementation conventions in qem…	2024-05-27 10:38:58 +08:00
Xuewei Niu	bb7a1c56e9	Merge pull request #9693 from sidneychang/9690/Adjust-indentation	2024-05-27 00:20:34 +08:00
Alex Lyn	55dbf6121a	Merge pull request #9604 from Apokleos/qmp-cmdline01 runtime-rs: add QMP support for Qemu(part I)	2024-05-26 20:22:59 +08:00
Alex Lyn	028b10ce7a	Merge pull request #9687 from l8huang/vfio-pci-gk agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device	2024-05-26 17:48:25 +08:00
Steve Horsman	b89c3e35dd	Merge pull request #9583 from cncal/update_check_error_message runtime: make kata-runtime check error more understandable when /dev/kvm doesn't exist	2024-05-24 17:49:43 +01:00
Alex Lyn	41fb7aeb89	runtime-rs: add QMP params support in cmdline Fixes: #9603 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-05-24 22:16:24 +08:00
Alex Lyn	7ed6c6896b	runtime-rs: add an option dbg_monitor_socket for HMP support This option allows adding a debug monitor socket when `enable_debug = true` to control QEMU while debugging. Fixes: #9603 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-05-24 22:16:17 +08:00
Lei Huang	3624573b12	agent: collect PCI address mapping for both vfio-pci-gk and vfio-pci device The `update_env_pci()` function needs the PCI address mapping to translate the host PCI address to guest PCI address in the environment variables below: - PCIDEVICE_<prefix>_<resource-name>_INFO - PCIDEVICE_<prefix>_<resource-name> So collect PCI address mapping for both vfio-pci-gk and vfio-pci devices. Fixes #9614 Signed-off-by: Lei Huang <leih@nvidia.com>	2024-05-23 21:20:01 -07:00
Fupan Li	d73876252e	Merge pull request #9690 from justxuewei/agent-timeout runtime-rs: Remove obsoleted dial_timeout config	2024-05-24 10:31:12 +08:00
Zvonko Kaiser	3affd83e14	Merge pull request #9605 from l8huang/skip-env kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO	2024-05-23 18:45:00 +02:00
Fabiano Fidêncio	44d6cb7791	Merge pull request #9698 from wainersm/k8s_tests_disable_fail_fast tests/k8s: disable "fail-fast" behavior by default	2024-05-23 18:28:00 +02:00
Fabiano Fidêncio	d83cf39ba1	Merge pull request #9680 from kata-containers/dependabot/go_modules/src/runtime/go_modules-5e29427af7 build(deps): bump golang.org/x/net from 0.24.0 to 0.25.0 in /src/runtime in the go_modules group across 1 directory	2024-05-23 12:55:29 +02:00
Fabiano Fidêncio	d9ee950d8f	Merge pull request #9696 from wainersm/skip_custom_dns_test tests/k8s: skip custom DNS tests on confidential jobs	2024-05-22 23:57:21 +02:00
GabyCT	e08ad8d1b7	Merge pull request #9686 from GabyCT/topic/fixbootclh metrics: Fix minvalue for boot time	2024-05-22 15:46:50 -06:00
Wainer dos Santos Moschetta	76735df427	tests/k8s: disable "fail-fast" behavior by default The k8s test suite halts on the first failure, i.e., failing-fast. This isn't the behavior that we used to see when running tests on Jenkins and it seems that running the entire test suite is still the most productive way. So this disables fail-fast by default. However, if you still wish to run on fail-fast mode then just export K8S_TEST_FAIL_FAST=yes in your environment. Fixes: #9697 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-22 18:27:44 -03:00
Fabiano Fidêncio	8eb061cd5b	Merge pull request #9681 from GabyCT/topic/etdx gha: Enable install kbs and coco components for TDX, but still skip the CDH test	2024-05-22 23:18:42 +02:00
Wainer dos Santos Moschetta	43766cdb96	tests/k8s: skip custom DNS tests on confidential jobs This test has failed in confidential runtime jobs. Skip it until we have a fix. Fixes: #9663 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-05-22 17:08:22 -03:00
Fabiano Fidêncio	904370ecd6	tests: attestation: tdx: Skip test for now Skipping the test will allow us to have the TDX CI running while we debug the test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:04:13 +02:00
Fabiano Fidêncio	414d716eef	tests: kbs: Enable cli installation also on CentOS One of our machines is running CentOS 9 Stream, and we could easily verify that we can build and install the kbs client there, thus we're expanding the installation script to also support CentOS 9 Stream. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	27d7f4c5b8	tests: kbs: Fix rust installation `externals.coco-kbs.toolchain` is not defined, get the rust_version from `externals.coco-trustee.toolchain` instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	fa8b5c76b8	tests: kbs: Add more info for the TDX deployment Ditto in the commit shortlog. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	6ffd7b8425	versions: trustee: Bump version to 6adb8383309cbb7 We're bumping the version in order to bring in the customisation needed for setting up a custom pccs, which is needed for the KBS integration tests with Kata Containers + TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Fabiano Fidêncio	dbd1fa51cd	tests: kbs: Don't assume /tmp/trustee exists in the machine Instead, check if the directory exists before pushd'ing into it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-22 20:01:57 +02:00
Gabriela Cervantes	f698caccc0	gha: Enable install kbs and coco components for TDX This PR enables the installation and uninstallation of the kbs client as well as general coco components needed for the TDX GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-22 20:01:57 +02:00
GabyCT	eaaab19763	Merge pull request #9685 from GabyCT/topic/fixic tests: Fix indentation in confidential common script	2024-05-22 11:53:33 -06:00
Gabriela Cervantes	29a10f1373	metrics: Fix minvalue for boot time This PR fixes the minvalue for boot time to avoid the random failures of the GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-22 17:52:51 +00:00
GabyCT	0b32360ab4	Merge pull request #9684 from stevenhorsman/add-arch-to-component-cache-tags ci: cache: Add arch suffix to all cache tags	2024-05-22 09:24:28 -06:00
Fabiano Fidêncio	0e33ecf7fc	Merge pull request #9653 from JakubLedworowski/fixes-9497-ensure-quote-generation-service-is-added-to-qemu-cmd-2 runtime: Enable connection to Quote Generation Service (QGS)	2024-05-22 15:49:23 +02:00
sidneychang	8938f35627	runtime-rs: Adjust indentation in ifneq statements within Makefile. Replace tab indentation with spaces for the three lines within the ifneq statements, aligning them with the surrounding code. Fixes:#9692 Signed-off-by: sidneychang <2190206983@qq.com>	2024-05-22 20:24:35 +08:00
Fabiano Fidêncio	94f7bbf253	Merge pull request #9682 from fidencio/topic/allow-increasing-cpus-and-memory-via-annotation-for-tdx runtime: tdx: Allow default_{cpu,memory} annotations	2024-05-22 12:07:28 +02:00
Xuewei Niu	d31616cec3	runtime-rs: Remove obsoleted dial_timeout config The `dial_timeout` works fine for Runtime-go, but is obsoleted in Runtime-rs. When the pod cannot connect to the Agent upon starting, we need to adjust the `reconnect_timeout_ms` to increase the number of connection attempts to the Agent. Fixes: #9688 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-05-22 17:57:05 +08:00
Jakub Ledworowski	fc680139e5	runtime: Enable connection to Quote Generation Service (QGS) For the TD attestation to work the connection to QGS on the host is needed. By default QGS runs on vsock port 4050, but can be modified by the host owner. Format of the qemu object follows the SocketAddress structure, so it needs to be provided in the JSON format, as in the example below: -object '{"qom-type":"tdx-guest","id":"tdx","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}' Fixes: #9497 Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2024-05-22 11:16:24 +02:00
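As a rough illustration of the `-object` value described in the commit above, the JSON can be assembled from a CID and a port. This is a Python sketch, not the runtime's Go code: `qgs_object_json` and its parameters are hypothetical names, and only the field names, the `tdx` id, and the default port 4050 are taken from the commit message.

```python
import json


def qgs_object_json(cid: int = 2, port: int = 4050, obj_id: str = "tdx") -> str:
    """Build the JSON value passed to QEMU's -object for a TDX guest.

    Hypothetical helper: the field names mirror the example in the commit
    message, where cid and port are written as JSON strings.
    """
    obj = {
        "qom-type": "tdx-guest",
        "id": obj_id,
        "quote-generation-socket": {
            "type": "vsock",
            "cid": str(cid),
            "port": str(port),
        },
    }
    # Compact separators reproduce the whitespace-free form from the commit.
    return json.dumps(obj, separators=(",", ":"))


if __name__ == "__main__":
    print(qgs_object_json())
```

A host owner running QGS on another port would, under these assumptions, call `qgs_object_json(port=...)` with that port.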
Alex Lyn	0331859740	Merge pull request #9642 from gkurz/drop-unused-knobs-qemu-rs runtime-rs: Drop some useless QEMU arguments	2024-05-22 16:13:14 +08:00
Alex Lyn	ce030d1804	Merge pull request #9641 from cmaf/runtime-resize-mem-1 runtime: Add missing check in ResizeMemory for CH	2024-05-22 14:05:30 +08:00
Alex Lyn	b7af00be2a	Merge pull request #9624 from cncal/bugfix_duplicated_devices runtime: fix duplicated devices requested to the agent	2024-05-22 12:45:46 +08:00
Steve Horsman	f41f642b90	Merge pull request #9635 from kata-containers/dependabot/go_modules/src/runtime/go_modules-f0df977846 build(deps): bump github.com/containerd/containerd from 1.7.11 to 1.7.16 in /src/runtime in the go_modules group across 1 directory	2024-05-21 21:19:32 +01:00
Steve Horsman	9b0ed3dfa7	Merge pull request #9657 from ajaypvictor/remote-hyp-annotations runtime: Disable number of cpu comparison on remote hypervisor scenario	2024-05-21 21:19:12 +01:00
Hyounggyu Choi	92101fc61f	Merge pull request #9658 from BbolroC/migrate-vfio-ap-test CI: Migrate vfio-ap test files from tests repo	2024-05-21 20:21:09 +02:00
Lei Huang	b0a91b0d13	kata-agent: update env PCIDEVICE_<prefix>_<resource-name>_INFO The new version of sriov-network-device-plugin adds an env `PCIDEVICE_<prefix>_<resource-name>_INFO`, which has a json value; kata-agent can't parse it as env `PCIDEVICE_<prefix>_<resource-name>` which has value in format "DDDD:BB:SS.F". This change updates env `PCIDEVICE_<prefix>_<resource-name>_INFO`. Signed-off-by: Lei Huang <leih@nvidia.com>	2024-05-21 10:46:41 -07:00
stevenhorsman	db4818fe1d	ci: cache: Enforce tag length limit Container tags can be a maximum of 128 characters long so calculate the length of the arch suffix and then restrict the tag to this length subtracted from 128 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 18:03:45 +01:00
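The length rule in the commit above can be sketched as follows. This is an illustrative Python version, not the CI shell script itself; `limit_tag` and `MAX_TAG_LENGTH` are hypothetical names, and the 128-character limit is the one stated in the commit message.

```python
MAX_TAG_LENGTH = 128  # maximum container tag length, per the commit message


def limit_tag(tag: str, arch_suffix: str) -> str:
    """Truncate `tag` so that `tag + arch_suffix` fits in MAX_TAG_LENGTH.

    Assumes the suffix itself is shorter than the limit; that case is
    not handled in this sketch.
    """
    room = MAX_TAG_LENGTH - len(arch_suffix)
    return tag[:room] + arch_suffix


if __name__ == "__main__":
    print(limit_tag("a" * 200, "-x86_64"))
```

Truncating the base tag rather than the combined string keeps the architecture suffix intact, which is what makes the tags distinguishable across architectures.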
Gabriela Cervantes	c9e91db16f	tests: Fix indentation in confidential common script This PR fixes the indentation in the confidential common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-21 16:33:46 +00:00
stevenhorsman	d6afd77eae	ci: cache: Update agent cache to use the full commit hash - Previously I copied the logic that abbreviated the commit hash from the versioning, but looking at our versions.yaml the clear pattern is that when pointing at commits of dependencies we use the full commit hash, not the abbreviated one, so for consistency I think we should do the same with the components that we make available Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 16:51:16 +01:00
stevenhorsman	d46b6a3879	ci: cache: Add arch suffix to all cache tags As we have multi-arch builds for nearly all components, we want to ensure that all the cache tags we set have the architecture suffix, not just the `TARGET_BRANCH` one. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 11:25:07 +01:00
stevenhorsman	865fa9da15	runtime: Resolve go static-checks failure Remove `rand.Seed` call to resolve the following failure: ``` rand.Seed is deprecated: As of Go 1.20 there is no reason to call Seed with a random value. ``` The go rand.Seed docs: https://pkg.go.dev/math/rand@go1.20#Seed back this up and states: > If Seed is not called, the generator is seeded randomly at program startup. so I believe we can just delete the call. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 11:08:59 +01:00
Fabiano Fidêncio	abf52420a4	runtime: tdx: Allow default_{cpu,memory} annotations For now, let's allow the users to set the default_cpu and default_memory when using TDX, as they may hit issues related to the size of the container image that must be pulled and unpacked inside the guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-21 10:26:39 +02:00
stevenhorsman	75a201389d	runtime: update go version in go.mod - We bumped the golang version used in our CI, but `make vendor` fails unless the go version in the runtime go.mod is also increased, so update this and run go mod tidy Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-21 09:11:46 +01:00
dependabot[bot]	735185b15c	build(deps): bump github.com/containerd/containerd Bumps the go_modules group with 1 update in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd). Updates `github.com/containerd/containerd` from 1.7.11 to 1.7.16 - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.11...v1.7.16) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-05-21 09:11:46 +01:00
Ajay Victor	abe607b0c7	runtime: Disable number of cpu comparison on remote hypervisor scenario Fixes https://github.com/kata-containers/kata-containers/issues/9238 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>	2024-05-21 13:34:21 +05:30
dependabot[bot]	01868b2849	--- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-05-20 22:06:41 +00:00
Fabiano Fidêncio	8879e3bc45	Merge pull request #9452 from GabyCT/topic/tdxcoco gha: Add support to install KBS to k8s TDX GHA workflow	2024-05-20 23:28:52 +02:00
Fabiano Fidêncio	072b929b6f	Merge pull request #9660 from malt3/fix/genpolicy/namespace_empty_string genpolicy: detect empty string in ns as default	2024-05-20 21:34:13 +02:00
Gabriela Cervantes	cfdef7ed5f	tests/k8s: Use custom intel DCAP configuration This PR adds the use of custom Intel DCAP configuration when deploying the KBS. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-20 18:44:57 +00:00
Gabriela Cervantes	cace2fd340	metrics: Improve variable definition in memory usage script This PR improves general format like variable definition to have uniformity across the memory usage script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-20 16:14:59 +00:00
Fabiano Fidêncio	97056b017d	Merge pull request #9675 from stevenhorsman/release-build-tarballs-inherit-secrets gha: release: Set inherit secrets on tarball builds	2024-05-20 18:06:38 +02:00
Fabiano Fidêncio	b8b3bcc492	Merge pull request #9671 from bikesheddev/fix/kata-deploy-unbound-variable fix: kata-deploy.sh VERSION_ID unbound-variable	2024-05-20 17:22:55 +02:00
Fabiano Fidêncio	94cff3f74e	Merge pull request #9315 from fidencio/topic/adapt-TEEs-for-shared_fs-none TEEs: Use `shared_fs=none` for TDX	2024-05-20 17:17:36 +02:00
Fabiano Fidêncio	cffeb0ffb8	Merge pull request #9673 from fidencio/topic/revert-aks-workaround Revert "ci: azure: Workaround azure cli installation script"	2024-05-20 16:16:55 +02:00
stevenhorsman	f271983aeb	gha: release: Set inherit secrets on tarball builds Now we have updated the release builds to push artefacts to our registry for the release, so we can cache the images, we need to set `secrets: inherit` for all architecture's tarball builds so that we can log into quay.io and ghcr in those steps Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-20 14:19:17 +01:00
Fabiano Fidêncio	25c9cf32ff	Revert "ci: azure: Workaround azure cli installation script" This reverts commit `5ff53e4d1c`, as the script was fixed by MSFT, at least according to: https://github.com/Azure/azure-cli/issues/28984 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-20 14:38:46 +02:00
vac (Brendan)	d812007b99	kata-deploy: Fix unbound VERSION_ID VERSION_ID is not guaranteed to be specified in os-release, which makes kata-deploy break on rolling distros like Arch Linux and Void Linux. Note that operating system vendors may choose not to provide version information, for example to accommodate for rolling releases. In this case, VERSION and VERSION_ID may be unset. Applications should not rely on these fields to be set. Signed-off-by: vac <dot.fun@protonmail.com>	2024-05-20 19:48:31 +08:00
Tim Zhang	857d2bbc8e	agent: Fix ctr exec stuck problem Fixes: #9532 Close stdin when write_stdin receives data of length 0. Stop call notify_term_close() in close_stdin, because it could discard stdout unexpectedly. Signed-off-by: Tim Zhang <tim@hyper.sh>	2024-05-20 14:52:14 +08:00
Fabiano Fidêncio	e8ebe18868	tests: k8s: tdx: Skip liveness probe test This test doesn't fail with the guest image pulling, but it for sure should. :-) We can see in the bats logs, something like: ``` Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 31s default-scheduler Successfully assigned kata-containers-k8s-tests/liveness-exec to 984fee00bd70.jf.intel.com Normal Pulled 23s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 345ms (345ms including waiting) Normal Started 21s kubelet Started container liveness Warning Unhealthy 7s (x3 over 13s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory Normal Killing 7s kubelet Container liveness failed liveness probe, will be restarted Normal Pulled 7s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 389ms (389ms including waiting) Warning Failed 5s kubelet Error: failed to create containerd task: failed to create shim task: the file /bin/sh was not found: unknown Normal Pulling 5s (x3 over 23s) kubelet Pulling image "quay.io/prometheus/busybox:latest" Normal Pulled 4s kubelet Successfully pulled image "quay.io/prometheus/busybox:latest" in 342ms (342ms including waiting) Normal Created 4s (x3 over 23s) kubelet Created container liveness Warning Failed 3s kubelet Error: failed to create containerd task: failed to create shim task: failed to mount /run/kata-containers/f0ec86fb156a578964007f7773a3ccbdaf60023106634fe030f039e2e154cd11/rootfs to /run/kata-containers/liveness/rootfs, with error: ENOENT: No such file or directory: unknown Warning BackOff 1s (x3 over 3s) kubelet Back-off restarting failed container liveness in pod liveness-exec_kata-containers-k8s-tests(b1a980bf-a5b3-479d-97c2-ebdb45773eff) ``` Let's skip it for now as we have an issue opened to track it down: https://github.com/kata-containers/kata-containers/issues/9665 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 21:59:29 +02:00
Fabiano Fidêncio	a2c70222a8	tests: k8s: tdx: Skip initContainerd shared vol test This is another one that is related to initContainers not being properly handled with the guest image pulling. Let's skip it for now as we have https://github.com/kata-containers/kata-containers/issues/9668 to track it down. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 20:58:45 +02:00
Fabiano Fidêncio	9d56145499	tests: k8s: tdx: Skip volume related tests Similarly to firecracker, which doesn't have support for virtio-fs / virtio-9p, TDX used with `shared_fs=none` will face the very same limitations. The tests affected are: * k8s-credentials-secrets.bats * k8s-file-volume.bats * k8s-inotify.bats * k8s-nested-configmap-secret.bats * k8s-projected-volume.bats * k8s-volume.bats Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 19:38:49 +02:00
Fabiano Fidêncio	606a62a0a7	tests: k8s: tdx: Skip "Setting sysctl" test This test fails when using `shared_fs=none` with the nydus-snapshotter, and we're tracking the issue here: https://github.com/kata-containers/kata-containers/issues/9666 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 19:38:38 +02:00
Fabiano Fidêncio	937b2d5806	tests: k8s: tdx: Skip "Kill all processes in container" test This test fails when using `shared_fs=none` with the nydus snapshotter, and we're tracking the issue here: https://github.com/kata-containers/kata-containers/issues/9664 For now, let's have it skipped. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	03ce41b743	tests: k8s: tdx: Skip "Check custom dns" test The test has been failing on TDX for a while, and an issue has been created to track it down, see: https://github.com/kata-containers/kata-containers/issues/9663 For now, let's have it skipped. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	1a8a4d046d	tests: k8s: setup: Improve / Fix logs Let's make sure the logs will print the correct annotation and its value, instead of always mentioning "kernel" and "initrd". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:14 +02:00
Fabiano Fidêncio	3f38309c39	tests: k8s: tdx: Stop running `k8s-guest-pull-image.bats` We're doing that as all tests are going to be running with `shared_fs=none`, meaning that we don't need any specific test for this case anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:51:00 +02:00
Fabiano Fidêncio	e84619d54b	tests: k8s: tdx: Add `add_runtime_handler_annotations` function This function will set the needed annotation for enforcing that the image pull will be handled by the snapshotter set for the runtime handler, instead of using the default one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:49:07 +02:00
Fabiano Fidêncio	f2de259387	runtime: tdx: Use shared_fs=none We shouldn't be using 9p, at all, with TEEs, as of right now we have no way to ensure the channels are encrypted. The way to work around this for now is using guest pull, either with containerd + nydus snapshotter or with CRI-O; or even tardev snapshotter for pulling on the host (which is the approach used by MSFT). This is only done for TDX for now, leaving the generic, AMD, and IBM related stuff for the folks working on those to switch and debug possible issues on their environment. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-19 18:47:09 +02:00
Fabiano Fidêncio	5b257685d9	Merge pull request #9662 from dborquez/fix_launchtimes_timestamp_generation Fix launch times timestamp generation.	2024-05-18 21:11:09 +02:00
Fabiano Fidêncio	94786dc939	Merge pull request #9659 from stevenhorsman/remove-non-printable-tag-characters ci: cache: Filter out non-printable characters from tag	2024-05-18 14:47:07 +02:00
Fabiano Fidêncio	874cda0e51	Merge pull request #9655 from BbolroC/add-arch-to-initramfs CI: Append arch type to initramfs-cryptsetup image	2024-05-18 14:31:57 +02:00
Malte Poll	babdab9078	genpolicy: detect empty string in ns as default In Kubernetes, the following values for namespace are equivalent and all refer to the default namespace: - ` ` (namespace field missing) - `namespace: ""` (namespace field is the empty string) - `namespace: "default"`(namespace field has the explicit value `default`) Genpolicy currently does not handle the empty string case correctly. Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>	2024-05-18 12:44:59 +02:00
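The equivalence described in the commit above can be sketched as a small normalization step. This is an illustrative Python version rather than genpolicy's Rust code; `effective_namespace` and `DEFAULT_NAMESPACE` are hypothetical names.

```python
from typing import Optional

DEFAULT_NAMESPACE = "default"


def effective_namespace(namespace: Optional[str]) -> str:
    """Map a missing or empty namespace field to the default namespace."""
    # None models a missing field; "" models `namespace: ""`.
    if not namespace:
        return DEFAULT_NAMESPACE
    return namespace


if __name__ == "__main__":
    for value in (None, "", "default", "kube-system"):
        print(repr(value), "->", effective_namespace(value))
```

Normalizing once, before any comparison, means later checks only ever see an explicit namespace string.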
Fabiano Fidêncio	cbfdc70a55	Merge pull request #9613 from fidencio/topic/skip-pull-image-tests-on-tees-part-II tests: pull-image: Only skip tests for TEEs	2024-05-18 03:31:38 +02:00
Archana Shinde	0e28e904e0	kata-manager: Install cni for containerd When just containerd is installed without installing nerdctl, cni plugins are missing from the installation. containerd tarball does not include cni plugin files. Hence install cni plugins separately for containerd. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-18 00:19:57 +00:00
Archana Shinde	d23d58a484	kata-manager: Copy cni files under /opt/cni nerdctl requires cni plugins to be installed in /opt/cni/bin Without bridge plugin installed, it is not possible to run a container with nerdctl. The downloaded nerdctl tarball contains cni plugin files, but are extracted under /usr/local/libexec. Copy extracted tarball cni files under /usr/local/libexec to /opt/cni/bin Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-05-18 00:16:48 +00:00
David Esparza	938d3dc430	metrics: fix timestamps generation from launch times test. Use `eval` to process the `date` command along with its parameters, thus avoiding misinterpreting the parameters as commands. Fixes: #9661 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-05-17 14:44:41 -06:00
David Esparza	bae377b42a	metrics: determine the realpath of kata-shim component. Determine the realpath of kata-shim so that the check does not fail when kata-shim is not a symlink, as was happening prior to this commit. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-05-17 14:40:02 -06:00
Fabiano Fidêncio	5ff53e4d1c	ci: azure: Workaround azure cli installation script This is done in order to work around https://github.com/Azure/azure-cli/issues/28984, following a suggestion on the very same issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 20:28:24 +02:00
stevenhorsman	42fddb5530	ci: cache: Filter out non-printable characters from tag - The tags have a trailing non-printable character, which results in our cache tags having a trailing underscore e.g. `ghcr.io/kata-containers/cached-artefacts/agent:ce24e9835_` For ease of use of these cached components, we should strip off the trailing underscore. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 14:16:40 +01:00
Hyounggyu Choi	961735a181	CI: Migrate vfio-ap test files from tests repo An e2e test for `vfio-ap` has been conducted internally in IBM due to the lack of publicly available test machines equipped with a required crypto device. The test is performed by the `tests` repository: (i.e. `772105b560/Makefile (L144)`) The community is working to integrate all tests into the `kata-containers` repository, so the `vfio-ap` test should be part of that effort. This commit moves a test script and Dockerfile for a test image from the `tests` repository. We do not rename the script to `gha-run.sh` because it is not executed by Github Actions' workflow. You can check the test results from the s390x nightly test with the migrated files here: https://github.com/kata-containers/kata-containers/actions/runs/9123170010/job/25100026025 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-17 14:59:16 +02:00
stevenhorsman	a92defdffe	tests: pull-image: Remove skips Given that we think the containerd -> snapshotter image cache problems have been resolved by bumping to nydus-snapshotter v0.13.13 we can try removing the skips to test this out Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 12:39:57 +02:00
stevenhorsman	7ac302e2d8	tests: Slacken guest pull rootfs count assert - We previously had an expectation for the pause rootfs to be pulled on the host when we did a guest pull. We weren't really clear why, but it is plausibly related to the issues we had with containerd and nydus caching. Now that is fixed we can begin to address this by setting shared_fs=none, but let's start with updating the rootfs host check to be not higher than expected Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	67ff58251d	tests: confidential_common: Remove unneeded `ensure_yq` call This test is called from `tests/integration/run_kuberentes_tests.sh`, which already ensures that yq is installed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	cc874ad5e1	tests: confidential: Ensure those only run on TEEs Running those with the non-TEE runtime classes will simply fail. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	2bc5b1bba2	tests: pull-image: Only skip tests for TEEs On `1423420`, I've mistakenly disabled the tests entirely, for both non-TEEs and TEEs. This happened as I didn't realise that `confidential_setup` would take non-TEEs into consideration. :-/ Now, let me follow-up on that and make sure that the tests will be running on non-TEEs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	d875f89fa2	tests: Add is_confidential_hardware() This function is a helper to check whether the KATA_HYPERVISOR being used is a confidential hardware (TEE) or not, and we can use it to skip or only run tests on those platforms when needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Fabiano Fidêncio	4a04a1f2ae	tests: Re-work confidential_setup() Let's rename it to `is_confidential_runtime_class`, and adapt all the places where it's called. The new name provides a better description, leading to a better understanding of what the function really does. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-17 12:39:56 +02:00
Pavel Mores	b9febc4458	runtime-rs: document architecture & implementation conventions in qemu-rs Implementation of QemuCmdLine has a fairly uniform and repetitive structure that's guided by a set of conventions. These conventions have however been mostly implicit so far, leading to a superfluous and annoying request/force-push churn during qemu-rs PR reviews. This commit aims to make things explicit so that contributors can take them into account before an initial PR submission. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-05-17 12:21:44 +02:00
Hyounggyu Choi	3917930a76	CI: Append arch type to initramfs-cryptsetup image This commit is to append an arch type to the initramfs-cryptsetup image to prevent a wrong arch image from being pulled on a different arch host. Fixes: #9654 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-17 11:42:49 +02:00
Steve Horsman	9a6d8d8330	Merge pull request #9650 from stevenhorsman/caching-tagging-update-partIII Caching tagging update part iii	2024-05-17 09:09:15 +01:00
stevenhorsman	ce24e98358	ci: cache: Add tag character filtering - Container image tags can only contain alphanumeric, period, hyphen and underscore characters, so convert characters outside of these to be underscores, to avoid having invalid tag failures Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 21:38:07 +01:00
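The character filtering described in the commit above can be sketched with a single substitution. This is an illustrative Python version, not the CI script; `sanitize_tag` is a hypothetical name, and the allowed set (alphanumerics, period, hyphen, underscore) is the one given in the commit message.

```python
import re

# Anything outside the allowed tag characters, per the commit message.
_INVALID_TAG_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_tag(tag: str) -> str:
    """Replace every character not allowed in a container tag with '_'."""
    return _INVALID_TAG_CHARS.sub("_", tag)


if __name__ == "__main__":
    print(sanitize_tag("topic/fix:cache"))
```

Note that a trailing non-printable character is also replaced, so the result can end in an underscore unless the input is stripped first.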
stevenhorsman	a98b1e3afb	ci: cache: Integrate tagging updates with recent changes Recently the extra gpu caching was added, unfortunately when I rebased I ended up with both the new tagging logic and old logic. Let's try and integrate them properly to avoid doing the push twice. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 21:38:07 +01:00
Lukáš Doktor	f994f79078	ci.ocp: Add steps to reproduce/bisect CI runs in case the upstream CI fails it's useful to pin-point the PR that caused the regression. Currently openshift-ci does not allow doing that from their setup but we can mimic the setup on our infrastructure and use the available kata-deploy-ci images to find the first failing one. To help with that add a few helper scripts and a howto. Fixes: #9228 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:20:05 +02:00
Lukáš Doktor	a556ad7e01	ci.ocp: Document how to run openshift-tests with kata document the ocp pipeline. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:15:32 +02:00
Lukáš Doktor	ea081bd882	ci.ocp: Add webhook cleanup cleanup the webhook resources as well. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-16 20:15:31 +02:00
David Esparza	029a6de52b	Merge pull request #9615 from GabyCT/topic/fixlaunchtime metrics: Update launch times script	2024-05-16 11:28:44 -06:00
Steve Horsman	33e6b241ba	Merge pull request #9647 from stevenhorsman/fix-artefact-tags-unbound-variable ci: cache: Fix unbound variable	2024-05-16 16:22:47 +01:00
stevenhorsman	9d9487b17f	ci: cache: Fix unbound variable Now we have the workflow updated and can test the changes in caching we've hit an error: ``` line 1180: artefact_tag: unbound variable ``` so we need to fix that up. Sorry for missing this before. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 14:30:32 +01:00
Steve Horsman	03c08583c3	Merge pull request #9644 from stevenhorsman/fix-broken-workflow workflow: Remove if from env conditional	2024-05-16 14:13:25 +01:00
stevenhorsman	f7fd2f9a5d	workflow: Fix problems with build-asset workflows - It appears like the `if` isn't required when setting env as a conditional - `inputs.stage` over input.stage - Swap matrix.component to matrix.asset Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-16 11:51:46 +01:00
Steve Horsman	d8468cb178	Merge pull request #9550 from stevenhorsman/tag-component-caches Tag component caches	2024-05-16 11:05:18 +01:00
Steve Horsman	b31ff09b8d	Merge pull request #9617 from zvonkok/artefact-repository deploy: Add artefact repository	2024-05-16 10:41:23 +01:00
Fabiano Fidêncio	4d073c837d	Merge pull request #9636 from ChengyuZhu6/snapshotter version: Bump nydus snapshotter to v0.13.13	2024-05-16 02:54:53 +02:00
GabyCT	05cc8fae5e	Merge pull request #9610 from GabyCT/topic/fixrwfio metrics: Fix random write value for FIO	2024-05-15 17:44:41 -06:00
Gabriela Cervantes	793a02600a	metrics: Fix random write value for clh for FIO This PR decreases the random write value for clh for FIO. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-15 22:13:10 +00:00
Chelsea Mafrica	5d2af555da	runtime: Add missing check in ResizeMemory for CH ResizeMemory for Cloud Hypervisor is missing a check for the new requested memory being greater than the max hotplug size after alignment. Add the check, and since an earlier check for this sets requested memory to the max hotplug size, do the same in the post-alignment check. Fixes: #9640 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-05-15 11:29:18 -07:00
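The two checks described in the commit above can be sketched as a clamp before and after alignment. This is a loose Python illustration, not the runtime's Go implementation: `clamp_resize_request`, the MiB units, and rounding up to a block size are all assumptions made for the example.

```python
def clamp_resize_request(requested_mib: int, max_hotplug_mib: int, align_mib: int) -> int:
    """Return the memory size to request, never exceeding max_hotplug_mib.

    Hypothetical sketch: alignment is assumed to round up to a multiple
    of align_mib, which is why a second clamp is needed afterwards.
    """
    # Earlier check: cap the request at the max hotplug size.
    requested = min(requested_mib, max_hotplug_mib)
    # Align up to the next multiple of align_mib (ceiling division).
    aligned = -(-requested // align_mib) * align_mib
    # Post-alignment check: alignment may have pushed the value past the cap.
    return min(aligned, max_hotplug_mib)


if __name__ == "__main__":
    print(clamp_resize_request(1000, 1010, 128))
```

Without the final `min`, a request of 1000 with a cap of 1010 would align to 1024 and exceed the cap, which is the kind of case the post-alignment check guards against.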
GabyCT	d752f0aa4f	Merge pull request #9627 from GabyCT/topic/ghacomk8s gha: Fix indentation in gha run k8s common	2024-05-15 11:55:14 -06:00
Greg Kurz	bd6420e0cc	runtime-rs: Drop some useless QEMU arguments All these settings are hardcoded as `false` and result in no extra options on the QEMU command line, like the go runtime does. They are actually not needed: - we're never going to ask QEMU to survive a guest shutdown - we're never going to run QEMU daemonized since it prevents log collection - we're never going to ask QEMU to start with the guest stopped No need to keep this code around then. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-05-15 18:33:43 +02:00
stevenhorsman	7f41329010	ci: cache: Optional tag components with tags - CoCo wants to use the agent and coco-guest-components cached artifacts, so tag them with a helpful version to make these easier to get Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:56:40 +01:00
stevenhorsman	9999971656	release: Move component's don't ship logic - We don't want to ship certain components (agent, coco-guest-components) as part of the release, but for other consumers it's useful to be able to pull in the components from oras, so rather than not building them, just don't upload it as part of the release. - Also make the archs all consistent on not shipping the agent Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
stevenhorsman	040e6cdf12	gha: release: Set RELEASE env - Set RELEASE env to 'yes', or 'no', based on if the stage passed in was 'release', so we can use it in the build scripts Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
stevenhorsman	d93156d84d	gha: release: Push artifacts to registry on release For other projects (e.g. CoCo projects) being able to access the released versions of components is helpful, so push these during the release process Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-05-15 16:55:55 +01:00
Steve Horsman	19ca1a6656	Merge pull request #9638 from BbolroC/use-fixed-len-git-hash-explicitly CI: Use `--abbrev=9` explicitly for abbreviated commit hash	2024-05-15 16:55:07 +01:00
GabyCT	64b915b86e	Merge pull request #9438 from GabyCT/topic/addnegativetest tests: Add k8s negative policy test	2024-05-15 08:52:57 -06:00
Hyounggyu Choi	e075150fbe	CI: Use `--abbrev=9` explicitly for abbreviated commit hash A length of the result of `git log -1 --pretty=format:%h` could vary over different CI systems, highly likely messing up their caching mechanisms. This commit is to use an option `--abbrev=9` to standardize the length to 9 characters for CI. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-15 14:22:07 +02:00
Zvonko Kaiser	117e2f2ecc	Merge pull request #9618 from zvonkok/nvidia-rootfs-#1 gpu: Add build targets for GPU rootfs initrd/image	2024-05-15 13:30:42 +02:00
Hyounggyu Choi	6a4ff08156	Merge pull request #9632 from BbolroC/do-not-build-agent-policy-for-s390x local-build: Ensure the default rootfs is built with AGENT_POLICY=yes	2024-05-15 06:56:22 +02:00
ChengyuZhu6	d48c7ec979	version: Bump nydus snapshotter to v0.13.13 Bump nydus snapshotter to v0.13.13 to fix the gap when switching different snapshotters in guest pull. Fixes: #8407 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-05-15 12:21:01 +08:00
Fabiano Fidêncio	92bb235723	osbuilder: Log when the default policy is installed This will help us to debug issues in the future (and would have helped in the past as well). :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-14 20:45:49 +02:00
Fabiano Fidêncio	75bd97e8df	build: Ensure the default rootfs is built with AGENT_POLICY=yes This is needed, as `b1710ee2c0` made the default agent shipped the one with policy support. However, we simply didn't update the rootfs to reflect that, which then caused an issue starting the agent, as shown by the strace below: ``` open("/etc/kata-opa/default-policy.rego", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = -1 ENOENT (No such file or directory) futex(0x7f401eba0c28, FUTEX_WAKE_PRIVATE, 1) = 1 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1 RT_2], [], 8) = 0 tkill(553681, SIGABRT) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=553681, si_uid=1000} --- +++ killed by SIGABRT (core dumped) +++ ``` This happens as the default policy must be set when the agent is built with policy support, but the code path that copies that into the rootfs is only triggered if the rootfs itself is built with AGENT_POLICY=yes, which we're now doing for both confidential and non-confidential cases. Sadly this was not caught by CI until the cache was no longer used for rootfs, which should be solved by the previous commit. Fixes: #9630, #9631 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-14 20:39:15 +02:00
Hyounggyu Choi	37060a7d2e	local-build: Stop using cached artifacts when local-build/* is updated This adds information about the files at `tools/packaging/kata-deploy/local-build/*` to the version of the components, ensuring that the cached artefacts are not used when the files of interest are updated. Fixes: #9630 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-05-14 19:47:33 +02:00
Fabiano Fidêncio	9a3392993d	Merge pull request #9629 from ldoktor/tdx_not_supported_warning kata-deploy: Fix tdx_not_supported call	2024-05-14 17:27:56 +02:00
Greg Kurz	f14a1330d4	Merge pull request #9585 from littlejawa/debugging_the_runtime debugging: adding a script and instructions for debugging the GO shim	2024-05-14 15:31:07 +02:00
Lukáš Doktor	d9ae130031	kata-deploy: Fix tdx_not_supported call The `tdx_not_supported_warning` function does not exist; `tdx_not_supported` should be called instead. Fixes: #9628 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-05-14 13:26:07 +02:00
Julien Ropé	e7cfc0865a	debugging: adding a script and instructions for debugging the GO shim Using a debugger with the kata runtime is complicated, but it can be done and can be very useful. This commit provides a helper script that simplifies it, and updates the developer's documentation to explain how to use it. Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-05-14 11:12:31 +02:00
Greg Kurz	e2117d3b71	Merge pull request #9571 from emanuellima1/fix-impl-rtc runtime-rs: Fix constructing the RTC struct	2024-05-14 09:17:27 +02:00
Gabriela Cervantes	f20a44bba3	gha: Fix indentation in gha run k8s common This PR fixes the indentation in gha run k8s common script to have uniformity across the script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-13 20:07:47 +00:00
Fabiano Fidêncio	4d5e90038c	Merge pull request #9626 from fidencio/topic/prepare-for-3.5.0-release release: Bump VERSIONS file to 3.5.0	2024-05-13 12:52:12 +02:00
Fabiano Fidêncio	0e385452e5	release: Bump VERSIONS file to 3.5.0 Let's bump the VERSIONS file and start preparing for a new release of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-13 10:49:09 +02:00
Fabiano Fidêncio	c64b07f981	Merge pull request #9622 from fidencio/topic/unbreak-nvidia-gpu-build build: nvidia-gpu: Fix cache usage of the headers tarball	2024-05-12 14:40:22 +02:00
cncal	232db2d906	runtime: fix duplicated devices requested to the agent By default, when a container is created with the `--privileged` flag, all devices in `/dev` from the host are mounted into the guest. If there is a block device (e.g. `/dev/dm`) followed by a generic device (e.g. `/dev/null`), two identical block devices (`/dev/dm`) would be requested to the kata agent, causing the agent to exit with the error: > Conflicting device updates for /dev/dm-2 As the generic device type does not hit any cases defined in `switch`, the variable `kataDevice`, which is defined outside of the loop, still holds the value of the previous block device rather than `nil`. Defining `kataDevice` in the loop fixes this bug. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-12 16:38:37 +08:00
Fabiano Fidêncio	9713558477	k0s: Use a different port for kube-route's metrics kube-router decided to use :8080 for its metrics, and this seems to be a change that affected k0s 1.30.0+, leading to the kube-router pod crashing all the time, so that nothing can actually be started after that. Due to this issue, let's simply use a different port (:9999) and move on with our tests. Fixes: #9623 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-11 23:18:20 +02:00
Fabiano Fidêncio	4cd048444d	build: nvidia-gpu: Fix cache usage of the headers tarball Whenever we count on having the headers tarball, we must unpack the cached content into the expected directory, otherwise we'd simply fail, as we've been failing in our CI, at the end of the process where we generate the tarball from the cached components. It's weird to me, sincerely, that the headers tarball ends up in such a weird place (build/kernel-nvidia-gpu/builddir/), but I'll leave that to Zvonko to figure out whether something better can be done, as the intent of this PR is simply to unblock Kata Containers CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-11 17:59:53 +02:00
Zvonko Kaiser	693e307f72	deploy: Add artefact repository New env var so everyone can test the PUSH_TO_REGISTRY feature export PUSH_TO_REGISTRY=yes export ARTEFACT_REGISTRY=quay.io export ARTEFACT_REPOSITORY=my-fancy-kata-containers export ARTEFACT_REGISTRY_USERNAME=zvonkok export ARTEFACT_REGISTRY_PASSWORD=<super-secret> make ...-tarball Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 16:41:52 +00:00
Zvonko Kaiser	4dea73b433	Merge pull request #9616 from zvonkok/nv-kernel-hotfix deploy: Fix wrong pushing of artifacts	2024-05-10 18:38:09 +02:00
Zvonko Kaiser	4d0f42a145	deploy: Fix wrong pushing of artifacts Added explicit case statements for nvidia-gpu and nvidia-gpu-confidential Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 14:08:32 +00:00
Zvonko Kaiser	85374f55d2	gpu: Add build targets for GPU rootfs initrd/image Preparation for complete GPU rootfs build step #1/#N Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 09:47:21 +00:00
Zvonko Kaiser	8ec2cc9c0d	threat-model: Add VFIO, ACPI and KVM/VMM threat-model descriptions We're missing several topics in the current threat model; let's update it. Fixes: #8943 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-10 07:18:44 +00:00
Fabiano Fidêncio	20515fed70	Merge pull request #9484 from zvonkok/nvidia-runtimeclasses deploy: Add runtimeClasses relating to the NVIDIA GPU	2024-05-10 03:52:12 +02:00
Gabriela Cervantes	80e551ea74	metrics: Update launch times script This PR updates the launch times script by improving the variable definitions and by using the same format across the whole script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-09 21:29:32 +00:00
Emanuel Lima	59c1567f80	runtime-rs: Fix constructing the RTC struct RTC was being built incorrectly in commit #2bc5e3c6e2ab0145fa9e8be95df0d5086c07a517. RTC was being constructed inside the QemuCmdLine struct, but it should've been built inside the devices vector. Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-05-09 15:00:47 -03:00
Fabiano Fidêncio	2f686b1179	Merge pull request #9608 from fidencio/topic/tdx-depend-on-distro-host-stack-part-II tdx: Adapt kata-deploy to use QEMU / OVMF from the distros	2024-05-09 10:25:19 +02:00
Zvonko Kaiser	da7e6a0f07	deploy: Add runtimeClasses relating to the NVIDIA GPU Fixes: #9483 For the added configurations we need to provide runtimeClasses. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 10:00:59 +02:00
Fabiano Fidêncio	96a100f910	Merge pull request #9482 from zvonkok/kernel-headers-tarball kernel: Add caching of kernel-headers	2024-05-09 09:58:30 +02:00
Fabiano Fidêncio	aba56a8adb	tests: measured-rootfs: Skip policy addition Let's skip the policy addition for now, in order to get the TDX CI back up and running, and then we can re-enable it as soon as we get https://github.com/kata-containers/kata-containers/issues/9612 fixed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	77f457c0e1	runtime: tdx: Drop sept-ve-disable=on This was needed when we were using an old (and not maintained anymore) host stack. Considering what we have as part of the distros today, this can simply be dropped, as I cannot find any reference to it being needed in any up-to-date documentation. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	416d00228c	Revert "qemu: tdx: Adapt command line" (partially) This reverts commit `b7cccfa019`. The `private=on` bit has never made its way upstream, and was removed from the latest iteration that we're using. With that in mind, let's revert its usage in the code. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	1c3037fd25	Revert "govmm: tdx: Expose the private=on|off knob" This reverts commit `582b5b6b19`. The `private=on` bit has never made its way upstream, and was removed from the latest iteration that we're using. With that in mind, let's revert its addition, and later on its usage in the code. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	a9720495de	kata-deploy: Ensure the distro QEMU and OVMF are used for TDX Here we're checking the distro's `/etc/os-release` or `/usr/lib/os-release` in order to get which distro we're deploying the Kata Containers artefacts to, and then to properly adjust the QEMU and OVMF with TDX support that's been shipped with the distros. Together with that, we're also printing the instructions provided by the distro on how to enable and use TDX. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	f48450b360	runtime: config: tdx: Add QEMU / OVMF placeholder var Let's add the PLACEHOLDER_FOR_DISTRO_{QEMU,OVMF}_WITH_TDX_SUPPORT variables instead of actually setting a path, so we can easily replace those as part of our deployment scripts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	84b94dc2b1	kata-deploy: Expose /host to the daemon-set We'll need to have access to the host os-release file (either under `/etc/os-release` or under `/usr/lib/os-release`), and the simplest approach that comes to mind is doing what a debug pod would do: mounting `/` as `/host`, which gives us access to those files and lets us correctly set the TDX-specific QEMU and OVMF (TDVF) paths for the available TDX configurations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	f2d40da8e4	versions: build: Remove unused td-shim entry We haven't been using nor testing with td-shim, as Cloud Hypervisor does not officially support TDX yet, and TDVF is supposed to be used with QEMU, instead of td-shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	ea82740b19	versions: build: Remove TDX specific QEMU Let's remove everything related to the TDX specific QEMU building / shipping from our repo, as we'll be relying on the one coming from the distros. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Fabiano Fidêncio	4292c4c3b1	versions: build: Remove TDX specific OVMF (TDVF) Let's remove everything related to the TDVF building / shipping from our repo, as we'll be relying on the one coming from the distro. Later on, we may need to re-add TDVF logic, as we're already using upstream edk2 repo / content, but when that's needed we'll simply revert this commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-09 07:59:12 +02:00
Alex Lyn	946f0bdfff	Merge pull request #9609 from fidencio/topic/skip-pull-image-tests-on-tees tests: pull-image: Don't run on TEEs	2024-05-09 08:22:55 +08:00
GabyCT	3b8a910393	Merge pull request #9596 from lifupan/main db: fix the issue of failed to init pci root bus	2024-05-08 13:14:20 -06:00
Gabriela Cervantes	2fb406ed3a	metrics: Fix random write value for FIO This PR fixes the random write value for FIO for qemu by decreasing it to avoid the random failures of the GHA CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-08 18:54:41 +00:00
Fabiano Fidêncio	142342012c	tests: pull-image: Don't run on TEEs Let's skip those tests on TEEs as we've been facing a reasonable amount of issues, most likely on the containerd side, related to pulling the image on the guest. Once we're able to fix the issues on containerd, we can get back and re-enable those by reverting this commit. The decision of disabling the tests for TEEs is because the machines may end up in a state where human intervention is necessary to get them back to a functional state, and that's really not optimal for our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-08 18:40:22 +02:00
Fabiano Fidêncio	c0bf9e9bc6	Merge pull request #9607 from fidencio/topic/tdx-depend-on-distro-host-stack-part-I ci: Stop building TDX specific QEMU and OVMF	2024-05-08 15:53:15 +02:00
Zvonko Kaiser	fb0b821771	kernel: Add caching of kernel-headers Fixes: #9481 We need to cache the kernel-headers for the NVIDIA GPU initrd/image build. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-05-08 11:30:39 +00:00
Fabiano Fidêncio	12dc9f83df	ci: Stop building TDX specific QEMU and OVMF This is the first step of the work to start relying on the artefacts coming from the distros (CentOS 9 Stream, and Ubuntu) themselves. Let's have this first one merged, as this will not run the CI due to the changes being on the yaml itself, and then follow-up with the changes needed on other parts of the project (kata-deploy, runtime, etc). Fixes: #9590 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-05-08 11:39:32 +02:00
Alex Lyn	875e6e3815	Merge pull request #9601 from cncal/fix_redundant_log qemu: the error is logged only when it occurs	2024-05-08 08:59:01 +08:00
GabyCT	22087f9db9	Merge pull request #9598 from lifupan/main_shim runtime-rs: fix the issue of the leak of dead shim	2024-05-07 10:14:11 -06:00
GabyCT	a564422b7b	Merge pull request #9582 from cncal/main build: fix the confusing build message if yq doesn't exist in GOPATH/bin	2024-05-07 09:34:27 -06:00
Fabiano Fidêncio	cd84414c63	Merge pull request #9600 from GabyCT/topic/deleteoci versions: Remove oci information from versions file	2024-05-07 13:15:35 +02:00
Fabiano Fidêncio	ddf6b367c7	Merge pull request #9568 from kata-containers/dependabot/go_modules/src/runtime/go_modules-22ef55fa20 build(deps): bump the go_modules group across 5 directories with 8 updates	2024-05-07 13:14:48 +02:00
Steve Horsman	e967db60ab	Merge pull request #9592 from sprt/mariner-before-ch39 tests: adapt Mariner CI to unblock CH v39 upgrade	2024-05-07 11:52:55 +01:00
cncal	15d511af97	qemu: the error is logged only when it occurs Every time I create a container on an arm64 machine, containerd/kata logs a redundant warning as follows: ``` shell time="2024-05-07" level=warning msg="<nil>" arch=arm64 name=containerd-shim-v2 pid=xxx sandbox=fdd1f05 source=virtcontainers/hypervisor ``` I added an error check so that the error is logged only when it occurs. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-07 14:28:04 +08:00
Gabriela Cervantes	aecede11fc	versions: Remove oci information from versions file This PR removes the oci information from the versions file, as it is no longer being used in the kata containers repository. Fixes #9599 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 20:14:00 +00:00
Gabriela Cervantes	b54dc26073	gha: Enable uninstall kbs client function for coco gha workflow This PR enables the uninstall kbs client function for coco gha tdx workflow. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:55:24 +00:00
Gabriela Cervantes	aaf9b54d97	gha: Add support to install KBS to k8s TDX GHA workflow This PR adds support to install KBS to k8s TDX GHA workflow in order to run confidential attestation tests. Fixes #9451 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:42:17 +00:00
Gabriela Cervantes	506e17a60d	tests: Add k8s negative policy test This PR adds a k8s negative policy test to the confidential attestation bats test. Fixes #9437 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-05-06 15:28:54 +00:00
Fupan Li	3694f3d9fe	runtime-rs: fix the issue of the leak of dead shim We should init and assign the runtime instance to the runtime handler; otherwise, if the pause container fails to start, which means the runtime instance failed to start, the following delete & shutdown requests wouldn't be run, and thus the dead shim would be left behind. Fixes: #9597 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-05-06 17:31:31 +08:00
Fupan Li	26bee78e8d	db: fix the issue of failed to init pci root bus dragonball reserves 2048G of mmio space for the pci root bus by default on physical addresses greater than 4G. However, on some machines with smaller physical address widths, such as 39-bit wide physical addresses, the space available when dragonball reserves the mmio space during memory initialization is less than 2048G, so this commit dynamically calculates and allocates the mmio size of each pci root bus. Fixes: #9509 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-05-06 11:34:18 +08:00
Aurélien Bombo	0cc2b07a8c	tests: adapt Mariner CI to unblock CH v39 upgrade The CH v39 upgrade in #9575 is currently blocked because of a bug in the Mariner host kernel. To address this, we temporarily tweak the Mariner CI to use an Ubuntu host and the Kata guest kernel, while retaining the Mariner initrd. This is tracked in #9594. Importantly, this allows us to preserve CI for genpolicy. We had to tweak the default rules.rego however, as the OCI version is now different in the Ubuntu host. This is tracked in #9593. This change has been tested together with CH v39 in #9588. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-05-03 16:29:12 +00:00
cncal	48d873b52b	build: fix the confusing build message if yq doesn't exist in GOPATH/bin The build message shows that yq was not found when I tried to build runtime binaries, but I've actually installed yq by yum install. Signed-off-by: cncal <flycalvin@qq.com>	2024-05-03 08:34:45 +08:00
cncal	9caa7beb1f	runtime: make kata-runtime check error more understandable If device /dev/kvm does not exist, kata-runtime check would fail with an ambiguous error message 'no such file or directory'. I added a few more details to make it understandable, and it will look like: ``` ERRO[0000] cannot open kvm device: no such file or directory arch=arm64 check-type=full device=/dev/kvm name=kata-runtime pid=2849085 source=runtime ERRO[0000] no such file or directory arch=arm64 name=kata-runtime pid=2849085 source=runtime no such file or directory ``` Signed-off-by: cncal <flycalvin@qq.com>	2024-05-03 08:29:08 +08:00
Zvonko Kaiser	e5e0983b56	Merge pull request #9476 from zvonkok/nvidia-config-tomls config: Add NVIDIA GPU SNP, TDX configuration files	2024-05-02 10:27:10 +02:00
Fabiano Fidêncio	f04a7a55ed	Merge pull request #9563 from fidencio/topic/agent-use-policy-by-default build: Build the shipped agent with policy enabled	2024-05-01 12:22:05 +02:00
Fabiano Fidêncio	33a8701904	Merge pull request #9573 from littlejawa/kata_deploy_crio_conf kata-deploy: configure debugging for crio	2024-05-01 12:19:10 +02:00
Julien Ropé	c2aed995b7	kata-deploy: configure debugging for crio Fix the configuration for crio's log_level Fixes: #9556 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-04-30 17:48:43 +02:00
stevenhorsman	3c2232d898	runtime: fix testVersionString logic - The testVersionString logic uses a regex to check that the ociVersion is displayed correctly, but with the new go module that version has a `+` in it, so we need to quote it to escape special characters Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-30 10:54:49 +01:00
dependabot[bot]	391bc35805	build(deps): bump the go_modules group across 5 directories with 8 updates Bumps the go_modules group with 2 updates in the /src/runtime directory: [github.com/containerd/containerd](https://github.com/containerd/containerd) and [github.com/containers/podman/v4](https://github.com/containers/podman). Bumps the go_modules group with 4 updates in the /src/tools/csi-kata-directvolume directory: [golang.org/x/sys](https://github.com/golang/sys), google.golang.org/protobuf, [golang.org/x/net](https://github.com/golang/net) and [google.golang.org/grpc](https://github.com/grpc/grpc-go). Bumps the go_modules group with 2 updates in the /src/tools/log-parser directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3. Bumps the go_modules group with 2 updates in the /tests directory: [golang.org/x/sys](https://github.com/golang/sys) and gopkg.in/yaml.v3. Bumps the go_modules group with 2 updates in the /tools/testing/kata-webhook directory: [golang.org/x/sys](https://github.com/golang/sys) and [golang.org/x/net](https://github.com/golang/net). 
Updates `github.com/containerd/containerd` from 1.7.2 to 1.7.11 - [Release notes](https://github.com/containerd/containerd/releases) - [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md) - [Commits](https://github.com/containerd/containerd/compare/v1.7.2...v1.7.11) Updates `github.com/containers/podman/v4` from 4.2.0 to 4.9.4 - [Release notes](https://github.com/containers/podman/releases) - [Changelog](https://github.com/containers/podman/blob/v4.9.4/RELEASE_NOTES.md) - [Commits](https://github.com/containers/podman/compare/v4.2.0...v4.9.4) Updates `google.golang.org/protobuf` from 1.29.1 to 1.33.0 Updates `github.com/cyphar/filepath-securejoin` from 0.2.3 to 0.2.4 - [Release notes](https://github.com/cyphar/filepath-securejoin/releases) - [Commits](https://github.com/cyphar/filepath-securejoin/compare/v0.2.3...v0.2.4) Updates `golang.org/x/sys` from 0.15.0 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `google.golang.org/protobuf` from 1.31.0 to 1.33.0 Updates `golang.org/x/net` from 0.19.0 to 0.23.0 - [Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0) Updates `google.golang.org/grpc` from 1.59.0 to 1.63.2 - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](https://github.com/grpc/grpc-go/compare/v1.59.0...v1.63.2) Updates `golang.org/x/sys` from 0.0.0-20191026070338-33540a1f6037 to 0.1.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `gopkg.in/yaml.v3` from 3.0.0-20200313102051-9f266ea9e77c to 3.0.0 Updates `golang.org/x/sys` from 0.0.0-20220429233432-b5fbb4746d32 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `gopkg.in/yaml.v3` from 3.0.0-20210107192922-496545a6307b to 3.0.0 Updates `golang.org/x/sys` from 0.15.0 to 0.19.0 - [Commits](https://github.com/golang/sys/compare/v0.15.0...v0.19.0) Updates `golang.org/x/net` from 0.19.0 to 0.23.0 - 
[Commits](https://github.com/golang/net/compare/v0.19.0...v0.23.0) --- updated-dependencies: - dependency-name: github.com/containerd/containerd dependency-type: direct:production dependency-group: go_modules - dependency-name: github.com/containers/podman/v4 dependency-type: direct:production dependency-group: go_modules - dependency-name: google.golang.org/protobuf dependency-type: direct:production dependency-group: go_modules - dependency-name: github.com/cyphar/filepath-securejoin dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: google.golang.org/protobuf dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: direct:production dependency-group: go_modules - dependency-name: google.golang.org/grpc dependency-type: direct:production dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: gopkg.in/yaml.v3 dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: gopkg.in/yaml.v3 dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/sys dependency-type: indirect dependency-group: go_modules - dependency-name: golang.org/x/net dependency-type: indirect dependency-group: go_modules ... Signed-off-by: dependabot[bot] <support@github.com>	2024-04-30 09:46:13 +01:00
Wainer Moschetta	eae429a39b	Merge pull request #9552 from wainersm/kata_cc_dev runtime: new qemu-coco-dev configuration	2024-04-30 05:21:49 -03:00
Zvonko Kaiser	28078ded84	Merge pull request #9570 from stevenhorsman/dependabot-commit-check-skip workflow: static-checks: Skip commit checks for dependabout	2024-04-29 23:00:35 +02:00
Pavel Mores	1dd06cf40d	Merge pull request #9551 from pmores/support-iommu runtime-rs: support IOMMU in qemu VMs	2024-04-29 15:26:11 +02:00
stevenhorsman	0bec8721cc	workflow: Skip commit checks for dependabout Dependabot doesn't follow all our commit format guidelines, so add a check and skip these if the author is `dependabot[bot]` Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-29 13:45:51 +01:00
Wainer dos Santos Moschetta	631f6f6ed6	gha: switch CoCo tests on non-TEE to use qemu-coco-dev With the addition of the 'qemu-coco-dev' runtimeClass we no longer need to run CoCo tests on non-TEE environments with 'qemu'. As a result the tests also no longer need to set the "io.katacontainers.config.hypervisor.image" annotation to pods. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta	c6708726ff	kata-deploy: install the new kata-qemu-coco-dev runtimeclass Created the runtimeclasses/kata-qemu-coco-dev.yaml file and updated the list of SHIMS. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:11 -03:00
Wainer dos Santos Moschetta	42fb5d7760	runtime: new qemu-coco-dev configuration Created a new configuration to configure Kata for CoCo without requiring TEE hardware, so as to allow developers to implement/test/debug platform-agnostic code on their workstations. It will also ease testing of CoCo features on CI with non-TEE supported VMs. This is based off the qemu configuration. The following differences were applied: - switched to confidential guest image/initrd - switched to confidential kernel - switched to 9p shared_fs Fixes #9487 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-29 05:45:10 -03:00
Fabiano Fidêncio	d3b300ff95	build: tests: Remove agent-opa Now that the `kata-agent` is being built with policy support, let's stop building the `kata-opa-agent`, reducing the amount of things we need to test and maintain. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-28 12:52:54 +02:00
Fabiano Fidêncio	b1710ee2c0	build: Build the shipped agent with policy enabled Now that the OPA binary is not required anymore, let's start shipping the agent with the policy enabled by default. The agent without policy enabled has 30MB, while it's 34MB with the policy enabled. This 4MB (~10%) increase is, IMHO, worth it in order to reduce the amount of components we have to maintain and test, including the possibility to also reduce the amount of possible rootfs / initrd images. Whoever wants to use the agent without policy enabled can simply do that by building their own agent. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-28 12:52:54 +02:00
Fabiano Fidêncio	7b039eb1b9	Merge pull request #9559 from fidencio/topic/remove-opa-stuff rootfs: Stop building and shipping OPA	2024-04-28 12:52:07 +02:00
Fabiano Fidêncio	fe21d7a58b	rootfs: Stop building and shipping OPA Since OPA binary was replaced by the regorus crate, we can finally stop building and shipping the binary. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-26 18:51:28 +02:00
Fabiano Fidêncio	7dd2fde22d	Revert "rootfs: Make OPA build working in docker for s390x and ppc64le" This reverts commit `d523e865c0`, as we will not depend on the OPA binary anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-26 18:51:27 +02:00
Hyounggyu Choi	62bad976e0	Merge pull request #9562 from BbolroC/bump-golang build: Update golang version to 1.22.2	2024-04-26 17:58:04 +02:00
Steve Horsman	34a1cdc5c7	Merge pull request #9528 from cncal/patch-1 doc: fix missing document link	2024-04-26 15:22:15 +01:00
Hyounggyu Choi	80cb4a6c18	build: Update golang version to 1.22.2 As we have an issue with a golang version for `run-cri-containerd`, it is required to bump the language. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-26 15:50:29 +02:00
Pavel Mores	908ec31d9b	runtime-rs: fix iommu_platform support for qemu vhost-user-fs device iommu_platform support was already added on initial DeviceVhostUserFs introduction, however it incorrectly enabled iommu_platform also on non-CCW (e.g. PCI) systems. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	174fc8f44b	runtime-rs: support iommu_platform for qemu virtio-net device Note that it's only supported on CCW systems. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	0d038f20cc	runtime-rs: support iommu_platform for qemu virtio-serial device iommu_platform is only turned on for CCW systems. PartialEq is added to VirtioBusType to enable the '==' operator. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	66a2dc48ae	runtime-rs: support iommu_platform for qemu vhost-vsock device iommu_platform addition is controlled solely by the configuration file. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	d1e6f9cc4e	runtime-rs: add IOMMU to qemu VM if configured The adding itself is done by a new function add_iommu() that conforms with the add_() convention. Note though that this function is called internally, by the QemuCmdLine constructor, simply because there's nothing to trigger its invocation from QemuInner (unlike the other add_() functions so far). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:48:00 +02:00
Pavel Mores	0859f47a17	runtime-rs: add representation of '-device intel-iommu' to qemu-rs Following the golang shim example, the values are hardcoded. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:47:51 +02:00
Pavel Mores	702bf0d35e	runtime-rs: support qemu machine's 'kernel_irqchip' param We will want to set kernel_irqchip when enabling IOMMU and this commit adds the requisite support. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-04-26 14:42:54 +02:00
Alex Lyn	f72c6ba814	Merge pull request #9519 from emanuellima1/impl-rtc runtime-rs: Add RTC to QEMU cmdline	2024-04-26 17:44:47 +08:00
Dan Mihai	b42ddaf15f	Merge pull request #9530 from microsoft/saulparedes/improve_caching genpolicy: changing caching so the tool can run concurrently with itself	2024-04-25 13:06:23 -07:00
David Esparza	ae317a319f	Merge pull request #9549 from JakubLedworowski/fix-tarball-dockerfile build: Fix tarball not building correctly in docker	2024-04-25 09:40:20 -06:00
James O. D. Hunt	5bd614530f	Merge pull request #9525 from jodh-intel/gha-k8s-ch-dm gha: Enable k8s tests for cloud hypervisor with devicemapper	2024-04-25 09:28:09 +01:00
Fabiano Fidêncio	b4360e7e37	Merge pull request #9510 from microsoft/danmihai1/regorus-policy2 agent: use regorus instead of opa	2024-04-24 21:40:29 +02:00
James O. D. Hunt	ff7349b6f0	gha: Enable k8s tests for cloud hypervisor with devicemapper Enable the k8s tests for cloud hypervisor with devicemapper. Fixes: #9221. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-24 16:32:51 +01:00
Dan Mihai	2400a4d249	Merge pull request #9428 from arc9693/archana1/genplicyfixes genpolicy: implement default methods for K8sResource trait	2024-04-24 08:04:19 -07:00
Dan Mihai	ff385eac41	agent: remove unnecessary comment Remove reminder to initialize Policy earlier, because currently there are no plans to initialize earlier. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-24 14:53:51 +00:00
Jakub Ledworowski	73366da9f9	build: Fix tarball not building correctly in docker When docker is installed on the host system using the script from https://get.docker.com/, it automatically creates a docker group with gid=999. Then, during the docker build process of a tarball, e.g. make qemu-tdx-experimental-tarball, docker is also installed inside the image with the same script, which also automatically adds a docker group with gid=999. The build then tries to add a new group docker_on_host with gid=999, which already exists, and this breaks the build. Signed-off-by: Jakub Ledworowski <jakub.ledworowski@intel.com>	2024-04-24 15:35:36 +02:00
Calvin Liu	56a73ee704	doc: fix missing document link The document section hardware-requirements is located in /README.md for now. Signed-off-by: Calvin Liu <flycalvin@qq.com>	2024-04-24 17:34:30 +08:00
Fabiano Fidêncio	4e35f11a3d	Merge pull request #9535 from fidencio/topic/fix-crio-debug-drop-in kata-deploy: Stop append `log_level = "debug"` for CRI-O	2024-04-24 10:03:36 +02:00
Dan Mihai	89c85dfe84	Merge pull request #9432 from UiPath/fix-clh-wait clh: isClhRunning waits for full timeout when clh exits	2024-04-23 13:02:45 -07:00
Hyounggyu Choi	608df9b7df	Merge pull request #9494 from BbolroC/guest-pull-gha-s390x CC: Enable guest-pull tests on non-TEE for s390x	2024-04-23 21:22:37 +02:00
Dan Mihai	e5c3f5fa9b	tests: no generated policy for untested platforms Avoid auto-generating Policy on platforms that haven't been tested yet with auto-generated Policy. Support for auto-generated Policy on these additional platforms is coming up in future PRs, so the tests being fixed here were prematurely enabled. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-23 16:07:03 +00:00
Emanuel Lima	2bc5e3c6e2	runtime-rs: Add RTC to QEMU cmdline Add RTC by hardcoding the options base=utc,driftfix=slew,clock=host Signed-off-by: Emanuel Lima <emlima@redhat.com>	2024-04-23 10:46:30 -03:00
Fabiano Fidêncio	d190c9d4d9	kata-deploy: Stop append `log_level = "debug"` for CRI-O This should only be done once, and if CRI-O restarts, there's a big chance kata-deploy will also restart and the user would end up with a file that looks like: ``` [crio] log_level = "debug" [crio] log_level = "debug" [crio] log_level = "debug" ... ``` And that would simply cause CRI-O to not start. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-23 14:51:35 +02:00
Greg Kurz	42a79801f3	Merge pull request #9524 from littlejawa/fix_createruntime_hook_not_called runtime: Call CreateRuntime hooks at container creation time	2024-04-23 13:43:36 +02:00
Fupan Li	469c4e4f44	Merge pull request #9335 from Tim-Zhang/fix-passfd-fifo-open passfd-io: fix FIFO opening and vsock handling	2024-04-23 09:04:45 +08:00
Alex Lyn	bc2cf95e7a	Merge pull request #9517 from amshinde/update-storage-source-pciblock runtime-rs: Update storage source for pci block devices	2024-04-23 07:32:36 +08:00
Dan Mihai	5d31eb4847	agent: use regorus 0.1.4 Use regorus 0.1.4 from crates.io, instead of its source code repository. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 23:21:17 +00:00
Dan Mihai	ed6412b63c	tests: k8s: reduce the policy tests output noise Hide some of the kubectl output, to reduce the size and redundancy of this output. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:59:33 +00:00
Dan Mihai	df23eb09a6	agent: use regorus instead of opa Implement Agent Policy using the regorus crate instead of the OPA daemon. The OPA daemon will be removed from the Guest rootfs in a future PR. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:58:30 +00:00
Dan Mihai	58e608d61a	tests: remove k8s-policy-set-keys.bats Remove k8s-policy-set-keys.bats in preparation for using the regorus crate instead of the OPA daemon for evaluating the Agent Policy. This test depended on sending HTTP requests to OPA. Fixes: #9388 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:49:38 +00:00
Dan Mihai	b509c1beee	agent: lock anyhow version to 1.0.58 Lock anyhow version to 1.0.58 because: - Versions between 1.0.59 - 1.0.76 have not been tested yet using Kata CI. However, those versions pass "make test" for the Kata Agent. - Versions 1.0.77 or newer fail during "make test" - see https://github.com/kata-containers/kata-containers/issues/9538. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-22 19:49:15 +00:00
Archana Shinde	cc6b671101	runtime-rs: Update storage source for pci block devices In case of block devices using virtio-block, we need to pass the pci-path as the storage source field to the agent. Currently the virt-path is being passed, which works only for mmio block devices. In the future when support is added for scsi, block-ccw and pmem devices, the storage source would need to be handled accordingly. Fixes: #9034 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-04-22 11:36:58 -07:00
Hyounggyu Choi	f10744df99	CC: Enable guest-pull tests on non-TEE for s390x This commit is to add a new CI job to run-k8s-tests-on-zvsi.yaml. Why the job is not configured in run-kata-coco-tests.yaml by having it integrated with `run-k8s-tests-coco-nontee` is: - It uses k3s instead of AKS - It runs on a self-hosted runner These differences make the integrated job not easy to read and maintain when it comes to incorporating other platforms in the near future. Fixes: #9467 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-22 17:15:20 +02:00
Greg Kurz	6ca0f09710	Merge pull request #9518 from microsoft/danmihai1/agent-cargo-lock agent: update cargo.lock	2024-04-22 13:36:06 +02:00
Tim Zhang	aeba483ec8	agent: avoid fd leakage of passfd-io In do_create_container and do_exec_process, we should create the proc_io first, so that if an error occurs later we can make sure the io stream is closed. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:33 +08:00
Tim Zhang	8441187d5e	runtime-rs: fix FIFO handling Fixes: #9334 In Linux, when a FIFO is opened and there are no writers, the reader will continuously receive the HUP event. This can be problematic. To avoid this problem, we open stdin in write mode and keep the stdin-writer. We need to open stdout/stderr in read mode and keep the read endpoint open until the process is deleted. Otherwise, the process could exit before the containerd side opens and reads the stdout fifo: runD would write all of the stdout contents into the stdout fifo and then close the write endpoint. Then containerd would open the stdout fifo and try to read; since the write side had already closed, containerd would block on the read forever. Here we keep the stdout/stderr read endpoint File in the common_process, which is destroyed when containerd sends the delete rpc call. By that time containerd has waited for the stdout read to return, so the contents of the stdout/stderr fifo are not lost. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:33 +08:00
Tim Zhang	d68eb7f0ad	agent: Fix close_stdin for passfd-io In the passfd-io scenario, we should wait for stdin to close itself instead of manually intervening in it. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-22 17:39:32 +08:00
Steve Horsman	ff9985fc50	Merge pull request #9490 from wainersm/port_attestation_nontee_job gha: move attestation tests to run-k8s-tests-coco-nontee	2024-04-22 10:23:11 +01:00
Archana Choudhary	4a010cf71b	genpolicy: add default implementations for K8sResource trait This commit adds default implementations for following methods of K8sResource trait: - generate_policy - serialize Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	6edc3b6b0a	genpolicy: add default implementation for use_sandbox_pidns This patch adds a default implementation for the use_sandbox_pidns and updates the structs that implement the K8sResource trait to use the default. Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	d5d3f9cda7	genpolicy: add default implementation for use_host_network - Provide default implementation for use_host_network - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	9a3eac5306	genpolicy: add default impl for get_containers - Provide default impl for get_containers - Remove default impl from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	2db3470602	genpolicy: add default impl for get_container_mounts_and_storages - Provide default impl for get_container_mounts_and_storages - Remove default impl from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:59:02 +00:00
Archana Choudhary	09b0b4c11d	genpolicy: add default implementation for get_sandbox_name - Provide default implementation for get_sandbox_name in K8sResource trait - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:55:32 +00:00
Archana Choudhary	43e9de8125	genpolicy: add default implementation for get_annotations - Provide default implementation for get_annotations. - Remove default implementation from structs implementing the trait K8sResource Fixes: #8960 Signed-off-by: Archana Choudhary <archana1@microsoft.com>	2024-04-21 12:55:32 +00:00
Saul Paredes	2149cb6502	genpolicy: changing caching so the tool can run concurrently with itself Based on 3a1461b0a5186a92afedaaea33ff2bd120d1cea0 Previously the tool would use the layers_cache folder for all instances and hence delete the cache when it was done, interfering with other instances. This change makes it so that each instance of the tool will have its own temp folder to use. Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-19 15:46:30 -07:00
Wainer dos Santos Moschetta	1e35291fd5	gha: move attestation tests to run-k8s-tests-coco-nontee The new run-k8s-tests-coco-nontee job should be the home of attestation tests. Changed run-k8s-tests-coco-nontee to get KBS installed; once the KBS variable is exported in the environment, the attestation tests will kick in (likewise, they will be skipped in run-k8s-tests-on-aks). Fixes #9455 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-19 14:51:30 -03:00
Steve Horsman	7e12d588c0	Merge pull request #9485 from sparky005/update_golang.org/x/net update golang.org/x/net	2024-04-19 11:26:13 +01:00
Amulya Meka	12964256a4	Merge pull request #9521 from Amulyam24/gha gha: tag k8s tests on ppc64le to ppc64le-runner-01	2024-04-19 15:08:08 +05:30
Julien Ropé	70e798ed35	runtime: Call CreateRuntime hooks at container creation time CreateRuntime hooks are called at the CreateSandbox time, but not after CreateContainer. Fixes: #9523 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-04-19 10:25:02 +02:00
Alex Lyn	3456483df9	Merge pull request #9513 from stevenhorsman/bump-stale-version gha: stale: Bump stalebot version	2024-04-19 15:15:10 +08:00
Alex Lyn	c147f0f4ed	Merge pull request #9516 from sprt/rlz-340 release: bump version for 3.4.0 release	2024-04-19 15:12:26 +08:00
Amulyam24	8255ed248a	gha: tag k8s tests on ppc64le to ppc64le-runner-01 This PR aims at running the k8s tests on one runner on ppc64le. Fixes: #9520 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-04-19 12:04:25 +05:30
Hyounggyu Choi	304dc1e4da	doc: Update how-to-run-kata-containers-with-SE-VMs.md This is to update a document `how-to-run-kata-containers-with-SE-VMs` on using confidential artifacts to build a secure image. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-19 08:31:12 +02:00
Hyounggyu Choi	8fbed9f6a4	local-build: Use confidential kernel and initrd for boot-image-se This is to make `boot-image-se-tarball` use confidential kernel and initrd instead of vanilla version of artifacts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-19 07:09:04 +02:00
Dan Mihai	4242801b1c	agent: update cargo.lock Update Kata Agent's Cargo.lock after the recent changes to Cargo.toml. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-18 17:12:48 +00:00
Aurélien Bombo	95971e4a42	release: bump version for 3.4.0 release Release v3.4.0. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-04-18 17:08:06 +00:00
Steve Horsman	6dd038fd58	Merge pull request #9501 from zvonkok/check-fixes kata: Remove check for "Fixes" in PR	2024-04-18 17:48:50 +01:00
Hyounggyu Choi	2b9c439fcf	Merge pull request #9508 from BbolroC/gha-s390x-k8s-label gha: Make integration tests for s390x run on s390x-large runners	2024-04-18 18:05:01 +02:00
Adil Sadik	1c5ca0c915	runtime: update golang.org/x/net Updates golang.org/x/net to a newer version that closes some reported vulnerabilities and security issues. Fixes #9486 Signed-off-by: Adil Sadik <sparky.005@gmail.com>	2024-04-18 10:55:02 -04:00
Tim Zhang	221c5b51fe	dragonball: fix EPOLLHUP/EPOLLERR events handling in vsock 1. EPOLLHUP events also need to be read, and the read will return a length of 0. 2. We should kill the connection when EPOLLERR events are received. Signed-off-by: Tim Zhang <tim@hyper.sh> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-04-18 20:47:02 +08:00
Hyounggyu Choi	49a0d57f66	gha: Make integration tests for s390x run on s390x-large runners This is to make a workflow `run-k8s-tests` and `run-cri-containerd` (s390x and zvsi) run only on the runners labeled by `s390x-large`. Fixes: #9507 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-18 14:35:24 +02:00
stevenhorsman	cf5c3dc155	gha: stale: Bump stalebot version - Bump the stalebot action version to v9 as that fixes the ``` Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/stale@v8. ``` warning. Fixes: #9512 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-18 11:41:09 +01:00
Steve Horsman	bf16b18180	Merge pull request #9503 from stevenhorsman/stale-pr-remove-date gha: stale: Remove the start-date	2024-04-18 09:36:27 +01:00
Hyounggyu Choi	566a6de594	Merge pull request #9505 from BbolroC/remove-crio-nightly-test-s390x gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x	2024-04-18 09:31:07 +02:00
Hyounggyu Choi	cc22dc33f2	Merge pull request #9489 from BbolroC/install-opa-in-docker rootfs: Make OPA build working in docker for s390x and pp…	2024-04-18 00:26:11 +02:00
Dan Mihai	5ceed689eb	Merge pull request #9492 from microsoft/danmihai1/pod-tests tests: k8s: inject agent policy failures (part 3)	2024-04-17 14:01:11 -07:00
Hyounggyu Choi	e046f5e652	gha: Remove k8s-cri-containerd-rhel9-e2e-tests for s390x This commit is simply to remove a CI workflow `k8s-cri-containerd-rhel9-e2e-tests`. Fixes: #9504 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-17 15:36:42 +02:00
Zvonko Kaiser	eda3bfe2ef	config: Add NVIDIA GPU SNP, TDX configuration files Fixes: #9475 For TDX and SNP add NVIDIA specific configuration files Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-17 12:49:13 +00:00
Wainer Moschetta	2d8e7933c5	Merge pull request #9461 from GabyCT/topic/uninstallkbs tests/k8s: Add uninstall kbs client command function	2024-04-17 09:36:37 -03:00
Zvonko Kaiser	d7b24c04e5	Merge pull request #9473 from zvonkok/gpu-image-initrd-versions version: add initrd, image NVIDIA sections	2024-04-17 13:22:05 +02:00
stevenhorsman	7235988605	gha: stale: Remove the start-date As documented in https://github.com/actions/stale?tab=readme-ov-file#start-date > The start date is used to ignore the issues and pull requests created before the start date. > Particularly useful when you wish to add this stale workflow on an existing repository > and only wish to stale the new issues and pull requests. As we don't need to treat PRs older than May 2023 as a special case, remove this option. Fixes: #9502 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-17 11:19:56 +01:00
Zvonko Kaiser	395e93acd5	kata: Remove Issue - PR dependency We've discussed this over and over. Let's try to get to an agreement here. I will use this issue to remove the mandatory Issue - PR dependency. Fixes: #9500 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-17 09:53:08 +00:00
Archana Shinde	af3b19ed18	Merge pull request #9084 from amshinde/document-intel-gpu-vfio docs: Document Intel Discrete GPUs usage with Kata	2024-04-16 16:17:03 -07:00
Archana Shinde	973a15332a	spell-check: Add missing words to spell-check Add missing words to spell-check dictionaries Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Archana Shinde	6f97dc1f60	static-checks: Rename file in doc to make static checks happy Configuration file for qemu with runtime-rs was recently renamed. Doc contains name for old file. This was somehow not caught in the CI earlier. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Archana Shinde	87f0097b18	docs: Document Intel Discrete GPUs usage with Kata Document describes the steps needed to pass an entire Intel Discrete GPU as well as a GPU SR-IOV interface to a Kata Container. Fixes: #9083 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-04-16 11:50:02 -07:00
Dan Mihai	2c4d1ef76b	tests: k8s: inject agent policy failures (part 3) Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Pods. Additional policy failures are injected during CI using other types of K8s resources - e.g., using Jobs and Replication Controllers - from separate PRs. Fixes: #9491 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-16 18:15:12 +00:00
Dan Mihai	c26dad8fe5	Merge pull request #9294 from burgerdev/burgerdev/genpolicy-configurable-pause genpolicy: support insecure registries and custom pause containers	2024-04-16 09:39:33 -07:00
GabyCT	9238daf729	Merge pull request #9464 from microsoft/danmihai1/rc-tests tests: k8s: inject agent policy failures (part2)	2024-04-16 10:01:39 -06:00
Hyounggyu Choi	d523e865c0	rootfs: Make OPA build working in docker for s390x and ppc64le This commit makes the OPA build from source work in `ubuntu-rootfs-osbuilder`. To achieve this, the configuration is changed as follows: - Switch the make target to `ci-build-linux-static`, which does not trigger a docker-in-docker build - Install go in the builder image for s390x and ppc64le Fixes: #9466 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-16 16:49:12 +02:00
Greg Kurz	aca6a1bcb5	Merge pull request #9353 from pmores/pr-8866-follow-up runtime-rs: refactor qemu driver	2024-04-16 16:07:36 +02:00
Fabiano Fidêncio	7bb5490676	Merge pull request #9479 from wainersm/fix_coco_nontee_jobs gha: make run-kata-coco-tests inherit secrets	2024-04-16 13:46:52 +02:00
Hyounggyu Choi	7b11fd2546	Merge pull request #9471 from BbolroC/coco-kernel-version-s390x version: Add coco name and version for {image,initrd} for s390x	2024-04-15 16:03:20 +02:00
Wainer dos Santos Moschetta	77541008fc	gha: make run-kata-coco-tests inherit secrets The new CoCo non-tee job introduced in commit `0d5399ba92` needs to read secrets like AZ_TENANT_ID, so the run-kata-coco-tests workflow should inherit the secrets from the caller workflow. Fixes #9477 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-15 10:53:44 -03:00
Zvonko Kaiser	78e3ebb011	version: add initrd, image NVIDIA sections Fixes: #9472 For initrd and image, the related NVIDIA will not use the default targets and we will pin them to a specific release. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-04-15 13:31:35 +00:00
Wainer Moschetta	c85e1ca674	Merge pull request #9404 from ldoktor/ci-mcp-timeout ci.ocp: Increase the MCP update time	2024-04-15 09:42:14 -03:00
Hyounggyu Choi	3ec209dcf1	Merge pull request #9469 from BbolroC/coco-kernel-config-s390x kernel: Adjust s390x config for confidential containers	2024-04-15 13:55:28 +02:00
Hyounggyu Choi	8fce600493	version: Add coco name and version for {image,initrd} for s390x In order to build a coco {image,initrd}, it is required to specify its name and version in versions.yaml. This commit is to add the configuration for them, respectively. Fixes: #9470 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-15 12:53:00 +02:00
Hyounggyu Choi	a792dc3e2b	kernel: Adjust s390x config for confidential containers `CONFIG_TN3270_TTY` and `CONFIG_S390_AP_IOMMU` are dropped for s390x in 6.7.x which is used for a confidential kernel. But they are still used for a vanilla kernel. So we need to add them to the whitelist. Fixes: #9465 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-15 10:28:59 +02:00
Hyounggyu Choi	32f58abfde	Merge pull request #9403 from BbolroC/runtime-rs-ci-qemu CI: Enable GHA cri-containerd workflow for runtime-rs with QEMU	2024-04-15 09:31:25 +02:00
Xuewei Niu	402d8a968e	Merge pull request #9430 from UiPath/fix-agent-shutdown agent: shutdown vm on exit when agent is used as init process	2024-04-15 10:47:07 +08:00
Wainer Moschetta	0a04f54a8e	Merge pull request #9454 from GabyCT/topic/pulltype gha: Define unbound PULL TYPE variable	2024-04-12 14:48:56 -03:00
Wainer Moschetta	a0b21d0e14	Merge pull request #9424 from wainersm/cc_guest_pull-encrypted CC: run guest-pull tests on non-TEE jobs	2024-04-12 09:34:35 -03:00
Hyounggyu Choi	cf20a6a4ae	gha: Add qemu-runtime-rs to VMM matrix for run-cri-containerd This commit expands the VMM matrix for run-cri-containerd, adding a new item `qemu-runtime-rs` for a test scenario where the VMM is QEMU and runtime-rs is employed. This expansion affects the workflows for both x86_64 and s390x platforms. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Hyounggyu Choi	606f8e1ab2	runtime-rs: Adjust configuration for qemu-runtime-rs To make `qemu-runtime-rs` working for CI, we have to rename a configuration template file and `CONFIG_FILE_QEMU` in Makefile. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Hyounggyu Choi	3c217c6c15	ci\|cri-containerd: Introduce qemu-runtime-rs for KATA_HYPERVISOR `qemu-runtime-rs` will be utilized to handle a test scenario where the VMM is QEMU and runtime-rs is employed. Note: Some of the tests are skipped. They are going to be reintegrated in the follow-up PR (Check out #9375). Fixes: #9371 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-12 12:25:53 +02:00
Alexandru Matei	9e01732f7a	agent: shutdown vm on exit when agent is used as init process Linux kernel generates a panic when the init process exits. The kernel is booted with panic=1, hence this leads to a vm reboot. When used as a service the kata-agent service has an ExecStop option which does a full sync and shuts down the vm. This patch mimics this behavior when kata-agent is used as the init process. Fixes: #9429 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-04-12 11:32:31 +03:00
Alexandru Matei	54923164b5	clh: isClhRunning waits for full timeout when clh exits isClhRunning uses signal 0 to test whether the process is still alive or not. This doesn't work because the process is a direct child of the shim. Once it is dead the process becomes a zombie. Since no one waits for it, the process lingers until its parent dies and init reaps it. Hence sending signal 0 in isClhRunning will always return success whether the process is dead or not. This patch calls wait to reap the process; if it succeeds that means it is our child process, if not we send the signal. Fixes: #9431 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-04-12 11:31:53 +03:00
Dan Mihai	e51cbdcff9	tests: k8s: inject agent policy failures (part2) Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Replication Controllers. Additional policy failures will be injected using other types of K8s resources - e.g., using Pods and/or Jobs - in separate PRs. Fixes: #9463 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-11 21:08:53 +00:00
Markus Rudy	77540503f9	genpolicy: add support for insecure registries genpolicy is a handy tool to use in CI systems, to prepare workloads before applying them to the Kubernetes API server. However, many modern build systems like Bazel or Nix restrict network access, and rightfully so, so any registry interaction must take place on localhost. Configuring certificates for localhost is tricky at best, and since there are no privacy concerns for localhost traffic, genpolicy should allow contacting some registries insecurely. As this is a runtime environment detail, not a target environment detail, configuring insecure registries does not belong in the JSON settings, so it's implemented as command line flags. Fixes: #9008 Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 22:29:03 +02:00
Wainer dos Santos Moschetta	4f74617897	tests: pass --overwrite-existing to aks get-credentials By passing --overwrite-existing to `aks get-credentials` it will stop asking if I want to overwrite the existing credentials. This is handy for running the scripts locally. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	3508f3a43a	tests/k8s: use CoCo image on guest-pull when non-TEE When running on non-TEE environments (e.g. KATA_HYPERVISOR=qemu) the tests should be stressing the CoCo image (/opt/kata/share/kata-containers/kata-containers-confidential.img) although currently the default image/initrd is built to be able to do guest-pull as well. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	c24f13431d	tests/k8s: enable guest-pull tests on non-TEE Enabled guest-pull tests on non-TEE environment. It now requires the SNAPSHOTTER environment variable to avoid it running on jobs where nydus-snapshotter is not installed. Fixes: #9410 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Wainer dos Santos Moschetta	0d5399ba92	gha: Create CoCo tests jobs on non-TEE Created the new run-k8s-tests-coco-nontee jobs for running CoCo tests on non-TEE. It currently generates the run-k8s-tests-coco-nontee(qemu, nydus, guest-pull) job only to run the guest-pull tests. Fixes: #9410 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-11 15:31:40 -03:00
Gabriela Cervantes	5420595d03	tests/k8s: Add uninstall kbs client command function This PR adds a function to uninstall the kbs client command, especially for when we are running on baremetal devices. Fixes #9460 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-11 17:06:11 +00:00
Steve Horsman	6b2d655857	Merge pull request #9457 from justxuewei/fs_manager_tests agent: Fix the issue with the "test_new_fs_manager" test	2024-04-11 17:02:58 +01:00
Fabiano Fidêncio	5611233ed8	Merge pull request #9439 from microsoft/danmihai1/job-tests tests: k8s: inject agent policy failures	2024-04-11 17:21:54 +02:00
Markus Rudy	bc2292bc27	genpolicy: make pause container image configurable CRIs don't always use a pause container, but even if they do the concrete container choice is not specified. Even if the CRI config can be tweaked, it's not guaranteed that registries in the public internet can be reached. To be portable across CRI implementations and configurations, the genpolicy user needs to be able to configure the container the tool should append to the policy. Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 16:26:35 +02:00
Markus Rudy	8b30fa103f	genpolicy: parse json settings during config init Decouple initialization of the Settings struct from creating the AgentPolicy struct, so that the settings are available for evaluating, extending or overriding command line arguments. Signed-off-by: Markus Rudy <webmaster@burgerdev.de>	2024-04-11 16:17:33 +02:00
Xuewei Niu	50f78ec52c	agent: Fix the issue with the "test_new_fs_manager" test This patch introduces a one-time cpath to mitigate the cgroup residuals. It might break the device cgroup merging rules when the cgroup has children. Fixes: #9456 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-04-11 18:06:05 +08:00
GabyCT	08dcdc62de	Merge pull request #9423 from GabyCT/topic/improvecleanup tests: Improve the kbs_k8s_delete function	2024-04-10 14:28:21 -06:00
Gabriela Cervantes	4a2ee3670f	gha: Define unbound PULL TYPE variable This PR defines the PULL_TYPE variable to avoid unbound variable failures when this is being tested locally. Fixes #9453 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-10 17:16:19 +00:00
GabyCT	dab837d71d	Merge pull request #9450 from GabyCT/topic/fixinnydus gha: Fix indentation in gha run script	2024-04-10 11:07:56 -06:00
David Esparza	9e1368dbc5	Merge pull request #9391 from dborquez/add-onednn-openvino-ml-benchs add onednn and openvino ml-benchmarks	2024-04-09 19:03:00 -06:00
Dan Mihai	ea31df8bff	Merge pull request #9185 from microsoft/saulparedes/genpolicy_add_containerd_pull genpolicy: Add optional toggle to pull images using containerd	2024-04-09 12:29:19 -07:00
Gabriela Cervantes	6ebdcf8974	gha: Fix indentation in gha run script This PR fixes the indentation in the gha run script. Fixes #9449 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-09 16:37:17 +00:00
Greg Kurz	89353249fc	Merge pull request #8988 from beraldoleal/ci-docs docs: adding an initial CI documentation	2024-04-09 18:26:15 +02:00
Dan Mihai	2252490a96	tests: k8s: inject agent policy failures Auto-generate the policy and then simulate attacks from the K8s control plane by modifying the test yaml files. The policy then detects and blocks those changes. These test cases are using K8s Jobs. Additional policy failures will be injected using other types of K8s resources - e.g., using Pods and/or Replication Controllers - in future PRs. Fixes: #9406 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-09 15:36:57 +00:00
David Esparza	facf3c9364	metrics: Add onednn benchmark. This PR adds onednn test to exercise additional ML benchmarks. Onednn is an Intel-optimized library for Deep Neural Networks. Fixes: #9390 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3bde511d0d	metrics: Add openvino benchmark. This PR adds openvino test in order to exercise additional ML benchmarks. OpenVino is used to optimize and deploy deep learning models. Fixes: #9389 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	b37c5f8ba1	metrics:libs: Add HTTPS and HTTP vars to docker build. Include HTTP and HTTPS env variables in the building docker images because they are required to download packages such as Phoronix. Added a restriction that verifies that docker building images is performed as root. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3355dd9e2b	metrics:libs: Adds a function to set new kata configuration. Adds a function that receives as a single parameter the name of a valid Kata configuration file which will be established as the default kata configuration to start kata containers. Adds a second function that returns the path to the current kata configuration file. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	cb4380d1c9	metrics: common: Add function to clean the cache. The function clears the Page Cache only. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
David Esparza	3a419ba3b1	metrics: common: Add function to update kata config. Add an extra function that updates kata config to use the max num. of vcpus available and to use the available memory in the system. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-04-09 09:05:51 -06:00
Beraldo Leal	959e56525c	docs: adding an initial CI documentation This is actually a first attempt to document our CI, and all this content was based on the document created by Fabiano Fidencio (kudos to him). We are just moving the content and discussion from Google Docs to here. I used the "poetic license" to add some notes on what I believe our CI will look like in the future. Fixes #9006 Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Beraldo Leal <bleal@redhat.com>	2024-04-09 09:21:47 -04:00
Saul Paredes	51498ba99a	genpolicy: toggle containerd pull in tests - Add v1 image test case - Install protobuf-compiler in build check - Reset containerd config to default in kubernetes test if we are testing genpolicy - Update docker_credential crate - Add test that uses default pull method - Use GENPOLICY_PULL_METHOD in test Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-08 19:28:29 -07:00
Dan Mihai	f60c9eaec3	Merge pull request #9398 from microsoft/danmihai1/policy-test-cleanup tests: k8s: improve the Agent Policy tests	2024-04-08 15:37:07 -07:00
Gabriela Cervantes	fb4c359cc2	tests: Improve the kbs_k8s_delete function This PR improves the kbs_k8s_delete function to verify that the resources were properly deleted for baremetal environments. Fixes #9379 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-08 18:03:07 +00:00
Saul Paredes	c96ebf237c	genpolicy: add containerd pull method Add optional toggle to use existing containerd installation to pull and manage container images. This adds support to a wider set of images that are currently not supported by standard pull method, such as those that use v1 manifest. Fixes: #9144 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-08 09:56:59 -07:00
Greg Kurz	8b996b9307	Merge pull request #9331 from egernst/foobar katautils: check number of cores on the system instead of go runtime	2024-04-08 18:38:49 +02:00
Greg Kurz	934beb5ae4	Merge pull request #9421 from gkurz/bump-node-js-20 gha: Bump various actions to use Node.js 20	2024-04-08 18:22:28 +02:00
Wainer Moschetta	fba1d394d7	Merge pull request #9369 from ChengyuZhu6/sandbox-image agent:image: Support different pause image in the guest for guest pull	2024-04-08 11:06:21 -03:00
Steve Horsman	3242f55691	Merge pull request #8870 from LindaYu17/aa2main port attestation agent from CCv0 branch to main branch	2024-04-08 15:01:07 +01:00
James O. D. Hunt	42936cb92c	Merge pull request #9372 from jodh-intel/docs-kata-manager-update docs: kata-manager: Update with latest details	2024-04-08 13:23:23 +01:00
stevenhorsman	864e9c22ba	agent: doc: Add new config doc Document the new guest_components_rest_api config parameter Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	29a5652e31	packaging: guest-components, set new environment variables - Set KBC_PROVIDER and ATTESTER rather than TEE_PLATFORM to avoid tss build issues for vTPM attester(s) - There are future plans to make a matching TEE_PLATFORM, so this can be simplified once that is available Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	a284a20a14	tests: Filter CoCo tests on ppc64le/arm - At the moment we aren't supporting ppc64le or aarch64 for CoCo, so filter out these tests from running Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	a0c03966c2	versions: Bump guest-components - Bump guest-components to try and test compatibility with the latest version Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
stevenhorsman	101a5bf273	packaging: Update guest-components Dockerfile - Switch to Ubuntu 20.04 for building guest-components as The rootfs is based on 20.04, so we need matching GLIBC versions. See #8955 - Add dependencies needed by TDX verifier as we want to build for all platforms Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-04-08 11:38:53 +01:00
Gabriela Cervantes	6d85025e59	test/k8s: Add basic attestation test - Add basic test case to check that a running pod can use the api-server-rest (and attestation-agent and confidential-data-hub indirectly) to get a resource from a remote KBS Fixes #9057 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Co-authored-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-04-08 11:38:53 +01:00
Biao Lu	f0edec84f6	agent: Launch api-server-rest If 'rest_api' is configured, let's start the api-server-rest after the attestation-agent and the confidential-data-hub have been started. Fixes: #7555 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:38:53 +01:00
Biao lu	4d752e6350	agent: Add config for api-server-rest Add configuration for 'rest api server'. Optional configurations are 'agent.rest_api=attestation' will enable attestation api 'agent.rest_api=resource' will enable resource api 'agent.rest_api=all' will enable all (attestation and resource) api Fixes: #7555 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:06:14 +01:00
Biao Lu	f476d671ed	agent: Launch the confidential data hub Let's introduce a new method to start the confidential data hub and the attestation agent. The former depends on the latter, and it needs to be started before the RPC server. Starting the attestation components is based on whether the confidential containers guest components binaries are found in the rootfs. Fixes: #7544 Signed-off-by: Biao Lu <biao.lu@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-04-08 11:06:14 +01:00
Greg Kurz	be8f0cb520	Merge pull request #9402 from deagon/feat/debug-threads qemu: show the thread name when enable the hypervisor.debug option	2024-04-08 11:04:36 +02:00
Hyounggyu Choi	e39be7a45e	Merge pull request #9415 from BbolroC/fix-dir-removal-error GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure	2024-04-08 10:44:44 +02:00
ChengyuZhu6	8c897f822c	agent:image: Support different pause image in the guest for guest pull Support different pause images in the guest for guest-pull, such as k8s pause image (registry.k8s.io/pause) and openshift pause image (quay.io/bpradipt/okd-pause). Fixes: #9225 -- part III Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-04-07 09:00:10 +08:00
GabyCT	9d2c5b180e	Merge pull request #9419 from GabyCT/topic/fxlatency metrics: Improve latency test cleanup	2024-04-05 16:31:00 -06:00
Wainer Moschetta	aae7048d4f	Merge pull request #9273 from ldoktor/kcli-coco-kbs tests: Support for kbs setup on kcli	2024-04-05 18:55:58 -03:00
Fabiano Fidêncio	f09bb98f51	Merge pull request #8840 from fidencio/topic/update-tdx-artefacts-to-the-new-host-os tdx: Update TDX artefacts to be used with the Ubuntu 23.10 / CentOS 9 stream OSVs.	2024-04-05 22:36:03 +02:00
Fabiano Fidêncio	cdb8531302	hypervisor: Simplify TDX protection detection Let's rely on the kvm module 'tdx' parameter to do so. This aligns with both OSVs (Canonical, Red Hat, SUSE) and the TDX adoption (https://github.com/intel/tdx-linux) stacks. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Fabiano Fidêncio	2ee03b5dc3	tdvf: Adapt the build command This is done in order to match the example from: https://github.com/intel/tdx-linux/wiki/Instruction-to-set-up-TDX-host-and-guest#build-tdvf-image Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Fabiano Fidêncio	b7cccfa019	qemu: tdx: Adapt command line This commit is a mess, but I'm not exactly sure what's the best way to make it less messy, as we're getting QEMU TDX to work while partially reverting `1e34220c41`. With that said, let me cover the content of this commit. Firstly, we're reverting all the changes related to "memory-backend-memfd-private", as that's what was used with the previous host stack, but it seems it didn't fly upstream. Secondly, in order to get QEMU to properly work with TDX, we need to enforce the 'private=on' knob and use the "memory-backend-ram", and we're doing so, and also making sure to test the `private=on` newly added knob. I'm sorry for the confusion, I understand this is not optimal, I just don't see an easy path to do changes without leaving the code broken during those changes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 19:51:27 +02:00
Greg Kurz	424a5e243f	gha: Bump to `actions/[down\|up]load-artifact@v4` (all the rest) `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. This fixes all remaining sites. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	dbc5dc7806	gha: Bump to `actions/[down\|up]load-artifact@v4` (k8s tests on garm) `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. As explained at [1] : > The contents of an Artifact are uploaded together into an immutable > archive. They cannot be altered by subsequent jobs. Both of these > factors help reduce the possibility of accidentally corrupting > Artifact files. This means that artifacts cannot have the same name. Adapt the `run-k8s-tests-on-garm` workflow accordingly by embedding all the other `${{ vmm.* }}` fields and `${{ inputs.tag }}` in the artifact names that would otherwise collide. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	62a54ffa70	gha: Bump to `actions/[down\|up]load-artifact@v4` (kata static tarball) `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. As explained at [1] : > The contents of an Artifact are uploaded together into an immutable > archive. They cannot be altered by subsequent jobs. Both of these > factors help reduce the possibility of accidentally corrupting > Artifact files. This means that artifacts cannot have the same name. Adapt all `build-kata-static-tarball` workflows accordingly by embedding `${{ matrix.asset }}` in the artifact names that would otherwise collide. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:51 +02:00
Greg Kurz	7f2ce914a1	gha: Bump to `actions/checkout@v4` `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	0a43d26c94	gha: Bump to `docker/login-action@v3` `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	06c9c0d7db	gha: Bump to `docker/build-push-action@v5` `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	8c21844aef	gha: Bump to `docker/setup-buildx-action@v3` `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Greg Kurz	03cbe6a011	gha: Bump to `docker/setup-qemu-action@v3` `Node.js 19` is deprecated. Bump to a new version based on `Node.js 20`. Fixes #9245 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-04-05 18:36:50 +02:00
Hyounggyu Choi	4493459937	GHA: Implement secondary GITHUB_WORKSPACE cleanup on 1st failure Occasionally, the removal of GITHUB_WORKSPACE fails for self-hosted runners because one of the subdirectories is not empty. This is likely due to another process occupying the directory at the time. Implementing a secondary cleanup resolves this issue. This commit focuses on the implementation for the secondary cleanup. Fixes: #9317 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-04-05 11:41:51 +02:00
Fabiano Fidêncio	6b4cc5ea6a	Revert "qemu: tdx: Workaround SMP issue with TDX 1.5" This reverts commit `d1b54ede29`. Conflicts: src/runtime/virtcontainers/qemu.go This commit was a hack that was needed in order to get QEMU + TDX to work atop of the stack our CI was running on. As we're moving to "the officially supported by distros" host OS, we need to get rid of this. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:52 +02:00
Fabiano Fidêncio	582b5b6b19	govmm: tdx: Expose the private=on\|off knob The private=on\|off knob is required in order to properly launch a TDX guest VM. This is a brand new property that is part of the still in-flight patches adding TDX support on QEMU. Please, see: `3fdd8072da` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:52 +02:00
Fabiano Fidêncio	fe5adae5d9	qemu-tdx: Update to v8.1.0 + TDX patches Let's update the QEMU to the one that's officially maintained by Intel till all the TDX patches make their way upstream. We've had to also update python to explicitly use python3 and add python3-venv as part of the dependencies. Fixes: #8810 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-04-05 10:23:51 +02:00
Alex Lyn	0e0a361f0e	Merge pull request #8782 from Apokleos/device-increate-count bugfix and refactor device increate count	2024-04-05 13:43:49 +08:00
Dan Mihai	6f9f8ae285	Merge pull request #9413 from microsoft/saulparedes/ensure_unique_rg_in_gha gha: ensure unique resource group name	2024-04-04 17:13:09 -07:00
GabyCT	80d926c357	Merge pull request #9411 from microsoft/danmihai1/k8s-job tests: k8s-job: wait for job successful create	2024-04-04 15:14:56 -06:00
Gabriela Cervantes	8e5d401be0	metrics: Improve latency test cleanup This PR improves the latency test cleanup in order to avoid random failures of leaving the pods. Fixes #9418 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-04 20:43:53 +00:00
Saul Paredes	f20caac1c0	gha: ensure unique resource group name There's an rg name duplication situation that got introduced by #9385 where 2 different test runs might have same rg name. Add back uniqueness by including the first letter of GENPOLICY_PULL_METHOD to cluster name. Fixes: #9412 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-04 13:13:32 -07:00
GabyCT	aae2679f09	Merge pull request #9409 from GabyCT/topic/ghrunset gha: Define GH_PR_NUMBER variable in gha run k8s common script	2024-04-04 09:46:48 -06:00
Eric Ernst	da01bccd36	katautils: check number of cores on the system instead of go runtime We used to utilize go runtime's "NumCPU()", which will give the number of cores available to the Go runtime, which may be a subset of physical cores if the shim is started from within a cpuset. From the function's description: "NumCPU returns the number of logical CPUs usable by the current process." As an example, if containerd is run from within a smaller CPUset, the maximum size of a pod will be dictated by this CPUset, instead of what will be available on the rest of the system. Since the shim will be moved into its own cgroup that may have a different CPUset, let's stick with checking physical cores. This also aligns with what we have documented for maxVCPU handling. In the event we fail to read /proc/cpuinfo, let's use the goruntime. Fixes: #9327 Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2024-04-03 16:09:16 -07:00
Dan Mihai	3e72b3f360	tests: k8s-job: wait for job successful create Don't just verify SuccessfulCreate - wait for it if needed. Fixes: #9138 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 22:11:15 +00:00
Gabriela Cervantes	73f27e28d1	gha: Define GH_PR_NUMBER variable in gha run k8s common script This PR defines the GH_PR_NUMBER variable in gha run k8s common script to avoid failures like unbound variable when running locally the scripts just like the GHA CI. Fixes #9408 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-03 18:25:00 +00:00
GabyCT	c5c229b330	Merge pull request #9397 from GabyCT/topic/removeconmon versions: Remove conmon information from versions.yaml	2024-04-03 11:14:43 -06:00
GabyCT	12947b1ba6	Merge pull request #9344 from GabyCT/topic/kerneldoc docs: Remove stale kernel information	2024-04-03 11:13:54 -06:00
Dan Mihai	07c23a05f2	Merge pull request #9385 from microsoft/saulparedes/add_genpolicy_yaml_params gha: add GENPOLICY_PULL_METHOD	2024-04-03 09:20:16 -07:00
Lukáš Doktor	b8382cea88	ci.ocp: Increase the MCP update time updating the machine config takes even longer than 1200s, use 60m to be sure everything is updated. Fixes: #9338 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-04-03 15:01:29 +02:00
Alex Lyn	935a1a3b40	runtime-rs: refactor decrease_attach_count with do_decrease_count Try to reduce duplicated code in decrease_attach_count with public new function do_decrease_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:19 +08:00
Alex Lyn	4f0fab938d	runtime-rs: refactor increase_attach_count with do_increase_count Try to reduce duplicated code in increase_attach_count with public new function do_increase_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:19 +08:00
Alex Lyn	fff64f1c3e	runtime-rs: introduce dedicated function do_decrease_count Introduce a dedicated public function do_decrease_count to reduce duplicated code in drivers' decrease_attach_count. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:19:08 +08:00
Alex Lyn	5750faaf31	runtime-rs: introduce dedicated function do_increase_count Since there are many implementations of reference counting in the drivers, all of which have the same implementation, we should try to reduce such duplicated code as much as possible. Therefore, a new function is introduced to solve the problem of duplicated code. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-03 17:09:17 +08:00
Dan Mihai	f800bd86f6	tests: k8s-sandbox-vcpus-allocation.bats policy Use the "allow all" policy for k8s-sandbox-vcpus-allocation.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:33 +00:00
Dan Mihai	4211d93b87	tests: k8s-nginx-connectivity.bats policy Use the "allow all" policy for k8s-nginx-connectivity.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:26 +00:00
Dan Mihai	5dcf64ef34	tests: k8s-volume.bats allow all policy Use the "allow all" policy for k8s-volume.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:18 +00:00
Dan Mihai	04085d8442	tests: k8s-sysctls.bats allow all policy Use the "allow all" policy for k8s-sysctls.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:10 +00:00
Dan Mihai	839993f245	tests: k8s-security-context.bats allow all policy Use the "allow all" policy for k8s-security-context.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:01:03 +00:00
Dan Mihai	02a050b47e	tests: k8s-seccomp.bats allow all policy Use the "allow all" policy for k8s-seccomp.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:56 +00:00
Dan Mihai	543e40b80c	tests: k8s-projected-volume.bats allow all policy Use the "allow all" policy for k8s-projected-volume.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:47 +00:00
Dan Mihai	3f94e2ee1b	tests: k8s-pod-quota.bats allow all policy Use the "allow all" policy for k8s-pod-quota.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:37 +00:00
Dan Mihai	ba23758a42	tests: k8s-optional-empty-secret.bats policy Use the "allow all" policy for k8s-optional-empty-secret.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:30 +00:00
Dan Mihai	e4ff6b1d91	tests: k8s-measured-rootfs.bats allow all policy Use the "allow all" policy for k8s-measured-rootfs.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:23 +00:00
Dan Mihai	2821326a7e	tests: k8s-liveness-probes.bats allow all policy Use the "allow all" policy for k8s-liveness-probes.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:15 +00:00
Dan Mihai	9af3e4cc4a	tests: k8s-inotify.bats allow all policy Use the "allow all" policy for k8s-inotify.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:08 +00:00
Dan Mihai	bd45e948cc	tests: k8s-guest-pull-image.bats policy Use the "allow all" policy for k8s-guest-pull-image.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 03:00:00 +00:00
Dan Mihai	be3797ef7c	tests: k8s-footloose.bats allow all policy Use the "allow all" policy for k8s-footloose.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:50 +00:00
Dan Mihai	18f5e55667	tests: k8s-empty-dirs.bats allow all policy Use the "allow all" policy for k8s-empty-dirs.bats, instead of relying on the Kata Guest image to use the same policy as its default. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:44 +00:00
Dan Mihai	ef22bd8a2b	tests: k8s: replace run_policy_specific_tests Check from: - k8s-exec-rejected.bats - k8s-policy-set-keys.bats if policy testing is enabled or not, to reduce the complexity of run_kubernetes_tests.sh. After these changes, there are no policy specific commands left in run_kubernetes_tests.sh. add_allow_all_policy_to_yaml() is moving out of run_kubernetes_tests.sh too, but it is not used yet. It will be used in future commits. Fixes: #9395 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-03 02:59:28 +00:00
Guoqiang Ding	cd0c31e185	qemu: show the thread name when enable the hypervisor.debug option Add debug-threads=on in the name argument if debug enabled. Fixes: #9400 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-04-03 10:36:52 +08:00
Saul Paredes	8a92e81f98	gha: add GENPOLICY_PULL_METHOD Add GENPOLICY_PULL_METHOD that will be used to test pulling container images in genpolicy using the oci-distribution crate and/or the containerd interface. GENPOLICY_PULL_METHOD will start being used in a future PR. Fixes: #9384 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-04-02 19:03:28 -07:00
Gabriela Cervantes	f3957352f0	versions: Remove conmon information from versions.yaml This PR removes conmon information from versions.yaml as this is no longer being used in kata containers repository. Fixes #9396 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-04-02 16:25:45 +00:00
Dan Mihai	39805822fc	tests: k8s: reduce policy testing complexity Don't add the "allow all" policy to all the test YAML files anymore. After this change, the k8s tests assume that all the Kata CI Guest rootfs image files either: - Don't support Agent Policy at all, or - Include an "allow all" default policy. This reliance/assumption will be addressed in a future commit. Fixes: #9395 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-04-02 16:18:31 +00:00
Alex Lyn	7795f9c016	Merge pull request #9365 from GabyCT/topic/removerunc versions: Remove runc version information	2024-04-02 09:21:56 +08:00
Alex Lyn	fa8049af6c	Merge pull request #9383 from Apokleos/unified-cgrp-cmdline kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy	2024-04-02 09:08:04 +08:00
Alex Lyn	07bfdf4a22	Merge pull request #9275 from Apokleos/swap-hooks-bindmnt kata-agent: Change order of guest hook and bind mount processing	2024-04-02 07:40:10 +08:00
Alex Lyn	c88014834b	kata-agent: enabling cgroups-v2 by systemd.unified_cgroup_hierarchy To have systemd mount cgroups-v2 by default during system boot, we must add the systemd.unified_cgroup_hierarchy=1 parameter to the kernel cmdline, which will be passed by kernel_params in configuration.toml. To enable cgroup-v2, just add systemd.unified_cgroup_hierarchy=true[1] to kernel_params. Fixes: #9336 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-01 18:45:12 +08:00
alex.lyn	548f252bc4	runtime-rs: bugfix incorrect use of refcount before vfio attach When there's a pod with multiple containers, there may be case that attach point more than 2, we should not return Err in that case when we are doing attach ops, but just return Ok. Fixes: #8738 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-04-01 11:28:57 +08:00
Alex Lyn	aa9cd232cd	Merge pull request #9358 from GabyCT/topic/nerdrandom gha: Update journal log names for nerdctl artifacts	2024-04-01 09:50:16 +08:00
Alex Lyn	dfa8832406	Merge pull request #9345 from c3d/bug/9342-agent-test-errors agent: Fix errors in `make check`	2024-04-01 09:48:44 +08:00
Dan Mihai	3a7dbcfc17	Merge pull request #9367 from microsoft/danmihai1/infinite-io-stream-copy-loop runtime: remove stream copy infinite loop	2024-03-29 09:37:44 -07:00
Dan Mihai	600f9266f3	runtime: remove stream copy infinite loop This reverts commit `1c5693be86`. Avoid apparent infinite loop when ReadStreamRequest is blocked by policy - for some of the pods. When running the k8s-limit-range.bats test with Policy enabled, the Shim + VMM never get terminated on my cluster. Not sure why the sandbox clean-up works better for other tests, but the k8s-limit-range test pod gets stuck in an infinite loop: stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... stdout io stream copy error happens: error = %wrpc error: code = PermissionDenied desc = \"ReadStreamRequest is blocked by policy ... policy check: ReadStreamRequest ... Fixes: #9380 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-28 22:43:28 +00:00
James O. D. Hunt	13966f4d1d	docs: kata-manager: Add help for permissions issue The 3.3.0 release installs the `kata-manager` script with overly restrictive permissions (see #9373), so add details to help users handle the situation. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	5589e4e291	docs: kata-manager: Update with latest details Now that v3.3.0 has been released, simplify the `kata-manager` documentation. Fixes: #9227. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:22:10 +00:00
James O. D. Hunt	52fe60c94b	docs: kata-manager: Fix heading levels Add an extra heading indent so that there is only a single top-level heading. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-28 16:21:31 +00:00
Dan Mihai	ebb26edf42	Merge pull request #9347 from microsoft/danmihai1/reduce-exec-test-policy-prints genpolicy: reduce policy debug prints	2024-03-27 15:12:10 -07:00
Gabriela Cervantes	a32418bf32	versions: Remove runc version information This PR removes the runc version information as this is no longer being used in the kata containers scripts. Fixes #9364 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-27 20:32:38 +00:00
Steve Horsman	b3acbe0b7f	Merge pull request #8046 from fitzthum/clean-config runtime: remove unimplemented CoCo configurations	2024-03-27 19:39:48 +00:00
Tobin Feldman-Fitzthum	04d021bd12	packaging: remove SERVICEOFFLOAD option Since we're removing the unused service_offload parameter, don't set it in any of the packaging scripts. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	9856fe5bea	runtime: remove ServiceOffload parameter Since we no longer use the service_offload configuration, remove the ServiceOffload field from the image struct. Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:13 -05:00
Tobin Feldman-Fitzthum	a18c7ca307	runtime: remove unimplemented CoCo configurations These experimental options were added 2 years ago in anticipation of features that would be added in CoCo. These do not match the features that were eventually added and will soon be ported to main. Fixes: #8047 Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>	2024-03-27 12:21:06 -05:00
Steve Horsman	53fa1fd82d	Merge pull request #9349 from fidencio/topic/ci-k8s-update-cpuid k8s: confidential: Update cpuid to its latest release	2024-03-27 16:57:36 +00:00
Chengyu Zhu	e66a5cb54d	Merge pull request #9332 from ChengyuZhu6/guest-pull-timeout Support to set timeout to pull large image in guest	2024-03-28 00:34:08 +08:00
Christophe de Dinechin	82c4079fd0	agent: Remove useless loop This is the report from `make check`: ``` error: this loop never actually loops --> src/signal.rs:147:9 \| 147 \| / loop { 148 \| \| select! { 149 \| \| _ = handle => { 150 \| \| println!("INFO: task completed"); ... \| 156 \| \| } 157 \| \| } \| \|_________^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop = note: `#[deny(clippy::never_loop)]` on by default ``` There is only one option: you get something or a timeout. You never retry, so the report is correct. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	df5c88cdf0	agent: Remove lint error about `.flatten` running forever The lint report is the following: ``` error: `flatten()` will run forever if the iterator repeatedly produces an `Err` --> src/rpc.rs:1754:10 \| 1754 \| .flatten() \| ^^^^^^^^^ help: replace with: `map_while(Result::ok)` \| note: this expression returning a `std::io::Lines` may produce an infinite number of `Err` in case of a read error --> src/rpc.rs:1752:5 \| 1752 \| / reader 1753 \| \| .lines() \| \|________________^ = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#lines_filter_map_ok = note: `-D clippy::lines-filter-map-ok` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::lines_filter_map_ok)]` ``` This commit simply applies the suggestion. Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Christophe de Dinechin	bfb55312be	agent: Fix `.enumerate` errors during `make check` Running `make check` in the `src/agent` directory gives: ``` error: you seem to use `.enumerate()` and immediately discard the index --> rustjail/src/mount.rs:572:27 \| 572 \| for (_index, line) in reader.lines().enumerate() { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_enumerate_index = note: `-D clippy::unused-enumerate-index` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::unused_enumerate_index)]` help: remove the `.enumerate()` call \| 572 \| for line in reader.lines() { \| ~~~~ ~~~~~~~~~~~~~~ Checking tokio-native-tls v0.3.1 Checking hyper-tls v0.5.0 Checking reqwest v0.11.18 error: could not compile `rustjail` (lib) due to 1 previous error warning: build failed, waiting for other jobs to finish... make: *** [../../utils.mk:177: standard_rust_check] Error 101 ``` Fixes: #9342 Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2024-03-27 17:03:44 +01:00
Greg Kurz	e1068da1a0	Merge pull request #9326 from gkurz/draft-release Only tag and publish the release when it is fully ready	2024-03-27 15:59:59 +01:00
ChengyuZhu6	c50d3ebacc	tests:k8s: Add a test to pull large images in the guest Add a test to pull large images in the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	8551ee9533	how-to: add createcontainer timeout to sandbox config documentation add createcontainer timeout annotation to sandbox config documentation. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:44 +08:00
ChengyuZhu6	c2dc13ebaa	runtime: support to configure CreateContainer Timeout in configurations support to configure CreateContainerRequestTimeout in the configurations. e.g.: [runtime] ... create_container_timeout = 300 Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 21:58:41 +08:00
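The rule in the note above (the effective timeout is the lesser of the two values) can be sketched as follows. The function name `effective_guest_pull_timeout` and its parameters are assumptions made for illustration; they are not identifiers from the Kata runtime.

```rust
use std::time::Duration;

/// Hypothetical helper: the timeout that actually applies to a guest pull is
/// the smaller of the kubelet's runtime-request-timeout and Kata's
/// create_container_timeout.
fn effective_guest_pull_timeout(
    runtime_request_timeout: Duration,
    create_container_timeout: Duration,
) -> Duration {
    // `Duration` implements `Ord`, so `min` picks the shorter of the two.
    runtime_request_timeout.min(create_container_timeout)
}
```

Under this sketch, raising `create_container_timeout` to 300 seconds has no effect unless the kubelet's `runtime-request-timeout` is at least as large.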
Chengyu Zhu	87fc17d4d2	Merge pull request #9341 from ChengyuZhu6/guest-pull-doc docs: Add documents for kata guest image management	2024-03-27 21:20:22 +08:00
ChengyuZhu6	95b2f7f129	how-to: Add a document for kata guest image management usage Add a document for kata guest image management usage. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 20:09:37 +08:00
Greg Kurz	693c9487d4	docs: Adjust release documentation Most of the content of `docs/Stable-Branch-Strategy.md` got de-facto deprecated by the re-design of the release process described in #9064. Remove this file and all its references in the repo. The `## Versioning` section has some useful information though. It is moved to `docs/Release-Process.md`. The documentation of the `PATCH` field is adapted according to new workflow. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-27 12:41:48 +01:00
Steve Horsman	45aba769c0	Merge pull request #9346 from cmaf/ci-remove-repo-docs Remove additional links to tests directory	2024-03-27 11:13:32 +00:00
Steve Horsman	a1a615a7c8	Merge pull request #9356 from stevenhorsman/agent-opa-ppc64le-s390x workflows: Build agent-opa for more archs	2024-03-27 08:53:28 +00:00
ChengyuZhu6	2224f6d63f	runtime: support to configure CreateContainer timeout in annotation Support to configure CreateContainerRequestTimeout in the annotations. e.g.: annotations: "io.katacontainers.config.runtime.create_container_timeout": "300" Note: The effective timeout is determined by the lesser of two values: runtime-request-timeout from kubelet config (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#:~:text=runtime%2Drequest%2Dtimeout) and create_container_timeout. In essence, the timeout used for guest pull=runtime-request-timeout<create_container_timeout?runtime-request-timeout:create_container_timeout. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
ChengyuZhu6	39bd462431	runtime: support to set timeout for CreateContainerRequest In the situation to pull images in the guest #8484, it’s important to account for pulling large images. Presently, the image pull process in the guest hinges on `CreateContainerRequest`, which defaults to a 60-second timeout. However, this duration may prove insufficient for pulling larger images, such as those containing AI models. Consequently, we must devise a method to extend the timeout period for large image pull. Fixes: #8141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-27 15:44:29 +08:00
Gabriela Cervantes	a997e282be	gha: Update journal log names for nerdctl artifacts This PR updates the journal log name for nerdctl artifacts to make sure that we have different names in case we add a parallel GHA job. Fixes #9357 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-26 20:03:54 +00:00
GabyCT	c163d9f114	Merge pull request #9329 from GabyCT/topic/seun scripts: Fix unbound variables in k8s setup script	2024-03-26 11:19:33 -06:00
stevenhorsman	9aa675abb9	workflows: Build agent-opa for more archs Since https://github.com/kata-containers/kata-containers/pull/7769, we support building the OPA binary into the ppc64le and s390x arch versions of the rootfs, so build the policy enabled agent to match for those architectures too. Fixes: #9355 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-03-26 17:02:14 +00:00
Lukáš Doktor	a671b3fc6e	tests: Use full svc address to check kbs service the service might not listen on the default port, use the full service address to ensure we are talking to the right resource. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:02 +01:00
Lukáš Doktor	6b0eaca4d4	tests: Add support for nodeport ingress for the kbs setup this can be used on kcli or other systems where cluster nodes are accessible from all places where the tests are running. Fixes: #9272 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-26 16:59:00 +01:00
Greg Kurz	5009fabde4	release: Keep it draft until all artifacts have been published The automated release workflow starts with the creation of the release in GitHub. This is followed by the build and upload of the various artifacts, which can take hours. During this period, the release appears to be fully available in https://github.com/kata-containers/kata-containers/ even though it lacks all the artifacts. This might be confusing for users or automation consuming the release. Create the release as draft and clear the draft flag when all jobs are done. This ensures that the release will only be tagged and made public when it is fully usable. If some job fails because of network timeout or any other transient error, the correct action is to restart the failed jobs until they eventually all succeed. This is by far the quickest path to complete the release process. If the workflow is canceled for some reason, the draft release is left behind. A new run of the workflow will create a brand new draft release with the same name (not an issue with GitHub). The draft release from the previous run should be manually deleted. This step won't be automated as it looks safer to leave the decision to a human. [1] https://github.com/kata-containers/kata-containers/releases Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-26 14:48:05 +01:00
Pavel Mores	4c72b02e53	runtime-rs: remove the now-unused code of NetDevice The remaining code in network.rs was mostly moved to utils.rs which seems better home for these utility functions anyway (and a closely related function open_named_tuntap() has already lived there). ToString implementation for Address was removed after some consideration. Address should probably ideally implement Display (as per RFC 565) which would also supply a ToString implementation, however it implements Debug instead, probably to enable automatic implementation of Debug for anything that Address is a member of, if for no other reason. Rather than having two identical functions this commit simply switches to using the Debug implementation for printing Address on qemu command line. Fixes #9352 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:52:40 +01:00
Pavel Mores	c94e55d45a	runtime-rs: make QemuCmdLine own vsock file descriptor Make file descriptors to be passed to qemu owned by QemuCmdLine. See commit 52958f17cd for more explanation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0cf0e923fc	runtime-rs: refactor QemuCmdLine::add_network_device() signature add_network_device() doesn't need to be passed NetworkInfo since it already has access to the full HypervisorConfig. Also, one of the goals of QemuCmdLine interface's design is to avoid coupling between QemuCmdLine and the hypervisor crate's device module, if at all possible. That's why add_network_device() shouldn't take device module's NetworkConfig but just parts that are useful in add_network_device()'s implementation. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	a4f033f864	runtime-rs: add should_disable_modern() utility function is_running_in_vm() is enough to figure out whether to disable_modern but it's clumsy and verbose to use. should_disable_modern() streamlines the usage by encapsulating the verbosity. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	12e40ede97	runtime-rs: reimplement add_network_device() using Netdev & DeviceVirtioNet This commit replaces the existing NetDevice-based implementation with one using Netdev and DeviceVirtioNet. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	0a57e2bb32	runtime-rs: refactor NetDevice in qemu driver In keeping with architecture of QemuCmdLine implementation we split the functionality into two objects: Netdev to represent and generate the -netdev part and DeviceVirtioNet for the -device virtio-net-<transport> part. This change is a pure refactor, existing functionality does not change. However, we do remove some stub generalizations and govmm-isms, notably: - we remove the NetDev enum since the only network interface types that kata seems to use with qemu are tuntap and macvtap, both of which are implemented by the same -netdev tap - enum DeviceDriver is also left out since it doesn't seem reasonable to try to represent VFIO NICs (which are completely different from virtio-net ones) with the same struct as virtio-net - we also remove VirtioTransport because there's no use for it so far, but with the expectation that it will be added soon. We also make struct Netdev the owner of any vhost-net and queue file descriptors so that their lifetime is tied ultimately to the lifetime of QemuCmdLine automatically, instead of returning the fds to the caller and forcing it to achieve the equivalent functionality but manually. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	7f23734172	runtime-rs: reduce generate_netdev_fds() dependencies generate_netdev_fds() takes NetworkConfig from which it however only needs a host-side network device name. This commit makes it take the device name directly, making the function useful to callers who don't have the whole NetworkConfig but do have the requisite device name. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:41 +01:00
Pavel Mores	d4ac45d840	runtime-rs: refactor clear_fd_flags() The idea of this function is to make sure O_CLOEXEC is not set on file descriptors that should be inherited by a child (=hypervisor) process. The approach so far is however rather heavy-handed - clearing all flags is unjustifiably aggressive for a low-level function with no knowledge of context whatsoever. This commit refactors the function so that it only does what's expected and renames it accordingly. It also clarifies some of its call sites. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-26 12:50:14 +01:00
Fabiano Fidêncio	cfe75f9422	k8s: confidential: Update cpuid to its latest release Since v2.2.6 it can detect TDX guests on Azure, so let's bump it even if Azure peer-pods are not currently used as part of our CI. Fixes: #9348 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-26 10:21:12 +01:00
Chengyu Zhu	d16971e37e	Merge pull request #9325 from ChengyuZhu6/image_service agent:image: Refactor code to improve memory efficiency of image service	2024-03-26 10:38:37 +08:00
Dan Mihai	6c72c29535	genpolicy: reduce policy debug prints Kata CI has full debug output enabled for the cbl-mariner k8s tests, and the test AKS node is relatively slow. So debug prints from policy are expensive during CI. Fixes: #9296 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-26 02:21:26 +00:00
Alex Lyn	cec943fc26	Merge pull request #9244 from Apokleos/dgb-gpu runtime-rs/dragonball: add support building kernel with upcall and GPU hotplug	2024-03-26 08:53:54 +08:00
Chelsea Mafrica	4e3deb5a3b	tools: Fix path for installing yq in packaging script The lib.sh script uses the right directory but the wrong path for the script that installs yq; fix it. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Chelsea Mafrica	cfb977625e	docs: Remove links to tests repo Remove links to tests repo and update with corresponding location in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Chelsea Mafrica	d69514766e	src: Remove references to files in tests repo Change scripts and source that uses files in the tests repo to use the corresponding file in the current repo. Fixes #9165 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2024-03-25 15:09:52 -07:00
Gabriela Cervantes	ddef2be4f1	docs: Remove stale kernel information This PR removes stale kernel information from the README document. Fixes #9343 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-25 15:57:00 +00:00
Greg Kurz	e9e94d2dbd	release: Give a pretty name to all steps For a prettier rendering in the web UI. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-25 15:50:35 +01:00
Greg Kurz	dce6ea57b2	release: Simplify the `create-new-release` action of `release.sh` Now that the version is an invariant for the entire workflow, it isn't required to obtain it with an environment variable. Just rely on the content of the `VERSION` file like other actions. Fixes #9064 - part VI Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-25 15:50:35 +01:00
Alex Lyn	5c54315a87	dragonball: fix CI failure due to poor UT adaptation. Fixes: #9144 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:27 +08:00
Alex Lyn	079d894496	kernel: bump version in kata config version Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:27 +08:00
Alex Lyn	070c3fa657	docs: add doc about building kernel with upcall and GPU hotplug We need documentation on how to build a guest kernel that supports both Upcall and Nvidia GPU passthrough (hotplug) at the same time. This patch adds it, to help users build a guest kernel that supports both Upcall and Nvidia GPU hotplug/unplug. Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 20:25:17 +08:00
ChengyuZhu6	06b9935402	docs: Add a document for kata guest image management design Add a document for kata guest image management design. Related feature: #8484 Fixes: #9225 -- part I Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-25 18:17:23 +08:00
Chengyu Zhu	4029d154ba	Merge pull request #9313 from ChengyuZhu6/rtest agent: Refactor unit tests to leverage rstest for parameterization	2024-03-25 10:31:45 +08:00
Alex Lyn	bc309b9865	kernel: add CONFIG_CRYPTO_ECDSA into whitelist CONFIG_CRYPTO_ECDSA is not supported in older kernels such as 5.10.x, which may break the build when building such a kernel with NVIDIA GPU support. This patch adds CONFIG_CRYPTO_ECDSA to whitelist.conf to avoid breaking the guest kernel build with NVIDIA GPU. Fixes: #9140 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-25 08:05:31 +08:00
ChengyuZhu6	f47408fdf4	agent:image: Refactor code to improve memory efficiency of image service Currently, `.lock().await.clone()` results in `Option<ImageService>` being duplicated in memory with each call to `singleton()`. Consequently, if kata-agent receives numerous image pulling requests simultaneously, it will lead to the allocation of multiple `Option<ImageService>` instances in memory, thereby consuming additional memory resources. In image.rs, we introduce two public functions: `merge_bundle_oci()` and `init_image_service()`. These functions will encapsulate the operations on `IMAGE_SERVICE`, ensuring that its internal details remain hidden from external modules such as `rpc.rs`. Fixes: #9225 -- part II Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-25 07:46:50 +08:00
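The encapsulation idea described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the agent's code: the real agent uses an async mutex, and the `ImageService` fields and the `record_pull()` helper below are hypothetical. Only the names `IMAGE_SERVICE` and `init_image_service()` appear in the commit message, and their bodies here are invented.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the agent's image service state.
struct ImageService {
    pulled: Vec<String>,
}

// The global stays private to this module; callers never see the `Option`.
static IMAGE_SERVICE: Mutex<Option<ImageService>> = Mutex::new(None);

/// Initialise the singleton in place (sketch only).
fn init_image_service() {
    *IMAGE_SERVICE.lock().unwrap() = Some(ImageService { pulled: Vec::new() });
}

/// Hypothetical operation: works on the singleton while holding the lock,
/// so no copy of `Option<ImageService>` is made per call.
fn record_pull(image: &str) -> Option<usize> {
    let mut guard = IMAGE_SERVICE.lock().unwrap();
    let service = guard.as_mut()?; // `None` if the service was never initialised
    service.pulled.push(image.to_string());
    Some(service.pulled.len())
}
```

The design choice is that external modules call functions such as these rather than locking and cloning the global themselves.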
ChengyuZhu6	7a49ec1c80	agent:util: Refactor the unit tests to leverage rstest Refactor the unit tests in util.rs to leverage rstest for parameterization. Fixes: #9314 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-23 10:49:53 +08:00
ChengyuZhu6	2df2b4d30d	agent:namespace: Refactor unit tests to leverage rstest Refactor the unit tests in `namespace.rs` to leverage rstest for parameterization. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-23 10:49:48 +08:00
Hyounggyu Choi	d915a79e2d	Merge pull request #9280 from BbolroC/enable-qemu-on-s390x runtime-rs: Enable qemu on s390x	2024-03-22 23:58:42 +01:00
Fabiano Fidêncio	25cd28a32b	Merge pull request #9337 from fidencio/topic/bump-nydus-snapshotter versions: Update nydus-snapshotter to v0.13.11	2024-03-22 22:18:18 +01:00
Hyounggyu Choi	81aaa34bd6	runtime-rs: Add DeviceVirtioSerial and DeviceVirtconsole It is observed that virtiofsd exits immediately on s390x if there are no attached console devices. This commit resolves the issue by migrating `appendConsole()` from runtime and triggering it in `start_vm()`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
Hyounggyu Choi	2cfe745efb	runtime-rs: Enable memory backend option for Machine for s390x For s390x, it requires an additional option `memory-backend` for `-machine`. Otherwise, virtiofsd exits with HandleRequest(InvalidParam). This commit is to add a field `memory_backend` to `struct Machine` and turn it on for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
Hyounggyu Choi	9bcfaad625	runtime-rs: Add ccw block device for rootfs Like nvdimm for x86_64, a block device for s390x should be treated differently with `virtio-blk-ccw`. This is to generate a QEMU command line parameter for a block device by using `-blockdev` and `-device` if the `vm_rootfs_driver` is set to `virtio-blk-ccw`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-22 19:27:13 +01:00
David Esparza	3e40051634	Merge pull request #9255 from dborquez/thread_pid_function runtime-rs: ch: Implement full thread/tid/pid handling	2024-03-22 10:05:02 -06:00
Fabiano Fidêncio	d0949759ec	versions: Update nydus-snapshotter to v0.13.11 This version brings in a fix for cleaning up k3s/rke2 environments, which directly impacts the TDX machine that's part of our CI. Fixes: #9318 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-22 14:56:18 +01:00
Greg Kurz	e4f6a778a8	Merge pull request #9321 from fidencio/topic/releases-follow-up-VI Revert "release: Skip --generate-notes for this release"	2024-03-22 10:44:40 +01:00
GabyCT	a67382fd00	Merge pull request #9324 from GabyCT/topic/udevguide docs: Update libseccomp instructions in Developers Guide	2024-03-21 14:25:41 -06:00
Gabriela Cervantes	d54cdd3f0c	scripts: Fix unbound variables in k8s setup script This PR fixes the unbound variables error when trying to run the setup script locally in order to avoid errors. Fixes #9328 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-21 19:10:16 +00:00
Chengyu Zhu	9a4cb96262	Merge pull request #9312 from ChengyuZhu6/show-feature agent: Add guest-pull to the list of agent features in announce()	2024-03-21 23:35:29 +08:00
David Esparza	b498e140a1	runtime-rs: ch: Implement full thread/tid/pid handling Add in the full details once cloud-hypervisor/cloud-hypervisor#6103 has been implemented, and the feature is available in a Cloud Hypervisor release. Fixes: #8799 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-03-21 08:24:53 -06:00
James O. D. Hunt	1e684f5848	Merge pull request #9259 from jodh-intel/tests-add-static-checks-announce tests: static checker: Add announce message	2024-03-21 13:59:36 +00:00
ChengyuZhu6	754399d909	agent: Add guest-pull to the list of agent features in announce() Add guest-pull to the list of agent features in announce(). Fixes: #9225 -- part IV Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-21 20:01:52 +08:00
Xuewei Niu	9c4f9dcb35	Merge pull request #9311 from studychao/chao/fix_mtrr Dragonball: introduce MTRR regs support	2024-03-21 17:24:27 +08:00
Hyounggyu Choi	9b2c08935b	runtime-rs: Pass different device argument based on bus type Currently, `*-pci` is used as an argument for the device config. This is incorrect when a different type of bus is used. s390x uses `ccw`. This commit generates the device argument based on the bus type. The structures `DeviceVhostUserFsPci` and `VhostVsockPci` are renamed to `DeviceVhostUserFs` and `VhostVsock` because the structure names are no longer bound to a particular bus type. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-21 09:25:37 +01:00
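A minimal sketch of choosing the device argument from the bus type, assuming a hypothetical `BusType` enum and `device_arg()` helper; neither is taken from the runtime-rs sources.

```rust
/// Hypothetical bus type; only the two buses named in the commit are modelled.
#[derive(Clone, Copy)]
enum BusType {
    Pci,
    Ccw,
}

/// Hypothetical helper: build a QEMU device name such as `vhost-user-fs-ccw`
/// from a bus-independent base name and the bus type.
fn device_arg(base: &str, bus: BusType) -> String {
    let suffix = match bus {
        BusType::Pci => "pci",
        BusType::Ccw => "ccw",
    };
    format!("{base}-{suffix}")
}
```

Keeping the base name free of the bus suffix is what allows a struct such as `DeviceVhostUserFs` to serve both x86_64 and s390x.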
GabyCT	03f3d3491d	Merge pull request #9265 from GabyCT/topic/fixnydusclean gha: Fix nydus namespace clean up	2024-03-20 16:17:38 -06:00
GabyCT	702a8a440f	Merge pull request #9309 from GabyCT/topic/fixlograndom gha: Update journal log names for kubernetes artifacts	2024-03-20 16:17:17 -06:00
Gabriela Cervantes	05f4dc1902	docs: Update libseccomp instructions in Developers Guide This PR updates the libseccomp instructions in the Developers Guide. Fixes #9323 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 20:44:24 +00:00
GabyCT	163103d59e	Merge pull request #9307 from GabyCT/topic/fixdocreq docs: Update links in the Documentation Requirements document	2024-03-20 14:29:04 -06:00
Gabriela Cervantes	af18221ab7	docs: Update links in the Documentation Requirements document This PR updates the url links in the Documentation Requirements document. Fixes #9306 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:45:49 +00:00
Gabriela Cervantes	a855ecf21b	gha: Update journal log names for kubernetes artifacts This PR updates the journal log names for kubernetes artifacts in order to make sure that we have different names when we are running parallel GHA jobs. Fixes #9308 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:44:20 +00:00
Gabriela Cervantes	4fb8f8705f	gha: Fix nydus namespace clean up This PR terminates the nydus namespace to avoid the error stating that the flag needs an argument. Fixes #9264 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-20 15:41:39 +00:00
Fabiano Fidêncio	0278fc8a91	Revert "release: Skip --generate-notes for this release" This reverts commit `0fa59ff94b`, as now we'll be able to use the `--generate-notes`, hopefully, without blowing the allowed limit. Fixes: #9064 - part VI Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-20 15:48:22 +01:00
James O. D. Hunt	577abd014b	tests: static checker: Add announce message Added an announcement message to the `static-checks.sh` script. It runs platform / architecture specific code so it would be useful to display details of the platform the checker is running on to help with debugging. Fixes: #9258. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-20 13:41:26 +00:00
James O. D. Hunt	4af4a8ad2b	tests: static checker: Create setup function Move some of the common code into a setup function. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-20 11:58:28 +00:00
Fabiano Fidêncio	1aec4f737a	Merge pull request #9316 from fidencio/topic/releases-follow-up-V release: Skip --generate-notes for this release	2024-03-20 10:50:14 +01:00
Fabiano Fidêncio	0fa59ff94b	release: Skip --generate-notes for this release This release is a special case, as we've slacked for 6 months and the release content is way too long ... long enough to exceed the allowed limit for the release notes. With this in mind we'll just remove the `--generate-notes` for now, and then revert this commit as soon as the release is out, as releases should be happening every month and, ideally, we won't ever reach this situation again. Fixes: #9064 - part V Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-20 10:32:11 +01:00
Hyounggyu Choi	7b3d1adb8c	libs: Bump sysinfo to v0.30.5 It has been observed that the runtime stops running around `sysinfo::total_memory()` while adjusting a config on s390x. This is to update the crate to the latest version which happened to resolve the issue. (No explicit release note for this) Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-20 09:27:13 +01:00
Chao Wu	5a4b858ece	Dragonball: introduce MTRR regs support MTRRs, or Memory-Type Range Registers, are a group of x86 MSRs that provide a way to control access to and cacheability of physical memory regions. During our tests with runtime-rs + Dragonball, we found that this register support is a must for a passthrough GPU running CUDA applications: the GPU needs that information to properly use GPU memory. fixes: #9310 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-20 14:18:16 +08:00
Fabiano Fidêncio	19eb45a27d	Merge pull request #8484 from ChengyuZhu6/guest-pull Merge basic guest pull image code to main	2024-03-19 23:15:39 +01:00
Hyounggyu Choi	6e782826c7	Merge pull request #9305 from BbolroC/handle-comment-for-skipped-tests CI\|k8s: Handle skipped tests with a comment for filter_out_per_arch	2024-03-19 22:54:03 +01:00
Fabiano Fidêncio	8911d3565f	gha: tests: Filter out confidential tests for aarch64 / ppc64le Those two architectures are not TEE capable, thus we can just skip running those tests there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:01 +01:00
Fabiano Fidêncio	d14e9802b6	gha: k8s: Set {https,no}_proxy correctly for TDX This is needed as the TDX machine is hosted inside Intel and relies on proxies in order to connect to the external world. Not having those set causes issues when pulling the image inside the guest. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:00 +01:00
Fabiano Fidêncio	291b14bfb5	kata-deploy: Add the ability to set {https,no}_proxy if needed Let's make sure those two proxy settings are respected, as those will be widely used when pulling the image inside the guest on the Confidential Containers case. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	5bad18f9c9	agent: set https_proxy/no_proxy before initializing agent policy When the https_proxy/no_proxy settings are configured alongside agent-policy enabled, the process of pulling image in the guest will hang. This issue could stem from the instantiation of `reqwest`’s HTTP client at the time of agent-policy initialization, potentially impacting the effectiveness of the proxy settings during image guest pulling. Given that both functionalities use `reqwest`, it is advisable to set https_proxy/no_proxy prior to the initialization of agent-policy. Fixes: #9212 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	db9f18029c	README: Add https_proxy and no_proxy to agent README Add agent.https_proxy and agent.no_proxy to the table in the agent README. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	e23737a103	gha: refactor code with yq for better clarity refactor code with yq for better clarity: Before: ```bash yq write -i "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" 'spec.template.spec.containers[0].env[7].value' "${KATA_HYPERVISOR}:${SNAPSHOTTER}" ``` After: ```bash yq write -i \ "${tools_dir}/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml" \ 'spec.template.spec.containers[0].env[7].value' \ "${KATA_HYPERVISOR}:${SNAPSHOTTER}" ``` Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	2c0bc8855b	tests: Make sure to install yq before using it Make sure to install yq before using it to modify YAML files. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	c52b356482	tests: add guest pull image test Add a test case of pulling image inside the guest for confidential containers. Signed-off-by: Da Li Liu <liudali@cn.ibm.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	e8c4effc07	tests: refactor the check for hypervisor to a function Extract two reusable functions for confidential tests in confidential_common.sh - check_hypervisor_for_confidential_tests: verifies if the input hypervisor supports confidential tests. - confidential_setup: performs the common setup for confidential tests. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	6e5e4e55d0	rootfs: add ca file to guest rootfs To access the URL, the component to pull image in the guest needs to send a request to the remote. Therefore, we need to add CA to the rootfs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	8724d7deeb	packaging: Enable to build agent with PULL_TYPE feature Enable to build kata-agent with PULL_TYPE feature. We build kata-agent with guest-pull feature by default, with PULL_TYPE set to default. This doesn't affect how kata shares images by virtio-fs. The snapshotter controls the image pulling in the guest. Only the nydus snapshotter with proxy mode can activate this feature. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:06:00 +01:00
ChengyuZhu6	cd6a84cfc5	kata-deploy: Setting up snapshotters per runtime handler Setting up snapshotters per runtime handler as the commit (`6cc6ca5a7f`) described. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	ba242b0198	runtime: support different cri container type check Support handling the image-guest-pull block volume from different CRIs, including cri-o and containerd. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	874d83b510	agent/image: Use guest provided pause image By default, the pause image and runtime config are provided by the host side. This poses a potential security risk if the host configures a malicious pause image, so we use the pause image packaged in the rootfs instead. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Julien Ropé <jrope@redhat.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com>	2024-03-19 18:05:59 +01:00
ChengyuZhu6	c269b9e8c6	agent: Add guest-pull feature for kata-agent Add "guest-pull" feature option to determine that the related dependencies would be compiled if the feature is enabled. By default, agent would be built with default-pull feature, which would support all pull types, including sharing images by virtio-fs and pulling images in the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 18:05:59 +01:00
Aurélien	192250c52e	Merge pull request #9299 from sprt/sprt/mariner-normal-tests ci: aks: also run tests in normal instance for Mariner	2024-03-19 11:34:20 -05:00
ChengyuZhu6	965da9bc9b	runtime: support to pass image information to guest by KataVirtualVolume support to pass image information to guest by KataVirtualVolumeImageGuestPullType in KataVirtualVolume, which will be used to pull image on the guest. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	cfd14784a0	agent: Introduce ImagePullHandler to support IMAGE_GUEST_PULL volume As we do not employ a forked containerd in confidential-containers, we utilize the KataVirtualVolume, which stores the image information, as an integral part of `CreateContainer`. Within this process, we store the image information in rootfs.storage and pass this image url through `CreateContainerRequest`. This approach distinguishes itself from the use of `PullImageRequest`, as rootfs.storage is already set and initialized at this stage. To maintain clarity and avoid any need for modification to the `OverlayfsHandler`, we introduce the `ImagePullHandler`. This dedicated handler is responsible for orchestrating the image-pulling logic within the guest environment. This logic encompasses tasks such as calling the image-rs to download and unpack the image into `/run/kata-containers/{container_id}/images`, followed by a bind mount to `/run/kata-containers/{container_id}`. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	462051b067	agent/image: merge container spec for images pulled inside guest When being passed an image name through a container annotation, merge its corresponding bundle OCI specification and process into the passed container creation one. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com> Co-authored-by: jordan9500 <jordan.jackson@ibm.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	cec1916196	agent: Support https_proxy/no_proxy config for image download in guest Containerd supports setting a proxy for image downloads through an environment variable. For the CC stack, image download is offloaded to the kata agent, so we need to support a similar feature. Currently we add https_proxy and no_proxy; http_proxy is not added since it is insecure. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	9cddd5813c	agent/image: Enable image-rs crate to pull image inside guest With image-rs pull_image API, the downloaded container image layers will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs and config.json will be saved under CONTAINER_BASE/cid directory. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Arron Wang <arron.wang@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	2b3a00f848	agent: export the image service singleton instance Export the image service singleton instance. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Jiang Liu <gerry@linux.alibaba.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>	2024-03-19 17:22:36 +01:00
ChengyuZhu6	1f1ca6187d	agent: Introduce ImageService Introduce structure ImageService, which will be used to pull images inside the guest. Fixes: #8103 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> co-authored-by: wllenyj <wllenyj@linux.alibaba.com> co-authored-by: stevenhorsman <steven@uk.ibm.com>	2024-03-19 17:22:33 +01:00
Hyounggyu Choi	b381743dd5	CI\|k8s: Handle skipped tests with a comment for filter_out_per_arch This commit updates `filter_k8s_test.sh` to handle skipped tests that include comments. In addition to the existing parameter expansion, the following expansions have been added: - Removal of a comment - Stripping of trailing spaces Fixes: #9304 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-19 17:21:25 +01:00
Chelsea Mafrica	42dfe0e8d1	Merge pull request #9286 from jodh-intel/agent-show-enabled-features agent: Show features enabled at build time	2024-03-19 08:54:49 -07:00
Wainer Moschetta	e6501aa4ad	Merge pull request #9229 from ldoktor/ocp-ci ocp.ci: Various fixes and improvements to the OCP pipeline	2024-03-19 11:13:01 -03:00
James O. D. Hunt	46aec0f15a	Merge pull request #9293 from jodh-intel/kata-manager-fix-containerd-for-docker kata-manager: Fix Docker install	2024-03-19 10:06:44 +00:00
Fabiano Fidêncio	e0a6b6449f	Merge pull request #9302 from BbolroC/fix-permission-issue-on-s390x-runners gha: Place pre-action on s390x runner for kata-deploy during release	2024-03-19 10:42:23 +01:00
Hyounggyu Choi	f2bc819644	gha: Place pre-action on s390x runner for kata-deploy during release This is to place a pre-action step for the kata-deploy job in order to clean up the github workspace directory before checking out the repo. Fixes: #9301 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-19 10:18:38 +01:00
Alex Lyn	7af2df408e	Merge pull request #9295 from likebreath/0318/fix_clh_default_netconfig runtime-rs: ch: Provide valid default value for NetConfig	2024-03-19 15:17:18 +08:00
Xuewei Niu	99d0e5fff8	Merge pull request #9270 from zvonkok/kata-agent-bind-mount kata-agent: optional bind flag	2024-03-19 10:39:23 +08:00
Aurélien Bombo	71a1be9c57	ci: aks: also run tests in normal instance for Mariner Currently we're only running the small instance tests. This adds the normal instance tests as well. Fixes: #9298 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-03-18 23:33:17 +00:00
Bo Chen	ad4262e86b	runtime-rs: ch: Provide valid default value for NetConfig The current default value of IP `0.0.0.0` with mask `0.0.0.0` will cause ioctl error when being used to create and configure TAP device, with newer version of Cloud Hypervisor [1]. This patch replaces them with valid value that are the same as the Go-lang runtime [2]. [1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/5924 [2] `e3f7852738/src/runtime/virtcontainers/pkg/cloud-hypervisor/client/model_net_config.go (L40-L57)` Fixes: #9254 Signed-off-by: Bo Chen <chen.bo@intel.com>	2024-03-18 15:47:58 -07:00
Fabiano Fidêncio	e3f7852738	Merge pull request #9289 from fidencio/topic/releases-follow-up-IV releases: Simplify the release in order to avoid pushing a commit updating the VERSION file	2024-03-18 17:38:58 +01:00
James O. D. Hunt	a6c3f75872	kata-manager: Fix Docker install Fix the Docker install by removing the second (erroneous) call to `containerd_installed()` in `handle_docker()`. Without this fix, installing using Docker (`-D`) will work iff you already have containerd installed. However, if you do not have containerd installed, the `containerd_installed()` function returns 1, which exits the script as we're running with `set -e`, leaving a broken Docker installation. > Note: containerd is installed via Docker's `get-docker.sh` script. Fixes: #9292. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-18 14:08:35 +00:00
stevenhorsman	0ab8e61a64	release: Remove release type from arch release Now we don't have minor and major releases and we are now generating a new version in the release workflow, we can tidy up the arch specific releases workflows to remove the extra required inputs Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-03-18 12:27:57 +00:00
Greg Kurz	3cfc1b6ba7	releases: Adjust documentation to the new workflow This drops the documentation of the legacy release scripts and adds a quick description of the scripts of the new workflow. It also highlights the bump of the `VERSION` file. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:57:02 +01:00
Greg Kurz	76c640767e	releases: Drop Makefile It isn't used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:54:00 +01:00
Greg Kurz	bfe19e68e8	kata-deploy: Adapt `test-kata.sh` to the new release workflow All releases are now created in the `main` branch following the very same workflow. No need to special case pre-releases. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-18 12:54:00 +01:00
Fabiano Fidêncio	12578f11bc	releases: Assume VERSION has the correct version to be released This is done in order to avoid having to push a commit to the main branch, which is against the defined rules on GitHub. By doing this, we need to educate ourselves to always bump the VERSION file as soon as a release is cut out. As a side effect of this change, we can drop the release-major and release-minor workflows, as those are not needed anymore. Fixes: #9064 - part IV Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-16 13:30:58 +01:00
Fabiano Fidêncio	8ce50269fe	release: Bump the VERSION file to the next release number 3.3.0 it will be. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-16 13:21:27 +01:00
Xuewei Niu	9f512c016e	Merge pull request #9282 from gkurz/runtime-rs-fds-for-qemu runtime-rs: Consolidate the handling of fds passed to QEMU	2024-03-16 10:26:11 +08:00
Greg Kurz	1e526a4769	runtime-rs: Consolidate the handling of fds passed to QEMU File descriptors that are passed to QEMU need some special care. We want them to be closed when the QEMU process is started. But at the same time, it is required that the associated rust File structures, either coming from the `std::fs` or the `tokio::fs` crates, are still in scope when the QEMU process is forked. This is currently achieved by keeping File structures in variables at the outer scope of `start_vm()`. This scheme is currently duplicated, with similar justifications in the corresponding comments. Consolidate all this handling in one place with a more generic explanation. Fixes #9281 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-15 16:14:59 +01:00
James O. D. Hunt	9ef59488d9	agent: Show features enabled at build time The agent now has a number of optional build-time features that can be enabled. Add details of these features to the following areas: - Version output (`kata-agent --version`) - Announce message (so that the details are always added to the journal at agent startup). - The response message returned by the ttRPC `GetGuestDetails()` API. Fixes: #9285. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-15 13:29:21 +00:00
Chelsea Mafrica	2c50d3c393	Merge pull request #9278 from wainersm/github_env_fix tests: fix nounset error with $GITHUB_ENV	2024-03-14 16:39:13 -07:00
Greg Kurz	6a112cc7a5	runtime-rs: Fix missing dependency Some previous contribution missed to run cargo clippy. Fix the dependency now so that it doesn't cause noise in future contributions. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-03-14 23:19:38 +01:00
Dan Mihai	b3b00e00a6	Merge pull request #9246 from microsoft/danmihai/default-env genpolicy: default env if image doesn't have env	2024-03-14 11:01:43 -07:00
Dan Mihai	6094f1e31d	Merge pull request #9250 from microsoft/danmihai1/k8s-pid-ns2 tests: k8s: k8s-pid-ns.bats auto-generated policy	2024-03-14 10:10:24 -07:00
Zvonko Kaiser	c15e19c806	kata-agent: optional bind flag Fixes: #9269 From https://github.com/opencontainers/runtime-spec/blob/main/config.md#mounts type (string, OPTIONAL) The type of the filesystem to be mounted. bind may be only specified in the oci spec options -> flags update r#type The agent will ignore bind mounts if they are only specified in the OCI spec options and not in the flags. Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-03-14 14:42:01 +00:00
Hyounggyu Choi	1dac6b1357	runtime-rs: Configure s390x specific flags for Makefile s390x supports a different machine type `s390-ccw-virtio` and it is not required to configure cpu features by default for the platform. A hypervisor `dragonball` is not supported on s390x so that `DBCMD` is not necessary. `vm-rootfs_driver` should be set to `virtio-blk-ccw`. This commit is to set the architecture-specific flags for Makefile. Fixes: #9158 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-03-14 13:05:35 +01:00
Wainer dos Santos Moschetta	981f95df55	tests: fix nounset error with $GITHUB_ENV Initialize $GITHUB_ENV to avoid nounset error when running the scripts locally out of Github Actions. Fixed commit `9ba5e3d2a8` Fixes #9217 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-13 14:57:38 -03:00
Dan Mihai	ac27caf1b4	Merge pull request #9248 from microsoft/danmihai1/k8s-exec.bats2 tests: k8s: k8s-exec.bats auto-generated policy	2024-03-13 09:21:12 -07:00
Alex Lyn	2aa3519520	kata-agent: Change order of guest hook and bind mount processing The guest_hook_path item in configuration.toml allows OCI hook scripts to be executed within Kata's guest environment. Traditionally, these guest hook programs are pre-built and included in Kata's guest rootfs image at a fixed location. While setting guest_hook_path = "/usr/share/oci/hooks" in configuration.toml works, it lacks flexibility. Not all guest hooks reside in the path /usr/share/oci/hooks, and users might have custom locations. To address this, a more flexible and configurable approach is proposed that allows users to specify their desired path. This could include using a sandbox bind mount path for hooks specific to that particular container. However, the current implementation of guest hooks and bind mounts in kata-agent has a reversed order of execution compared to the desired behavior. To achieve the intended functionality, we simply need to swap the order of their implementation. Fixes: #9274 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-13 20:30:32 +08:00
Steve Horsman	8f4cbd49d7	Merge pull request #9263 from Amulyam24/gha-fixes gha: ensure that the self hosted runner is in desired state before running the workflow	2024-03-13 10:49:29 +00:00
Zvonko Kaiser	63dff9a9f2	kata-agent: CreateContainer Hook Fixes: #9267 The doc states we have support for all lifecycle hooks. There are still some missing. This is the first issue regarding the CreateContainer hook which is run before pivot_root but after prestart and createruntime Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-03-13 09:24:25 +00:00
Amulyam24	3f4b24be8b	gha: ensure that self hosted runner is prepared before running the workflow This PR ensures that the self hosted runner is prepared by taking necessary actions before running the workflow. The script prepare_runner.sh checks the following: 1. Ensure that containerd/docker is up and running 2. Make sure that the repository workspace is cleaned up and has no conflicts 3. Remove/cleanup any leftover files from the previous runs Fixes: #9262 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-03-13 14:20:10 +05:30
Alex Lyn	410afcc913	Merge pull request #8866 from Apokleos/netdev-qemu-rs runtime-rs: add netdev params to cmdline for qemu-rs.	2024-03-13 13:07:43 +08:00
Dan Mihai	e8c2a45ce0	tests: k8s: k8s-pid-ns.bats auto-generated policy Auto-generate policy for k8s-pid-ns.bats. Fixes: #9249 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-12 22:34:46 +00:00
Lukáš Doktor	46e62eecb1	ci.ocp: Log the full grepped line rather than the expected msg we are grepping for an expected message but it might contain extra bits of information fruitful for later debugging. Let's include it in the output and the full log in case of an error. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 17:03:46 +01:00
Lukáš Doktor	7ff2eb508e	ci.ocp: Increase the mcp update timeout we're hitting this timeout quite often, looks like newer OCP takes longer to reconfigure. Increase the timeout to 1200. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	cc02329fd1	ci.ocp: Add a cleanup script This script doesn't serve as a complete cleanup, but it can be used as a best-effort cleaner between deploying different versions of kata-containers on the same OCP cluster. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	b811ee0650	ci.ocp: Allow to override the kata-deploy image sometimes we want to test a different than the latest image (eg. when verifying a PR via ghcr images or when bisecting a failure over older builds). Let's add a KATA_DEPLOY_IMAGE variable for that while keeping the latest image by default. Fixes: #9228 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	2936503b24	ci.ocp: Always replace the kata-deploy image in OCP pipeline previously we only replaced the image when the previously defined one matched the "old_img". This is good to avoid modifying developers custom changes, but it might lead to hard-to-debug issues when the image stays different. Let's ensure we always replace the image with the one we asked for. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	6525c94065	ci.ocp: Add a workaround to optionally enable skip_mount_home the latest upstream kata-containers requires the skip_mount_home to be enabled, which is default on OCP 4.14+ but disabled on OCP 4.13-. Let's use a "WORKAROUND_9206_CRIO" (called by kata-containers GH issue) variable to allow users to enable this treatment when needed. Related to: #9206 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	739d627b4e	ci.ocp: Turn selinux relabel failures into warnings Instead of failing the pipeline let's proceed with an error message that selinux setup failed so, in case of a later failure, we know what might have caused it while keeping the coverage in case of a false setup issue. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:38:04 +01:00
Lukáš Doktor	76c452d4e0	ci.ocp: Wait for all pods to finish the work previously we only waited for a random pod to finish the selinux relabel, which could be error-prone. Let's wait for all of the pods to contain the expected message. Increase the timeout to 120s as some pods might take a little bit longer to finish. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:34:56 +01:00
Lukáš Doktor	f7febd07a0	ci.ocp: Allow to re-apply the selinux workaround in case we re-apply the selinux workaround or if user had already existing similar rule the relabel_selinux was failing. Let's allow it to modify the existing rules as well to avoid such issues. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:02:21 +01:00
Lukáš Doktor	fbbea68f1f	ci.ocp: Ignore selinux setup on non-selinux cluster improve our selinux workaround to work well on non-selinux clusters. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-12 16:02:20 +01:00
Alex Lyn	e2ae8ba79b	runtime-rs: add network device into Qemu's cmdline It will open tuntap device and vhost-net device and store device files. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:28:54 +08:00
Alex Lyn	d3bca4597e	runtime-rs: add open_named_tuntap to open a named tuntap device. The open_named_tuntap function is designed as a public function to open a tuntap device with the specified name. However, in order to reference existing methods in dbs_utils, we still need to keep the reference "path = "../../../dragonball/src/dbs_utils" in dependencies and cannot hide it. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:26:32 +08:00
Alex Lyn	005b333976	runtime-rs: add network helpers and impl ToQemuParams Add network helpers and impl ToQemuParams trait to build netdev params which are put into cmdline for Qemu VM running. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:25:39 +08:00
Alex Lyn	63786934f4	runtime-rs: set network namespace for qemu process and netdev. We need to ensure that add_network_device happens in the netns and move the qemu process into the netns, which keeps the qemu process running in this net namespace. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:21:43 +08:00
Alex Lyn	69a5e5b955	runtime-rs: add network device handler in start_vm. Add network device handler in start_vm, which is specially for Qemu VM running with added net params to command line. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-12 22:18:01 +08:00
Alex Lyn	a116b252c8	Merge pull request #9236 from jodh-intel/docs-improve-install-details docs: install: Simplify instructions	2024-03-12 14:29:38 +08:00
Alex Lyn	a31fb35e5d	Merge pull request #9231 from UiPath/fix/clh-pid-init clh: initialize clh pid before using it	2024-03-12 13:43:24 +08:00
Alex Lyn	9f6003adde	runtime-rs: add a new netns field in struct QemuInner. We need to add a new netns field in struct QemuInner, and initialize it with the argument passed down in prepare_vm(). Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 16:02:39 +08:00
Alex Lyn	f571ec84d2	runtime-rs: add a public method to support process entering netns. The enter_netns function is designed as a public method to help VMMs running as an independent process enter a network namespace, reducing duplicate code. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 15:55:52 +08:00
Alex Lyn	4176fcc3c6	runtime-rs: make the code for cleanup fd flags as public method. It just moves the related code to a public file (utils.rs) and makes it a common method for both vsock and network, or some others. Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com> Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-03-11 15:52:20 +08:00
Alex Lyn	b1038704e0	runtime-rs: make NetnsGuard common for hypervisor and resource. In order to better support non-builtin vmm usage of NetnsGuard and reduce code duplication, we need to move it to a common path that can be referenced by both hypervisor and resource manager. This patch just moves the code from network/utils/netns.rs to kata-sys-utils/src/netns.rs Fixes: #8865 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-11 15:38:42 +08:00
Alexandru Matei	617b0114b3	clh: initialize clh pid before using it The PID needs to be initialized before calling isClhRunning. waitVMM() uses isClhRunning and is called by launchClh() just before returning from function. Fixes: #9230 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-03-09 13:53:51 +02:00
Dan Mihai	88b7a44271	tests: k8s: k8s-exec.bats auto-generated policy Auto-generate policy for k8s-exec.bats. Fixes: #9247 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-08 17:48:20 +00:00
Steve Horsman	54e5ce2464	Merge pull request #9154 from chungeun-choi/change-deprecated-package fixed - Change the deprecated module from 'io/util' to util. 'io/util…	2024-03-08 15:05:43 +00:00
Steve Horsman	e9bbf2f67b	Merge pull request #9203 from fidencio/topic/releases-follow-up-III release: Ensure the release-type is passed to workflows	2024-03-08 14:09:36 +00:00
Alex Lyn	c73597c39d	Merge pull request #9208 from studychao/chao/fix_virt_ci Dragonball: fix unit test problems when switching to new virt github machine	2024-03-08 09:41:05 +08:00
Chengyu Zhu	d49391a555	Merge pull request #8798 from LindaYu17/setpolicy add setpolicy function to kata-runtime tool	2024-03-08 06:31:57 +08:00
Dan Mihai	5398b6466c	Merge pull request #9224 from 3u13r/sidecar-container genpolicy: add restartPolicy to container struct	2024-03-07 12:59:55 -08:00
GabyCT	35d8f82232	Merge pull request #9242 from GabyCT/topic/enabldebugnerd gha: Add collect artifacts step to nerdctl workflow	2024-03-07 13:34:40 -06:00
Wainer Moschetta	91998af173	Merge pull request #9114 from wainersm/ci_kbs_cli CI: add KBS utilities for attestation tests	2024-03-07 16:34:03 -03:00
Dan Mihai	4c3d6fadc8	genpolicy: default env if image doesn't have env Use containerd's default environment for container images that don't specify the Env field. Also, re-enable policy env variable verification, now that these uncommon images are supported too. Fixes: #9239 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 16:56:06 +00:00
Dan Mihai	b3a02d5e06	Merge pull request #9128 from microsoft/danmihai1/test-genpolicy tests: k8s: auto-generated policy	2024-03-07 08:50:47 -08:00
Fabiano Fidêncio	8faab965a7	gh: Fix payload-after-push tags We now expect the arch specific images to be tagged as kata-containers-latest-${arch}. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Fabiano Fidêncio	eab78cf1ba	release: Reword the extra notes added as part of the release We're trying to keep just the bare minimum info, as we really would like to not have the list of commits, and mainly the list of new contributors, truncated from the release notes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Fabiano Fidêncio	658fb6972b	release: Ensure the release-type is passed to workflows We need to ensure the release type is passed down to workflows, otherwise we'll fail to get the correct release version for tagging the daemonset images. Fixes: #9064 - part III Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-03-07 12:02:51 +00:00
Alex Lyn	a0a50f5e52	Merge pull request #9191 from Apokleos/fix-kata-ctl-exec0 kata-ctl: Support using container short ID to enter guest.	2024-03-07 19:26:40 +08:00
Wainer dos Santos Moschetta	8ea9ac515e	tests/k8s: update kbs repository Recently confidential-containers/kbs repository was renamed to confidential-containers/trustee. Github will automatically resolve the old URL but we better adjust it in code. The trustee repository will be cloned to $COCO_TRUSTEE_DIR. Adjusted file paths and pushd/popd's to use $COCO_KBS_DIR ($COCO_TRUSTEE_DIR/kbs). On versions.yaml changed from `coco-kbs` to `coco-trustee` as in the future we might need other trustee components, so keeping it generic. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	c669567cd3	tests/k8s: add utils to set KBS policies Added the kbs_set_resources_policy() function to set the KBS policy. Also the kbs_set_allow_all_resources() and kbs_set_deny_all_resources to set the "allow all" and "deny all" policy, respectively. Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	6f0d38094d	tests/k8s: add utils to set KBS resources Added utility functions to manage resources in KBS: - kbs_set_resource(), where the resource data is passed via argument - kbs_set_resource_from_file(), where the resource data is found in a file Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	2a374422c5	tests/k8s: add function to install kbs-client Added kbs_install_cli function to build and install the kbs-client executable if not present into the system. Removed the stub from gha-run.sh; now the install kbs-client in the .github/workflows/run-kata-deploy-tests-on-aks.yaml will effectively install the executable. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	4141875ffd	ci/lib.sh: set GOPATH default value Scripts sourcing ci/lib.sh need to set $GOPATH, otherwise they will fail. This ensures that GOPATH is set to ${HOME}/go unless it is already exported. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Wainer dos Santos Moschetta	e410aef4fa	tests/k8s: add utils to get kbs service address Added functions to return the service host, port or full-qualified HTTP address, respectively, kbs_k8s_svc_host(), kbs_k8s_svc_port(), and kbs_k8s_svc_http_addr(). Fixes #9056 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-07 11:20:36 +00:00
Leonard Cohnen	e30e8ab7dc	genpolicy: add restartPolicy to container struct This adds support for sidecar container introduced in Kubernetes 1.28 Fixes: #9220 Signed-off-by: Leonard Cohnen <lc@edgeless.systems>	2024-03-07 12:00:14 +01:00
Chungeun Choi	bad263f399	runtime: Replace deprecated module "io/ioutil" with "io" This change updates the module import to use 'io' instead of the deprecated 'io/ioutil' Fixes: #9166 Signed-off-by: Chungeun Choi <ce.choi@okestro.com>	2024-03-07 10:56:06 +00:00
Alex Lyn	ef9a38e551	shim-interface: add Copyright of AntGroup in file shim-interface.rs Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:46:32 +08:00
Alex Lyn	2972a3a675	shim-interface: add UT for get_uds_with_sid Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:45:44 +08:00
Alex Lyn	7145243bd3	kata-ctl: Support using container short ID to enter guest. Fixes: #9189 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-07 15:44:47 +08:00
Linda Yu	bb77d2d7e6	docs: add docs on how to set policy by kata-runtime Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Linda Yu	1c5693be86	stream: repeat copybuffer if it is blocked by policy copyBuffer returns and the streams will be closed when error occurs. If the error contains "blocked by policy" it means the log output is disabled by policy with "ReadStreamRequest" and "WriteStreamRequest" set to false. But at this moment, we want the real stream still working (not be seen) because we might want to enable logging for debugging purpose, so we repeat copybuffer in this case to avoid streams being closed. Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Linda Yu	eda419cb03	kata-runtime: add set policy function to kata-runtime Logging/debugging information will probably be disabled in production due to security considerations, but we should provide a way for customers to get logging information at runtime. This PR implements the setpolicy function in the kata-runtime tool, although it can set the whole policy, not only logging. setpolicy triggers remote attestation, which means that before setting a policy at runtime, the user has to configure the new policy hash in KBS/AS. usage: kata-runtime policy set policy.rego --sandbox-id XXXXXXXX Fixes: #8797 Signed-off-by: Linda Yu <linda.yu@intel.com>	2024-03-07 15:00:23 +08:00
Dan Mihai	c08b696d9e	tests: k8s: k8s-shared-volume generated policy Auto-generate policy for k8s-shared-volume.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	b24758fad8	tests: k8s: k8s-scale-nginx auto-generated policy Auto-generate policy for k8s-scale-nginx.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	af9ac8d194	tests: k8s: k8s-replication auto-generated policy Auto-generate policy for k8s-replication.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	56689c6800	tests: k8s: k8s-qos-pods auto-generated policy Auto-generate policy for k8s-qos-pods.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	0179f53469	tests: k8s: k8s-parallel auto-generated policy Auto-generate policy for k8s-parallel.bats. Fixes: #9096 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 05:57:30 +00:00
Dan Mihai	73a8b61c2e	Merge pull request #9243 from microsoft/danmihai1/genpolicy-unblock-ci genpolicy: disable env variable verification	2024-03-06 21:44:18 -08:00
Dan Mihai	e61ef30a76	genpolicy: disable env variable verification Disable env variable verification to unblock CI, until container images that don't specify the Env variables will be handled correctly (see #9239). Also, mark the image config Env field as optional, thus allowing policy generation for these container images. Fixes: #9240 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-03-07 01:59:18 +00:00
Gabriela Cervantes	94fdcda7f7	scripts: Add collect artifacts function in nerdctl gha run script This PR adds the collect artifacts function in nerdctl gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-06 19:48:12 +00:00
Gabriela Cervantes	f902ee78d0	gha: Add collect artifacts step to nerdctl workflow This PR adds the collect artifacts step to nerdctl workflow. Fixes #9241 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-06 19:41:16 +00:00
GabyCT	640ed591bd	Merge pull request #9219 from GabyCT/topic/fixkerneldoc docs: Remove stale kernel information at README documentation	2024-03-06 10:24:31 -06:00
James O. D. Hunt	b1d4cbd9d1	utils: spell-checker: Fix grep warning Fix the `grep(1)` warning caused by the unnecessary escaping of the hash/sharp symbol. Fixes: #9235. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-06 13:21:15 +00:00
James O. D. Hunt	5257bfa9a9	docs: install: Simplify instructions Move the "build from source" and "manual installation" details to the developer guide. This makes the installation landing page clearer for users. Fixes: #9234. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-06 13:14:03 +00:00
Ryan Savino	fdfc825bc4	Merge pull request #9174 from ryansavino/snp-qemu-stable-coco-tag versions: SNP qemu updated to stable coco tagged version	2024-03-06 01:03:10 -06:00
GabyCT	83e39a206c	Merge pull request #9223 from jodh-intel/tests-add-k3s-artifacts tests: Add k3s artifacts	2024-03-05 13:37:21 -06:00
James O. D. Hunt	a67ed2f1c2	tests: Add k3s artifacts The k3s distribution of k8s uses an embedded version of containerd and configures it to log to a file, not the journal. Hence, although we collect the journal as a test artifact, we also need to collect the actual log files for containerd. Also collect the k3s containerd config files to help with debugging. Fixes: #9104. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-05 17:54:20 +00:00
GabyCT	9fab57acc8	Merge pull request #9217 from wainersm/revert_collect_artifacts gha: export start_time to collect artifacts properly	2024-03-05 11:11:49 -06:00
Gabriela Cervantes	12be4cf828	docs: Remove stale kernel information at README documentation This PR removes stale kernel information at README documentation. Fixes #9218 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-05 16:46:45 +00:00
Wainer dos Santos Moschetta	9ba5e3d2a8	gha: export start_time to collect artifacts properly The jobs running on garm will collect journal information. The data gathered is based on the time the tests started running. The $start_time is exported in run_tests() and used in collect_artifacts(). However, run_tests() and collect_artifacts() are called in different steps of the workflow and environment variables aren't preserved between them, i.e., $start_time exported in the first step is not available in subsequent ones. To solve that issue, let's save $start_time in the file pointed to by $GITHUB_ENV, which GitHub Actions uses to export variables. If $GITHUB_ENV is empty, the script is probably running locally outside of GitHub, so it won't save the start time value. Fixes #9217 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-05 12:15:20 -03:00
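The `$GITHUB_ENV` approach described in `9ba5e3d2a8` can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `save_start_time` and the date format are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical helper (name and date format are assumptions, not taken
# from the commit): record when the tests started and persist the value
# for later workflow steps.
save_start_time() {
    start_time="$(date '+%Y-%m-%d %H:%M:%S')"
    export start_time

    # GitHub Actions reads NAME=value lines from the file named by
    # $GITHUB_ENV and exposes them to subsequent steps. When the variable
    # is empty we are probably running locally, so nothing is saved.
    if [ -n "${GITHUB_ENV:-}" ]; then
        echo "start_time=${start_time}" >> "${GITHUB_ENV}"
    fi
}
```

A later step could then read `$start_time` from its environment, for example to pass it to `journalctl --since`.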
James O. D. Hunt	b761a80bd1	Merge pull request #9059 from jodh-intel/kata-manager-add-hypervisor-option kata-manager: Allow hypervisor to be changed	2024-03-05 09:30:04 +00:00
Alex Lyn	bf5edc8e73	Merge pull request #9155 from Jimmy-Xu/fix-build-gpu-kernel gpu: fix build guest kernel with gpu	2024-03-05 16:53:44 +08:00
Greg Kurz	0320198889	Merge pull request #9206 from lifupan/main CI: fix the issue of ci failure on crio	2024-03-05 09:52:13 +01:00
Fupan Li	628f57aca0	Merge pull request #9193 from UiPath/fix/clh-dax clh: Enable DAX for rootfs	2024-03-05 09:39:22 +08:00
Wainer Moschetta	38088a934b	Merge pull request #9184 from wainersm/fix_kata_deploy_bats tests/kata-deploy: fix checker for kata-deploy running	2024-03-04 20:50:37 -03:00
GabyCT	77d048da4d	Merge pull request #9065 from wainersm/ci_install_kbs CI: Install KBS on k8s for attestation tests	2024-03-04 16:59:01 -06:00
GabyCT	a4153f3b71	Merge pull request #9210 from GabyCT/topic/addtestreadme docs: Add general README for tests section	2024-03-04 16:54:28 -06:00
Gabriela Cervantes	5d50262422	docs: Add general tests documentation in main README This PR adds the general tests documentation in main README of the kata containers repository. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 21:53:01 +00:00
Gabriela Cervantes	d5fa2bebd5	docs: Add general README for tests section This PR adds general README documentation for the tests section in the kata containers repository. Fixes #9209 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 21:50:37 +00:00
GabyCT	4dea9019ab	Merge pull request #9126 from GabyCT/topic/addartifactsk gha: Storing artifacts for logs of k8s tests garm	2024-03-04 15:41:54 -06:00
Gabriela Cervantes	fc5e040d96	scripts: Apply general fixes to variables in gha-run script This PR applies general fixes to variables in gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-04 18:54:15 +00:00
James O. D. Hunt	7af892f8d8	docs: Update kata-manager docs for switching hypervisor Add details to the README for `kata-manager` showing how to list available hypervisor configs (packaged and local), and switch between the configurations. Also, update the hypervisors page to show a lot more detail about the hypervisor configurations, including the "short name" used by `kata-manager` for switching hypervisor config. > Note: > > These changes only apply to the current default golang runtime. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:24:31 +00:00
James O. D. Hunt	4f6fef1f61	docs: Whitespace fix Remove extraneous whitespace from hypervisors doc. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:18:05 +00:00
James O. D. Hunt	1ac3caf656	kata-manager: Allow hypervisor to be changed Add new options to allow the configured hypervisor to be changed: - `-L`: List available _packaged_ hypervisor config short names. - `-e`: List available _local_ hypervisor config names. - `-H <hypervisor>`: Install Kata then switch to the specified hypervisor. - `-S <hypervisor>`: Switch to the specified hypervisor (by config short name [Errors if Kata not installed]). For example, to install Kata and configure it to use Cloud Hypervisor with the golang Kata runtime: ```bash $ kata-manager.sh -H clh ``` To switch back to the default hypervisor: ```bash $ kata-manager.sh -S default ``` To show details of the available packaged configs: ```bash $ kata-manager.sh -L ``` To show details of the local configs: ```bash $ kata-manager.sh -e ``` > Notes: > > - This change only applies to the current default (golang) Kata runtime. > > - Although this is mainly for users wishing to switch hypervisor (by > changing the Kata config file to another of the packaged config files > provided for specific hypervisors), strictly it allows users to change > to _any_ config file. For example, if the user has a config file called > `/etc/kata-containers/configuration-my-custom-config.toml`, they could > switch to this by running: > > ```bash > $ kata-manager.sh -S my-custom-config > ``` > > - The "config short names" are the hypervisor specific part of the configuration file name. > For example, the config short name for file `configuration-qemu.toml` is > `qemu` and the config short name for `configuration-clh.toml` is `clh`. Fixes: #8305. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 12:18:00 +00:00
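The "config short name" convention described in `1ac3caf656` amounts to stripping the `configuration-` prefix and the `.toml` suffix from the file name. A minimal sketch, assuming a hypothetical helper called `config_short_name` (this is not a function from `kata-manager.sh`):

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: derive the config short
# name from a configuration file path.
config_short_name() {
    local file
    file="$(basename "$1")"      # e.g. configuration-qemu.toml
    file="${file#configuration-}" # drop the fixed prefix
    file="${file%.toml}"          # drop the extension
    printf '%s\n' "${file}"
}

config_short_name /etc/kata-containers/configuration-my-custom-config.toml
```

With this convention, the last call prints `my-custom-config`, which is the name the commit message passes to `-S`.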
James O. D. Hunt	0bb558c0b9	kata-manager: Fix symlink handling The `configure_kata()` function modifies the configuration file to enable debug. But it was doing this by calling `sed -i` which, by default, creates a new _file_ from the `configuration.toml` symbolic link. This defeated the point of the symbolic link which is supposed to resolve to the local copy of the pristine config file, so we now use the GNU sed(1) specific `--follow-symlinks` option to retain the sym-link. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
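The symlink behaviour that `0bb558c0b9` works around can be reproduced in a scratch directory. The sketch below assumes GNU sed; the file names and the `enable_debug` line are illustrative placeholders, not copied from `kata-manager.sh`.

```bash
#!/usr/bin/env bash
# Scratch directory with two pristine files and a symlink to each.
workdir="$(mktemp -d)"
printf '#enable_debug = true\n' > "${workdir}/pristine-a.toml"
printf '#enable_debug = true\n' > "${workdir}/pristine-b.toml"
ln -s pristine-a.toml "${workdir}/plain.toml"
ln -s pristine-b.toml "${workdir}/followed.toml"

# Plain in-place edit: GNU sed writes a new file and renames it over the
# path it was given, so the symlink is replaced by a regular file.
sed -i 's/^#enable_debug/enable_debug/' "${workdir}/plain.toml"

# With --follow-symlinks the link target is edited and the link survives.
sed -i --follow-symlinks 's/^#enable_debug/enable_debug/' "${workdir}/followed.toml"

ls -l "${workdir}"
```

In the listing, `plain.toml` is now a regular file while `followed.toml` still points at `pristine-b.toml`.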
James O. D. Hunt	455637b30a	kata-manager: Show message when checking file Add an info message just before the archive file is checked. This keeps the user informed about what is happening as it can take a few seconds to perform the checks on slower systems. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
James O. D. Hunt	ce350450e8	kata-manager: Sort options in usage Ensure the usage statement lists all options in alphabetical order. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
James O. D. Hunt	159d29665a	kata-manager: Whitespace fixes Remove extraneous whitespace. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-03-04 11:15:39 +00:00
Chao Wu	9f0eab904b	Dragonball: fix test_signal_handler a) Some unknown syscalls are triggered on the new GitHub virtual machines, which breaks the make test process with SIGSYS after the SeccompFilter is applied. To fix this, we change the allowlist in this SeccompFilter unit test into a blocklist to avoid hitting the unknown syscalls. b) The lazy static METRICS is not fully initialized in the unit test, which may lead to unstable results for this UT. fixes: #9207 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-04 16:27:27 +08:00
Chao Wu	253fe72435	Dragonball: fix test_handler_insert_region The mmap region's start guest address was hard-coded, and a later check verifies that this address is larger than or equal to mem_end (which defaults to host_phy_mem >> 1) in order to satisfy the requirement for DaxMemory. Since the GitHub virtual machines have more physical memory than the CI machines we used previously, the hard-coded value no longer works. To fix this, we change the address to mem_end in the unit test so that it no longer depends on the host machine. fixes: #9207 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-03-04 16:27:19 +08:00
Jimmy-Xu	5ada7329b8	gpu: fix build guest kernel with nvidia gpu - enable CONFIG_MTRR,CONFIG_X86_PAT on x86_64 for nvidia gpu - optimize -f of build-kernel.sh, clean old kernel path and config before setup - add kernel 5.16.x Fixes: #9143 Signed-off-by: Jimmy-Xu <xjimmyshcn@gmail.com>	2024-03-04 09:40:42 +08:00
Fupan Li	07e0cf1855	CI: fix the issue of ci failure on crio PR #8760 tentatively tried to have the shim to run in its own mount namespace for the sake of improving isolation between the sandbox and the host. Thus crio storage drivers shouldn't create a PRIVATE bind mount on their home directory. Otherwise, the container's rootfs mount wouldn't be propagated to kata runtime's mount namespace, and kata runtime couldn't access the container's rootfs files. So, when kata cooperated with crio, crio should set skip_mount_home=true for its storage overlay. Fixes: #9028 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-03-03 20:53:36 +08:00
Wainer dos Santos Moschetta	2c24977cb1	tests/k8s: allow to overwrite the cluster name _print_cluster_name() creates a string based on information like the pull request number and commit SHA. However, when developing the scripts you might want to use an arbitrary name, so this introduces the $AKS_NAME variable which, once exported, overrides the generated name. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta	5e4b7bbd04	tests/k8s: expose KBS service externally Until this point the deployed KBS service is only reachable from within the cluster. This introduces a generic mechanism to apply an Ingress configuration to expose the service externally. The first implemented ingress is for AKS. In case the HTTP application routing isn't enabled in the cluster (this is required for ingress), an add-on is applied. Added the get_cluster_specific_dns_zone() and enable_cluster_http_application_routing() helper functions to gha-run-k8s-common.sh. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:42:35 -03:00
Wainer dos Santos Moschetta	e1e0b94975	tests/k8s: introduce the CoCo kbs library Introduce the tests/integration/kubernetes/confidential_kbs.sh library that contains functions to manage the KBS on CI. Initially implemented the kbs_k8s_deploy() and kbs_k8s_delete() functions to, respectively, deploy and delete KBS on Kubernetes. Also hooked those functions in the tests/integration/kubernetes/gha-run.sh script to follow the convention of running commands from Github Workflows: $ .tests/integration/kubernetes/gha-run.sh deploy-coco-kbs $ .tests/integration/kubernetes/gha-run.sh delete-coco-kbs Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:39:26 -03:00
Wainer dos Santos Moschetta	6a28c94d99	tests/k8s: add a kustomize installer Kustomize has been used on some of our internal components (e.g. kata-deploy) to manage k8s deployments. On CI the `sed` tool has been used to edit kustomization.yaml files, but `kustomize` is more suitable for that purpose. So in order to use that tool on CI scripts in the future, this commit introduces the `install_kustomize()` function that is going to download and install the binary in /usr/local/bin in case it's not found on $PATH. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-03-02 12:39:26 -03:00
Xuewei Niu	daab76de36	Merge pull request #9201 from liubogithub/liubo/dev/panic_fix_3 katautils: fix panic on tracing.	2024-03-02 10:27:02 +08:00
GabyCT	4a0cfc4e3f	Merge pull request #9199 from GabyCT/topic/enablecri gha: Enable cri-containerd tests for cloud hypervisor runtime-rs	2024-03-01 12:23:16 -06:00
Steve Horsman	1ec33b8879	Merge pull request #9200 from wainersm/ci_install_kbs-timeout gha: increase timeout of KBS steps	2024-03-01 16:00:21 +00:00
Gabriela Cervantes	7299dbdb43	gha: Store journalctl logs This PR stores the journalctl logs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Gabriela Cervantes	342d3a320d	gha: Add collect artifacts function in gha-run script This PR adds the collect artifacts function in gha-run script for the kubernetes tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Gabriela Cervantes	2070e3481e	gha: Storing artifacts for logs of k8s tests garm This PR helps to store the artifacts for different logs for k8s tests on garm. Fixes #9103 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-03-01 15:17:20 +00:00
Greg Kurz	df17bf95d5	Merge pull request #9169 from ldoktor/backport-ocp ci.ocp: Backport service-up detection fixes	2024-03-01 16:09:55 +01:00
Greg Kurz	dc6bda19bf	Merge pull request #9179 from gkurz/fix-k8s-sandbox-vcpus-allocation-check tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29	2024-03-01 15:55:07 +01:00
Lukáš Doktor	6fffbaa190	ci.ocp: Backport service-up detection fixes This backports the: 9060e930caf2d20f413df07778d3ab497493161c ci.ocp: Add debug output on HTTP service failure these logs are vital to analyze a setup failure. a10a1e2c9cbc21afc1e80f22b0fb8634d27cbd8d ci.ocp: Improve the service-up detection waiting for the first response is not sufficient as OCP returns html page without error even when the route is not yet established describing the issue (why it doesn't reply with 500?). Waiting for the correct output should do better. commits from the kata-containers/tests repo. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-03-01 12:04:20 +01:00
Alex Lyn	13a20957cb	Merge pull request #9164 from Apokleos/directvol-csi-dockerfile csi-kata-directvolume: add Dockerfile for building csi image	2024-03-01 18:12:19 +08:00
Alex Lyn	f69428a1e7	csi-kata-directvolume: add Dockerfile for building csi image Fixes: #9163 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-03-01 10:41:51 +08:00
Liu Bo	b6f8355ea3	katautils: fix panic on tracing. This fixes a panic on tracing on container exit. The root cause is that global var needs to be set by "=" instead of ":=". Fixes: #9102 Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2024-02-29 18:40:23 -08:00
Wainer dos Santos Moschetta	24c163e6e1	tests/kata-deploy: fix checker for kata-deploy running Currently, the check that kata-deploy is running assumes that the daemonset has scheduled at least one pod; however, it might not have, and the kubectl wait command then fails due to "error: no matching resources found". On CI I've observed that failure intermittently. I suspect the service account kata-deploy-sa takes a while to show up, so no kata-deploy pod is scheduled in the meantime. Changed the checker logic to use waitForProcess() to keep testing whether it is already running, until it hits the timeout (still 10m). Fixes #9183 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-29 22:26:27 -03:00
Wainer dos Santos Moschetta	4410df7233	gha: increase timeout of KBS steps The step to deploy KBS on run-k8s-tests-on-aks workflow should be increased so that there is enough time for checking the service is healthy and exposed. Likewise the step that builds the kbs-client which requires enough time to build the executable. Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-29 22:05:58 -03:00
Dan Mihai	11b603e5f1	Merge pull request #9139 from microsoft/saulparedes/genolicy_panic_subpath genpolicy: panic when we see a volume mount subpath	2024-02-29 12:18:56 -08:00
Gabriela Cervantes	beb592b309	gha: Enable cri-containerd tests for cloud hypervisor runtime-rs This PR enables the cri-containerd tests for cloud hypervisor runtime-rs. Fixes #9198 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-29 20:18:16 +00:00
GabyCT	a4f5815f6b	Merge pull request #9182 from GabyCT/topic/addclhcri gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests	2024-02-29 14:12:01 -06:00
Gabriela Cervantes	0f595cf15b	gha: General variable fixes to gha-run script This PR adds general variable fixes to gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-29 18:15:27 +00:00
Alexandru Matei	6856e8f678	clh: Enable DAX for rootfs Fixes: #9192 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2024-02-29 18:01:47 +02:00
Greg Kurz	f3442cdef9	tests: k8s: Adapt k8s-sandbox-vcpus-allocation.bats to kubernetes v1.29 Kubernetes v1.29 introduced a new `PodReadyToStartContainers` condition that gets inserted at index 0 in the conditions array. This means that the expected `PodCompleted` reason can now be either at index 0 with kubernetes v1.28 and older or at index 1 starting with kubernetes v1.29. This is fragile at best since the `kubectl wait` doesn't allow to combine multiple checks. Also, checking the reason is dubious as it doesn't really tell if the pods have actually completed or not. Check the pod phase to be `Succeeded` instead, this guarantees that : > All containers in the Pod have terminated in success, and will not > be restarted. Fixes #9178 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-29 17:00:31 +01:00
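The phase-based check from `f3442cdef9`, combined with the single `kubectl wait` invocation from `f89120662d`, might look roughly like the sketch below. It relies on `kubectl wait --for=jsonpath=...`, which is available in recent kubectl releases; the wrapper name `wait_pods_succeeded` and the `KUBECTL` override variable are assumptions for the example, not taken from the bats test.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: wait until every given pod reaches the Succeeded
# phase, using one kubectl invocation for all pods.
# Usage: wait_pods_succeeded <timeout> <pod>...
wait_pods_succeeded() {
    local timeout="$1"
    shift
    # KUBECTL can be overridden (an assumption made here so the command
    # line can be inspected without a cluster); defaults to kubectl.
    "${KUBECTL:-kubectl}" wait \
        --for=jsonpath='{.status.phase}'=Succeeded \
        --timeout="${timeout}" \
        "$@"
}
```

Checking the phase avoids depending on the index of a condition in `.status.conditions`, which is what shifted between Kubernetes v1.28 and v1.29 according to the commit message.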
Greg Kurz	f89120662d	tests: k8s: Wait for all pods concurrently A single invocation of `kubectl wait` can handle all pods. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-29 17:00:31 +01:00
Greg Kurz	58bc026656	Merge pull request #9180 from fidencio/topic/actually-add-the-pause-image-into-the-rootfs rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs	2024-02-29 13:56:32 +01:00
Chengyu Zhu	c01ba58b3d	Merge pull request #9176 from ChengyuZhu6/stale_doc docs: renew stale link	2024-02-29 18:35:26 +08:00
Fabiano Fidêncio	1d2f7afd1f	Merge pull request #9188 from fidencio/topic/releases-follow-up-II releases: Second round of follow-up fixes	2024-02-29 10:59:36 +01:00
Fabiano Fidêncio	c9dfe49152	gha: payload: Fix env var declarations This was introduced by `a45988766c`, but didn't follow the correct format for the env declaration. Fixes: #9064 - part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:52:49 +01:00
Fabiano Fidêncio	1c3a769822	gha: payload: Don't use concurrency for this job We want all payloads to be built and published, regardless if there's a new PR merged. This will help people to easily trace / debug issues. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:52:45 +01:00
Fabiano Fidêncio	02af62b66c	gha: payload: Stop generating payloads for the stable branches We've decided to not maintain stable branches anymore, thus we can only trigger this workflow for the `main` branch. For more details, please, see: https://github.com/kata-containers/kata-containers/issues/9064 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-29 10:42:25 +01:00
Fabiano Fidêncio	b4061a1c23	Merge pull request #9170 from fidencio/topic/releases-follow-up-I release: Add the needed fixes for the release process	2024-02-29 10:36:20 +01:00
ChengyuZhu6	e5d3627794	docs: renew stale link Renew the stale link "https://github.com/containerd/containerd/tree/main/runtime/v2" to the latest "https://github.com/containerd/containerd/tree/main/core/runtime/v2". Fixes: #9177 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-29 15:03:02 +08:00
Fabiano Fidêncio	0022474164	rootfs: Fix PAUSE_IMAGE_TARBALL addition to the rootfs We were never passing the arguments to add the PAUSE_IMAGE to the rootfs, leading to it never being present in the confidential image / initrd. Fixes: #9032 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 22:42:27 +01:00
GabyCT	aacbbde35d	Merge pull request #9172 from GabyCT/topic/docpradvice docs: Update Code PR advice document	2024-02-28 13:37:28 -06:00
Gabriela Cervantes	3cd319fcc2	scripts: General fixes to the gha-run script This PR implements general fixes to the gha-run script for the cri-containerd tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:32:51 +00:00
Gabriela Cervantes	5a498948c8	scripts: Skip cri-containerd in gha-run script This PR skips the cri-containerd in gha-run script for cloud hypervisor runtime-rs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:30:38 +00:00
Gabriela Cervantes	4bfb9c30e7	gha: Add cloud-hypervisor (runtime-rs) support to cri-containerd tests This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the cri-containerd tests. Fixes #9181 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 19:24:18 +00:00
Wainer Moschetta	c4b8270073	Merge pull request #9009 from wainersm/runk_bats tests/runk: fix the "run ps command" flaky test	2024-02-28 15:58:36 -03:00
Wainer Moschetta	129ce84705	Merge pull request #9116 from wainersm/ci_install_kbs-workflow gha: k8s: prepare AKS workflow to install the CoCo KBS	2024-02-28 14:43:41 -03:00
Gabriela Cervantes	ec1dde1d01	docs: Update Code PR advice document This PR updates the code pr advice document to make the proper references now that we have moved the test repository to the kata containers repository. Fixes #9171 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-28 16:14:22 +00:00
Ryan Savino	9e9dae8efb	versions: SNP qemu updated to stable coco tagged version New qemu fork of AMDESE created in confidential-containers project. SNP qemu version now pointed to stable tag at: https://github.com/confidential-containers/qemu/tree/amd-snp-202402240000 Fixes: #9173 Signed-Off-By: Ryan Savino <ryan.savino@amd.com>	2024-02-28 09:28:14 -06:00
Fabiano Fidêncio	068d80a9cb	docs: releases: Update link for the release actions This allows users to go directly to the action page whenever a release needs to be cut. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	520cd90c43	release: Remove the "test-" from the release version This is not needed anymore as we can run the tests from any branch, and we can patch this locally before doing a test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	22b19d0637	release: Add a step to get the release tags GitHub actions is fun and always willing to play tricks with us. This nice little kid decided that `echo "FOO=\"bar zaz\"" >> $GITHUB_ENV` is not valid, and it simply breaks things in a way that is a pain to debug. But hey, we take this path, and after doing so I realised that the correct way to export that is `echo "FOO=bar zaz" >> $GITHUB_ENV`. I know, this looks incorrect, but this fellow never stops surprising us. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
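The quoting pitfall described in `22b19d0637` can be illustrated locally. The sketch assumes the runner takes everything after the first `=` on a line as the literal value; `read_env_value` is a hypothetical stand-in for that parsing, not GitHub's implementation, and the temporary file stands in for `$GITHUB_ENV`.

```bash
#!/usr/bin/env bash
# Temporary file standing in for $GITHUB_ENV.
env_file="$(mktemp)"

# Escaped quotes end up inside the file, and thus inside the value.
echo "FOO=\"bar zaz\"" >> "${env_file}"
# Without inner quotes the value is exactly: bar zaz
echo "BAR=bar zaz" >> "${env_file}"

# Hypothetical parser: split each line at the first '=' and print the
# value recorded for the requested name.
read_env_value() {
    local name="$1" file="$2" key value
    while IFS='=' read -r key value; do
        if [ "${key}" = "${name}" ]; then
            printf '%s\n' "${value}"
            return 0
        fi
    done < "${file}"
    return 1
}

read_env_value FOO "${env_file}"
read_env_value BAR "${env_file}"
```

Under that assumption the first lookup returns the value with its double quotes still attached, while the second returns the bare string, which matches the form the commit message settles on.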
Fabiano Fidêncio	cdf1e4afde	release: Fix typo in the arm arch For some reason I'd changed arm64 to arm4 in a previous (already merged) commit. :-/ Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	3db0630bc1	release: Add our own bits to the release notes I'm getting here the most relevant parts of what we had as part of the release-notes.sh script. As the script will not be used anymore, it's been removed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	aaf38aca98	release: Fix typo in the _upload_libseccomp_tarball() RELEASE_VERSIOB -> RELEASE_VERSION Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:56 +01:00
Fabiano Fidêncio	397167836b	release: Fix yq installation For some reason we need to force its installation in the GOPATH, otherwise yq is not found. Ideally we should switch to a packaged version of yq, but that's a topic for another series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:55 +01:00
Fabiano Fidêncio	6915131adc	release: Fix KATA_DEPLOY_{IMAGE_TAGS,REGISTRIES} declaration Otherwise we may end up with an unbound variable. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:55 +01:00
Fabiano Fidêncio	757f958943	release: Adjust tags used to publish our daemonset We need to adjust the tags as when this workflow ends up being called from the release side, we'll receive "refs/tags/main" as the GITHUB_REF, and in that case we must use the release version. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:34:51 +01:00
Fabiano Fidêncio	d339366a16	release: Get the release version from our internal function This is utterly counter intuitive, but if we change a file during the GitHub Action, the checkout done for the next workflow won't have that file updated, but rather the branch on its original state when the workflow was created. This makes us safe to always "calculate" the next release version from the VERSION file at the time the workflow was triggered. This requires us to have the release type exported for the whole workflow. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	8023d64b1a	release: Adjust "needs" in the release workflow Without those we'll end up running steps in parallel that should actually wait for a previous step to be completed. While here, let's also correct some of the "needs" that were waiting for the wrong workflow to be finished. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	d10b818de5	release: Add missing return to _check_required_env_var() Otherwise none of the calls to this function will actually continue after it's called. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:06 +01:00
Fabiano Fidêncio	0aa82e7050	release: Add missing env vars to _check_required_env_var() We missed doing this as part of `50011e89a0`, but we also need to check for: * RELEASE_VERSION * GH_TOKEN * ARCHITECTURE * KATA_STATIC_TARBALL While here, let's fix a ARCHITECURE -> ARCHITECTURE typo. Fixes: #9064 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-28 12:30:05 +01:00
Chengyu Zhu	bb4c608b32	Merge pull request #9110 from ChengyuZhu6/agent_option agent: Add all agent configuration options to README	2024-02-28 18:50:44 +08:00
Dan Mihai	352e2af5f0	Merge pull request #9153 from microsoft/danmihai1/clh-bootVM-timeout runtime: clh: minimum 10s timeout for CreateVM + BootVM	2024-02-27 09:58:01 -08:00
Wainer dos Santos Moschetta	b44e0c4e7c	gha: k8s: prepare AKS workflow to install the CoCo KBS Changed the "run k8s tests on AKS" workflows to get the CoCo KBS installed so that we can run attestation tests. The plan is to run attestation tests only on a subset of non-TEE jobs initially, so this commit restricts the KBS installation to the kata-qemu configuration. At this point only stub commands are added to tests/integration/kubernetes/gha-run.sh; they should be implemented in a future commit. Fixes #9058 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-27 13:51:15 -03:00
Wainer Moschetta	6186410e35	Merge pull request #8949 from wainersm/tests_nydus tests/nydus: refactor the teardown()	2024-02-27 09:52:44 -03:00
ChengyuZhu6	731c490ded	agent: Add all agent configuration options to README Add all agent configuration options to README so that users can more easily understand what these options do and how to configure them at runtime. Fixes: #9109 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-27 17:35:19 +08:00
Fabiano Fidêncio	4aa40f1bbb	Merge pull request #9146 from fidencio/topic/releases release: Update everything in this repo related to the release and its process	2024-02-27 10:30:49 +01:00
Fabiano Fidêncio	111bb3ec66	release: Add "test-" into the release name This commit should be merged as it's now, then we trigger a test release, fix whatever has to be fixed, and drop it as soon as we know our workflows are working as expected. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	d69766c0b2	docs: Update the release process Now that we've simplified it by quite a lot, let's update the documentation accordingly. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	a85481110a	releases: Remove scripts that won't be used anymore Those are not needed anymore as we're automating our release process around GitHub actions. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:03 +01:00
Fabiano Fidêncio	e714c37521	gha: Remove workflows related to backporting stuff We're not doing backports anymore, as we're getting rid of the stable branches in favour of having a better release cadence from the main branch. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	3229c777e7	kata-deploy: Remove "stable" yamls As we're not maintaining a stable branch anymore, let's get rid of the kata-deploy stable pieces. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	008293f015	gha: Add release-{major,minor} workflows Those will allow us to cut a release just by a single click, instead of the current process we have. Fixes: #9064 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	f9f04dca2b	gha: release: Update the workflow The release workflow is now updated to be a `workflow_call`, and it includes the steps that had to be manually done in the past, such as updating the needed files and creating the release itself. While on this, the kata-deploy multiarch manifest tags have been updated to match the new release scheme. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	f0675a163a	release: Add _next_release_version() This function returns the version of the next release (the one about to be cut), and it'll be used as part of our new workflow that will take care of the release. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	4675364d8d	release: Add _update_version_file() function Let's add a function that will be responsible for bumping the project's version in the VERSION file, and push it to the branch as part of the release process that will be introduced. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	a99f9026e1	release: Add _create_new_release() This is a helper function that will be used to create a new release as part of our release process workflow (which will still be modified). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	fd699625fe	release: Add _upload_libseccomp_tarball() As the name of the function says, it's responsible for uploading the libseccomp source tarballs as part of our release process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	d517fa54ac	release: Add _upload_vendored_code_tarball() As hinted by the name of the function, this is used to generate and upload the vendored code we have as its own tarball. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	94b30fcb14	release: Add _upload_versions_yaml_file() As the name says, this function will be used to upload the versions.yaml file during a given release process of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	50011e89a0	release: Add _upload_kata_static_tarball This function, as its name says, will be used to upload the kata-static.tar.xz tarballs generated during the release process. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:02 +01:00
Fabiano Fidêncio	a45988766c	release: Add _publish_multiarch_manifest() This function, as its name says, will be used to publish multiarch manifests for the Kata Containers CI and Kata Containers releases. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:01 +01:00
Fabiano Fidêncio	fb2ef32c04	release: Introduce the release.sh helper For now this script does nothing, but we're introducing it in order to reduce the diffs for the next commits in this series. My intention is to have as much as possible related to the release as part of this helper script, and it'll be populated function by function while replacing content that's "hard coded" (and duplicated) on different GitHub actions. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-27 08:34:01 +01:00
GabyCT	1a6c378d26	Merge pull request #9161 from GabyCT/topic/testsreadme docs: Update link for tests in README	2024-02-26 14:50:46 -06:00
Gabriela Cervantes	94615a4fd4	docs: Update link for tests in README This PR updates the link for the tests in README for Kata Containers. Fixes #9160 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-26 15:43:33 +00:00
Wainer dos Santos Moschetta	0f8c36d990	tests/nydus: refactor the teardown() This refactors the teardown() of tests/integration/nydus/nydus_tests.sh: * Moved boilerplate code that kills processes into a loop; * Doesn't leave teardown() if a process failed to get killed, so that other clean-up routines are run; * Checks that the pid exists before attempting to kill the process, to avoid this misleading message: ``` Usage: kill [options] <pid> [...] Options: <pid> [...] send signal to every <pid> listed -<signal>, -s, --signal <signal> specify the <signal> to be sent -q, --queue <value> integer value to be sent with the signal -l, --list=[<signal>] list all signal names, or convert one to a name -L, --table list all signal names in a nice table -h, --help display this help and exit -V, --version output version information and exit For more details see kill(1). ``` Fixes #8948 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:21:43 -03:00
Wainer dos Santos Moschetta	0f0ce9a81b	tests/runk: replace the busybox image It's recommended to avoid images from docker.io to avoid errors related with hitting the pull limits that happens mostly on bare-metal machines. So this replaced the docker.io's busybox with quay.io/prometheus/busybox. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:11:05 -03:00
Wainer dos Santos Moschetta	bba8b5b2b4	tests/runk: fix flaky test The "run ps command" test has failed once in a while because it doesn't wait for the sh command to start within the container; consequently `ps` won't report the number of lines expected. Fixes #8975 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta	28a63070f7	gha: fix step name in run-runk-tests Likely copied from the tracing workflow by mistake. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:29 -03:00
Wainer dos Santos Moschetta	8a606eb94d	tests/runk: convert to bats Migrated runk tests from pure shell script to bats to be consistent with other test suites. The install_dependencies() will install the bats tool locally. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-26 11:09:23 -03:00
Xuewei Niu	bb5e33b33a	Merge pull request #9100 from littlejawa/fix_5738_metrics_memory runtime: remove kata_shim_netdev metric	2024-02-26 19:01:21 +08:00
James O. D. Hunt	0ea30f44cf	Merge pull request #9076 from jodh-intel/add-survey-link-to-release-notes packaging: release notes: Don't show shortlist by default, and add survey link	2024-02-26 10:25:19 +00:00
Steve Horsman	483ecbadf0	Merge pull request #9142 from ChengyuZhu6/protoc build-checks: Install protoc in the ci environments	2024-02-26 09:52:31 +00:00
Dan Mihai	f4509b806b	runtime: clh: minimum 10s timeout for CreateVM + BootVM Relax the timeout for calling CLH's CreateVM + BootVM APIs. When hitting the older 1s timeout, killing a half-booted Guest and retrying the same boot sequence could have been wasteful and resulted in unstable CI testing on slower Hosts. Fixes: #9152 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-24 19:15:57 +00:00
GabyCT	4f3c83cd12	Merge pull request #9115 from GabyCT/topic/adddief scripts: Add an enhanced die function	2024-02-23 12:03:02 -06:00
Saul Paredes	9b7bd376eb	genpolicy: panic when we see a volume mount subpath Based on https://github.com/kata-containers/runtime/issues/2812 Fixes: #9145 Signed-off-by: Saul Paredes <saulparedes@microsoft.com>	2024-02-23 09:56:51 -08:00
James O. D. Hunt	8c72abe38d	packaging: Add link to survey in release notes Add a link in the release notes to the Kata Container survey, to advertise it, and hopefully encourage users to take the survey. Fixes: #9074. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-23 09:57:52 +00:00
James O. D. Hunt	0391c0de82	packaging: Add twistie to release notes shortlog Add a "twistie" / arrow (`▶`) that the user can click on to see the full list of commits _if they want to_. This way, the release notes become easier to read and we can display information below the shortlog which would (probably) normally not be seen due to the huge long list of commits. Fixes: #9075. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-23 09:57:52 +00:00
ChengyuZhu6	3cc55ff8af	build-checks: Install protoc in the ci environments To test PR #8484 for pulling image in the guest with image-rs, the compilation process for the kata-agent relies on protoc: https://github.com/kata-containers/kata-containers/actions/runs/8016317290/job/21898040849?pr=8484 https://github.com/kata-containers/kata-containers/actions/runs/8016534530/job/21898654435?pr=8484 Fixes: #9141 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-23 17:38:13 +08:00
Xuewei Niu	89c76d7d8d	Merge pull request #9125 from gkurz/fix-agent-cgroup-ns agent: Run container workload in its own cgroup namespace (cgroup v2 guest only)	2024-02-23 10:40:17 +08:00
Steve Horsman	e342a9adc4	Merge pull request #9119 from ChengyuZhu6/pause-confidential kata-deploy: Add pause image to confidential rootfs	2024-02-22 17:10:55 +00:00
Steve Horsman	531dcd2f25	Merge pull request #9132 from ChengyuZhu6/nydus-snapshotter-version gha: bump nydus snapshotter version to v0.13.8	2024-02-22 17:10:42 +00:00
Steve Horsman	dfa6e932bb	Merge pull request #9122 from ChengyuZhu6/snapshotter-clean gha: try to cleanup nydus snapshotter before deploying it	2024-02-22 13:30:04 +00:00
Julien Ropé	1c306fe4a6	runtime-rs: stop reporting net dev metrics for the shim For consistency with the go runtime. As the shim itself is not using the network (all its communication with other processes is done with local unix sockets), there is no reason to keep gathering and reporting shim-specific network metrics. Actual network usage of the kata containers can be found from the existing agent network metrics (kata_guest_netdev_stat). Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-22 14:00:00 +01:00
Julien Ropé	9de65707ca	runtime: stop reporting net dev metrics for the shim As part of the shim network metrics, the shim is reporting network interfaces from the host with no namespace isolation - this gives insight in interfaces not tied to the kata containers, and causes an increase in resource usage for kata metrics. As the shim itself is not using the network (all its communication with other processes is done with local unix sockets), there is no reason to keep gathering and reporting shim-specific network metrics. Actual network usage of the kata containers can be found from the existing hypervisor network metrics (kata_hypervisor_netdev) and from the agent network metrics (kata_guest_netdev_stat). Fixes: #5738 Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-22 14:00:00 +01:00
ChengyuZhu6	8ab3894dc5	gha: try to cleanup nydus snapshotter before deploying it CI failed to deploy nydus snapshotter because it was not cleaned up last time. So we can try to cleanup nydus snapshotter before deploying it. Fixes: #9121 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-22 18:51:14 +08:00
Alex Lyn	5d3ae360ed	Merge pull request #9130 from Apokleos/bugfix-dragonball-invalidOperation runtime-rs: bugfix for GPU passthrough failed with InvalidOperation.	2024-02-22 17:47:09 +08:00
ChengyuZhu6	f16f709a5e	kata-deploy: Add pause image to confidential rootfs For confidential containers, the pause image needs to be installed in the rootfs. Fixes: #9118 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-22 15:41:16 +08:00
ChengyuZhu6	d8db3fb17f	gha: bump nydus snapshotter version to v0.13.8 Bump nydus snapshotter version to v0.13.8 to fix the bug in v0.13.7 : https://github.com/containerd/nydus-snapshotter/pull/582 Fixes: #9131 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-22 15:35:08 +08:00
Alex Lyn	014e0f4e46	runtime-rs: bugfix for GPU passthrough failed with InvalidOperation. We need to initialize pci_hotplug_enabled to true before we do GPU passthrough with runtime-rs/dragonball. Otherwise it fails with the error `InvalidOperation`. Fixes: #9129 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-02-22 10:22:32 +08:00
Dan Mihai	58fbb9f6ec	Merge pull request #9073 from microsoft/danmihai1/test-genpolicy3 tests: k8s: generated policy for additional tests	2024-02-21 14:11:51 -08:00
Dan Mihai	b3c3f992ab	tests: k8s: common clean-up on teardown teardown() gets executed after each test case, so there is no need to clean-up before teardown. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	9c164698d3	tests: k8s: k8s-optional-empty-configmap policy Auto-generate policy for k8s-optional-empty-configmap.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	74a52c6d25	tests: k8s: k8s-oom.bats auto-generated policy Auto-generate policy for k8s-oom.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	26a77d67f4	tests: k8s: k8s-number-cpus auto-generated policy Auto-generate policy for k8s-number-cpus. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	9cbdce15fd	tests: k8s: k8s-memory.bats auto-generated policy Auto-generate policy for k8s-memory.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	40209cc0b7	tests: k8s: k8s-limit-range auto-generated policy Auto-generate policy for k8s-limit-range.bats. Also, fix teardown() namespace. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	df3c0318c6	tests: k8s: add set_namespace_to_policy_settings Add set_namespace_to_policy_settings() for changing the pod namespace in genpolicy settings. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:08 +00:00
Dan Mihai	6e14ce93c9	tests: k8s-kill-all-process-in-container policy Auto-generate policy for k8s-kill-all-process-in-container.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	fad7ba0aea	tests: k8s: k8s-job.bats auto-generated policy Auto-generate policy for k8s-job.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	41c2bcbdc5	tests: k8s: k8s-file-volume auto-generated policy Auto-generate policy for k8s-file-volume.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	d84f50db5b	genpolicy: fix typo in policy logging Improve logging, for easier debugging. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	81e641814f	tests: k8s: k8s-cpu-ns auto-generated policy Auto-generate policy for k8s-cpu-ns.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	bc6d3fc238	tests: k8s: k8s-env.bats auto-generated policy Auto-generate policy for k8s-env.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	0a4fc071ac	tests: k8s: k8s-custom-dns auto-generated policy Auto-generate policy for k8s-custom-dns.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	f693f49e92	tests: k8s: k8s-credentials-secrets policy Auto-generate policy for k8s-credentials-secrets.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	d3d27bbb5b	tests: k8s: k8s-configmap auto-generated policy Auto-generate policy for k8s-configmap.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Dan Mihai	b318535536	tests: k8s: auto-generate k8s-caps.bats policy Auto-generated policy for k8s-caps.bats. Fixes: #9072 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-21 18:08:07 +00:00
Greg Kurz	600b951afd	agent: Run container workload in its own cgroup namespace When cgroup v2 is in use, a container should only see its part of the unified hierarchy in `/sys/fs/cgroup`, not the full hierarchy created at the OS level. Similarly, `/proc/self/cgroup` inside the container should display `0::/`, rather than a full path such as : 0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podde291f58_8f20_4d44_aa89_c9e538613d85.slice/crio-9e1823d09627f3c2d42f30d76f0d2933abdbc033a630aab732339c90334fbc5f.scope What is needed here is isolation from the OS. Do that by running the container in its own cgroup namespace. This matches what runc and other non VM based runtimes do. Fixes #9124 Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
Greg Kurz	14886c7b32	agent: lint code Run cargo-clippy to reduce noise in actual functional changes. Signed-off-by: Greg Kurz <groug@kaod.org>	2024-02-21 13:14:13 +01:00
ChengyuZhu6	cddaf2ce97	kata-deploy: Remove specific kernel/initrd/image leftovers in Makefile Remove specific kernel/initrd/image leftovers in Makefile of local-build, which is the part of #9026. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-21 18:24:10 +08:00
Chelsea Mafrica	241a56989a	Merge pull request #9090 from GabyCT/topic/pulldockerimage gha: docker: Pull docker image as part of the dependencies	2024-02-20 14:28:53 -08:00
GabyCT	ea78013c7e	Merge pull request #9079 from GabyCT/topic/removecilink docs: Update CI link into the README	2024-02-20 14:11:13 -06:00
GabyCT	64c09fe6c5	Merge pull request #9088 from GabyCT/topic/fixnydus gha: nydus: Fix indentation in gha run script	2024-02-20 14:09:54 -06:00
Gabriela Cervantes	ff8a6fa9ef	scripts: Add error script This PR adds the error script to display the error message with much more information to help debugging. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-20 18:30:03 +00:00
Gabriela Cervantes	43a46d5a6b	scripts: Add an enhanced die function This PR adds an enhanced die function in order to dump more information in a yaml format that will help with the debugging. Fixes #9105 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-20 18:27:44 +00:00
Archana Shinde	6d84fe3a37	Merge pull request #8647 from amshinde/cleanup-network Cleanup network to make sure physical interfaces are restored back to original host driver.	2024-02-20 08:59:53 -08:00
Archana Shinde	6d38fa1530	network: Try removing as many changes as possible during network cleanup In case an error is encountered while removing a network endpoint during network cleanup, we currently return immediately with the error. With this change, in case of error we simply log the error and proceed towards removing the next endpoint. With this, we can clean up the network changes made by the shim as much as possible. This is especially important when multiple interfaces are passed to the network namespace using a network plugin like multus. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:08:05 -08:00
Archana Shinde	b005cda689	network: Move up defer block to cleanup network Move the defer for cleaning up the network before the call to add network. This way, any change made by add network is reverted in case of failure. This is particularly important for physical network interfaces, as with this step we make sure that the driver for the physical interface is reverted back to the original host driver. Without this, the physical network interface will remain bound to vfio. Fixes: #8646 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-20 06:06:42 -08:00
Ryan Savino	61ce7455c5	Merge pull request #9086 from niteeshkd/nd_snp_upm packaging: qemu-snp-experimental: support host kernel with gmem	2024-02-19 10:50:13 -06:00
Fabiano Fidêncio	79dc6e95d1	Merge pull request #9108 from fidencio/topic/ci-k8s-fix-wrong-logic-on-confidential-tests ci: k8s: Fix checks used to skip confidential tests	2024-02-19 12:49:57 +01:00
Xuewei Niu	f9307f6852	Merge pull request #9112 from ChengyuZhu6/vendor runtime: fix checksum mismatch error in `make vendor`	2024-02-19 10:54:38 +08:00
ChengyuZhu6	96c297cb37	runtime: fix checksum mismatch error in `make vendor` Fix checksum mismatch error in `make vendor`. Fixes: #9111 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-18 22:22:38 +08:00
Fabiano Fidêncio	3468ac3b6e	ci: k8s: Fix checks used to skip confidential tests This has been introduced by `53bc4a432b`, where the condition was changed. The correct condition is: * If the list of supported tees does not contain the kata hypervisor and the list of supported non tees does not contain the kata hypervisor. The error is that we were checking whether kata-hypervisor would contain the list of supported tees, and that would almost always be false (except in the case where the list had one and only one element). Fixes: #9055 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-18 10:10:45 +01:00
Niteesh Dubey	0538bbfc49	packaging: qemu-snp-experimental: support host kernel with gmem This is required to allow creation of SNP coco on host kernel (e.g. https://github.com/AMDESE/linux ,branch:snp-host-latest) supporting guest private memory for SNP using gmem. Note: This qemu does not work if the host kernel does not support gmem/UPM. Fixes: #9092 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-02-15 16:33:46 +00:00
Wainer Moschetta	db744aa8d2	Merge pull request #9023 from ldoktor/webhook-path tools.kata-webhook: Fix lib path	2024-02-15 12:34:01 -03:00
Fabiano Fidêncio	28b4e5ce51	Merge pull request #9099 from BbolroC/skip-k8s-sandbox-vcpus-allocation-s390x CI\|k8s: Skip vcpu allocation test for s390x	2024-02-15 16:05:18 +01:00
James O. D. Hunt	d1513b2030	Merge pull request #9091 from jodh-intel/packaging-add-kata-manager-script packaging: Add the kata manager script	2024-02-15 13:08:36 +00:00
Hyounggyu Choi	8b3f7f353d	CI\|k8s: Skip vcpu allocation test for s390x A test `vcpu allocation k8s test` exhibits different behavior on s390x For more details, please refer to issue #9093. This commit is to make the test skipped until the issue is resolved on the platform. Fixes: #9093 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-15 12:26:35 +01:00
Fabiano Fidêncio	9178541dfb	Merge pull request #9098 from fidencio/topic/runtime-update-runc-to-v1.1.12 runtime: Update runc to v1.1.12	2024-02-15 09:29:10 +01:00
Fabiano Fidêncio	eea4277fbf	runtime: Update runc to v1.1.12 Although we don't seem to be affected by https://nvd.nist.gov/vuln/detail/CVE-2024-21626, we vendor and use the runc package in a few different places of our code, and we better update the package to its latest release. Fixes: #9097 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-14 23:13:39 +01:00
James O. D. Hunt	8c51e02f55	packaging: Add the kata manager script Add `kata-manager.sh` to the release packages. Fixes: #9066. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:42 +00:00
James O. D. Hunt	e49aeec97f	packaging: Use variable for default binary permissions Create a variable for the default binary permissions. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:35 +00:00
James O. D. Hunt	cc2d96671f	packaging: Remove extraneous whitespace Remove some unnecessary whitespace from a couple of `kata-deploy` files. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-02-14 17:44:08 +00:00
Fabiano Fidêncio	c95c37d2ab	Merge pull request #9026 from fidencio/topic/packaging-remove-tee-specific-leftovers packaging: Remove leftovers from the transition from TEE specific kernel / initrd / image to the "confidential" ones	2024-02-13 22:14:26 +01:00
GabyCT	9cf343779f	Merge pull request #9062 from GabyCT/topic/nonteet tests: Add ability to run non-TEE environments	2024-02-13 14:28:07 -06:00
Fabiano Fidêncio	74c8d243ea	versions: Remove TEE specific kernels We've switched to using the confidential one, instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Fabiano Fidêncio	adbe24c283	versions: Remove non-used tdx / sev image and initrd entries We've switched to using the confidential ones, instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Fabiano Fidêncio	6c3338271b	packaging: kernel: Remove sev/snp/tdx specific stuff Now we're using a "confidential" image that has support for all of those. Fixes: #9010 -- part II #8982 -- part II #8978 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 19:07:33 +01:00
Gabriela Cervantes	598c77409a	gha: docker: Pull docker image as part of the dependencies This PR pulls the docker image needed for the test as part of the dependencies in order to avoid timeout failures, which happen mainly because the image was not properly downloaded and the test is unable to find it. Fixes #9089 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 17:48:31 +00:00
Gabriela Cervantes	53bc4a432b	tests: Add ability to run non-TEE environments This PR adds the ability to run k8s confidential tests in a non-TEE environment. Fixes #9055 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 17:27:55 +00:00
Fabiano Fidêncio	14f4480f12	packaging: Remove specific TEEs image / initrd leftovers Let's remove the targets as those are not built anymore as part of our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 18:03:12 +01:00
Fabiano Fidêncio	0c761f14b3	packaging: Remove specific TEEs kernel leftovers Let's remove the targets as those are not built anymore as part of our CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 18:03:11 +01:00
Fabiano Fidêncio	28488f0790	Merge pull request #9082 from fidencio/topic/cleanup-kata-deploy-leftovers-before-start-a-test tests: Remove kata-deploy-tdx test and ensure kata-deploy is always cleaned up before starting the tests	2024-02-13 18:01:16 +01:00
Gabriela Cervantes	54d1f34650	gha: nydus: Fix indentation in gha run script This PR fixes the indentation in gha run script for nydus. Fixes #9087 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-13 16:53:28 +00:00
Fabiano Fidêncio	a867e19da1	gha: tdx: Stop running kata-deploy tests on TDX We only have one TDX machine, let's not make it busier than needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 14:14:57 +01:00
Fabiano Fidêncio	3877a9f49a	ci: Clean up kata-deploy ds before starting the tests This will ensure no leftovers are in the node, which has been causing the TDX CI to fail every now and then. Fixes: #9081 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 14:10:44 +01:00
Fabiano Fidêncio	8fe7349d3e	Merge pull request #9080 from fidencio/topic/dont-add-the-pause-image-to-the-released-tarball release: Don't ship the pause-image / coco-guest-components as part of the release artefacts	2024-02-13 12:34:29 +01:00
Fabiano Fidêncio	443a5b8327	release: Don't ship the coco-guest-components In the same way that it doesn't make sense to ship the pause-image, it also doesn't make sense to ship the coco-guest-components itself as part of a release artefact. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 09:47:26 +01:00
Fabiano Fidêncio	0462b33a5b	release: Don't ship the pause-image It doesn't make sense to ship the pause-image itself as a release artefact. The reason we build it and cache it is in order to use it inside the rootfs, and that's it; there's no need to ship it as part of the release, at all. Fixes: #9032 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-13 09:45:50 +01:00
GabyCT	00be9ae872	Merge pull request #9070 from microsoft/danmihai1/debug-containers tests: k8s: avoid deleting unrelated pods	2024-02-12 15:24:15 -06:00
Gabriela Cervantes	69b325a31c	docs: Update CI link into the README This PR updates the CI link into the README as currently we are using GHA workflows and they are now part of the kata containers repository. Fixes #9078 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-12 20:53:25 +00:00
Greg Kurz	532567bfe9	Merge pull request #8936 from fidencio/topic/fix-cri-o-ci tests: cri-o: Use packages from pkgs.k8s.io	2024-02-12 10:04:53 +01:00
Dan Mihai	42d13a0f33	Merge pull request #9068 from microsoft/danmihai1/dockerfile-linux-musl-gcc tools: avoid rootfs-image build "ln -s" error	2024-02-11 18:02:53 -08:00
Greg Kurz	d7afd31fd4	Merge pull request #8455 from BbolroC/runtime-rs-qemu-config runtime-rs: Add a new config option for QEMU	2024-02-10 08:48:23 +01:00
Dan Mihai	a21ca9b7c9	tests: k8s: avoid deleting unrelated pods Delete the debugger pod created during the test, rather than already existing debugger pods. Also, send the output of "kubectl delete" to stderr, just in case it's useful for debugging. Fixes: #9069 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-09 22:48:41 +00:00
Dan Mihai	a054462eb7	Merge pull request #9051 from microsoft/danmihai1/k8s-copy-file tests: k8s: k8s-copy-file auto-generated policy	2024-02-09 12:30:49 -08:00
Hyounggyu Choi	05c4c8055c	runtime-rs: Configure argument replacement for QEMU in Makefile Last but not least, all placeholders for argument replacement should be configured to generate a configuration file when `QEMUCMD` is defined. This enriches those variables. Additionally, this involves creating a symbolic link to `configuration-qemu.toml` if QEMU is defined as the default hypervisor. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-09 19:31:20 +01:00
Dan Mihai	fcd005774d	tools: avoid rootfs-image build "ln -s" error Avoid error when building for amd64 using: USE_CACHE=no AGENT_POLICY=yes DEBUG=1 \ tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh \ --build=rootfs-image Fixes: #9067 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-09 17:10:35 +00:00
GabyCT	b8f277676f	Merge pull request #9047 from GabyCT/topic/ukd docs: Remove jenkins reference in kernel documentation	2024-02-09 10:58:06 -06:00
Fabiano Fidêncio	e78a951e03	Merge pull request #8585 from ChengyuZhu6/dependencies-for-guest-pull gha: Setup nydus snapshotter for CoCo tests	2024-02-09 16:45:42 +01:00
Hyounggyu Choi	27cb30d8ce	runtime-rs: Adjust configuration template for runtime-rs There are some variables newly introduced to runtime-rs, such as: - runtime.name - runtime.hypervisor_name - runtime.agent_name - vm_rootfs_driver Additionally some of the placeholders for argument replacement are made hypervisor-specific based on the changes made for dragonball. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-09 16:26:59 +01:00
ChengyuZhu6	97fbf360cc	gha: Cleanup nydus snapshotter by the daemonset Cleanup nydus snapshotter by the daemonset. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-09 14:47:13 +01:00
ChengyuZhu6	43b04fd0c0	gha: Deploy nydus snapshotter by the daemonset We can use daemonset to deploy nydus snapshotter, which will decrease one manual step both for Kata Containers and Confidential Containers CI. Fixes: #8584 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-09 14:47:09 +01:00
Julien Ropé	236c2c7650	tests: cri-o: Update critools version to 1.29 This will also update the version of crio used in kata-monitor tests. Signed-off-by: Julien Ropé <jrope@redhat.com>	2024-02-09 12:15:55 +01:00
Fabiano Fidêncio	344e0580ca	tests: cri-o: Use packages from pkgs.k8s.io CRI-O has moved, for a long time, towards pkgs.k8s.io, see: https://kubernetes.io/blog/2023/10/10/cri-o-community-package-infrastructure/ With this the OBS repo won't be used anymore. Fixes: #8935 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-09 12:15:55 +01:00
Fabiano Fidêncio	03f7cfd429	Merge pull request #9061 from GabyCT/topic/csk tests:k8s: make add_kernel_initrd_anotations function generic	2024-02-09 10:05:58 +01:00
Fabiano Fidêncio	555784268d	Merge pull request #9031 from ChengyuZhu6/guest-pull-rootfs packaging/osbuilder: allow to pull and unpack pause image	2024-02-08 22:21:44 +01:00
Gabriela Cervantes	0b508f301b	tests:k8s: make add_kernel_initrd_anotations function generic This PR makes the add_kernel_initrd_annotations_to_yaml function more generic so that it can later be used for other components. Fixes #9054 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-08 19:30:43 +00:00
Dan Mihai	f139c7dc60	tests: k8s: k8s-copy-file auto-generated policy Auto-generate policy for k8s-copy-file.bats. Fixes: #9050 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 13:26:05 +00:00
Dan Mihai	1179306afa	tests: k8s: additional policy testing utilities 1. add_requests_to_policy_settings allows one or more ttrpc requests from the Host to the Guest. Example: add_requests_to_policy_settings "${policy_settings_dir}" \ "ReadStreamRequest" "WriteStreamRequest" 2. add_copy_from_host_to_policy_settings allows executing on the Guest the commands initiated behind the scenes by "kubectl cp" from the Host to the Guest. Example: add_copy_from_host_to_policy_settings "${policy_settings_dir}" 3. add_copy_from_guest_to_policy_settings allows executing on the Guest the commands initiated behind the scenes by "kubectl cp" from the Guest to the Host. Example: add_copy_from_guest_to_policy_settings "${policy_settings_dir}" \ "/tmp/file.txt" Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 13:25:41 +00:00
Steve Horsman	b99f574522	Merge pull request #9037 from niteeshkd/nd_SevSnpGuest runtime: fix creation of SEV confidential container on SNP enabled host.	2024-02-08 09:29:20 +00:00
ChengyuZhu6	a43edd0c30	rootfs: Install pause image into rootfs Install the pause image into the confidential rootfs image and initrd. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-08 16:49:56 +08:00
Greg Kurz	6ead48ec06	Merge pull request #8986 from pmores/drop-shim-v2-address-value-validation runtime-rs: fix interoperability issues between runtime-rs and cri-o	2024-02-08 09:44:12 +01:00
ChengyuZhu6	42ef6bdcae	osbuilder:rootfs: support to unpack pause image to rootfs This env var will let us pass the pause image tarball to the rootfs builder, which will then just unpack the content into the rootfs. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2024-02-08 16:29:36 +08:00
ChengyuZhu6	53183cba31	workflow: Enable to build pause image in ci Enable building the pause image static tarball for confidential containers cases in the CI environment. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-08 11:23:23 +08:00
ChengyuZhu6	70a84eca9e	packaging: allow to pull and unpack pause image In the Confidential Containers stack, the pause image is managed by the host side, which could configure a malicious pause image, so we need to package a pause image inside the rootfs and not use the pause image from the host. However, skopeo is not available in the 20.04 release, so we cannot directly install skopeo in the rootfs and pull the pause image there. So I plan to handle this task as a static build, which would not be influenced by the system version in the rootfs. The pause image will be part of the Kata Containers rootfs that's used by the Confidential Containers use case. This commit enables the component to be built both locally and in our CI environment with the command: make pause-image-tarball. Fixes: #9032 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>	2024-02-08 11:23:23 +08:00
Dan Mihai	9a780aa98f	genpolicy: improve logging from ExecProcessRequest Additional logging from the ExecProcessRequest rules, for easier debugging. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	dab567bdfa	genpolicy: add easy way to allow CloseStdinRequest For example, Kata CI's k8s-copy-file.bats transfers files between the Host and the Guest using "kubectl exec", and that results in CloseStdinRequest being called from the Host. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	8401adb113	genpolicy: update default values 1. Remove PullImageRequest because that is not used in the main branch. It was used in the CCv0 branch. 2. Add default false values for the remaining Kata Agent ttrpc requests. These changes don't change the functionality of the auto generated Policy, but they help with easier understanding the Policy text and the logging from the Rego rules. Fixes: #9049 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-08 02:21:58 +00:00
Dan Mihai	535db6b29c	Merge pull request #9043 from ChengyuZhu6/assert runtime-rs: fix assert error in `make check`	2024-02-07 18:19:18 -08:00
Dan Mihai	2bb91c9d8f	Merge pull request #8922 from microsoft/danmihai1/k8s-attach-handlers tests: k8s-attach-handlers auto-generated policy	2024-02-07 13:29:50 -08:00
Dan Mihai	01745689e1	Merge pull request #9029 from microsoft/danmihai1/k8s-empty-dirs genpolicy: mount source for non-confidential guest	2024-02-07 11:26:16 -08:00
Dan Mihai	6b5e57f7c7	tests: k8s: address PR review feedback 1. Rename install_kata_common to install_kata_core. 2. Add TODO for better way to install the Kata tools. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 18:51:56 +00:00
Steve Horsman	934d8dca0f	Merge pull request #9045 from ChengyuZhu6/nydus-version nydus: Bump nydus snapshotter version to v0.13.7	2024-02-07 17:20:21 +00:00
Pavel Mores	6346e04cf7	runtime-rs: fix handling of TTRPC_ADDRESS Since cri-o doesn't seem to use the address for event publishing, as mentioned in the previous commit, it will not send it. However, the exact way of not sending it is unfortunately different from what is assumed by runtime-rs. Due to an implementation detail of cri-o, which uses containerd libraries for some low-level tasks, TTRPC_ADDRESS will not be missing from the environment as assumed; instead it will be present with an empty value. This commit contains a small adjustment to account for that and use LogForwarder even if TTRPC_ADDRESS is present, but with an empty value. Fixes #8985 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-02-07 17:01:04 +01:00
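The difference between a variable that is missing from the environment and one that is present but empty, which the commit above turns on, can be illustrated in shell. This is only a sketch of the distinction, not the runtime-rs code (which is Rust); the function name and the sample socket path are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical classifier for the three possible states of TTRPC_ADDRESS.
ttrpc_address_state() {
    if [ -z "${TTRPC_ADDRESS+set}" ]; then
        # ${VAR+set} expands to nothing only when VAR is not set at all.
        echo "unset"
    elif [ -z "${TTRPC_ADDRESS}" ]; then
        # Set, but with an empty value (the case the commit attributes to cri-o).
        echo "empty"
    else
        echo "value"
    fi
}

unset TTRPC_ADDRESS;               ttrpc_address_state
TTRPC_ADDRESS="";                  ttrpc_address_state
TTRPC_ADDRESS="/run/example.sock"; ttrpc_address_state
```

Per the commit message, runtime-rs now takes the same path (LogForwarder) for the "empty" case as for the "unset" case.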
Gabriela Cervantes	ff1ace1c74	docs: Remove jenkins reference in kernel documentation This PR removes the jenkins reference, which is no longer being used, from the kernel documentation. Fixes #9046 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-07 15:44:07 +00:00
ChengyuZhu6	d0b8e6d8f3	nydus: Bump nydus snapshotter version to v0.13.7 Bump nydus snapshotter version to v0.13.7. The new release name of nydus snapshotter is `nydus-snapshotter-v0.13.7-linux-amd64.tar.gz`, which differs from the version used by kata (`nydus-snapshotter-v0.12.0-x86_64.tgz`). Therefore, we need to update the script to obtain the correct nydus snapshotter name. Fixes: #9044 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-07 22:17:05 +08:00
ChengyuZhu6	34c47e08b2	runtime-rs: fix assert error in test in `make check` Fix assert error: error: used `assert_eq!` with a literal bool --> crates/hypervisor/src/ch/inner.rs:218:9 \| 218 \| assert_eq!(state.jailed, false); \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#bool_assert_comparison = note: `-D clippy::bool-assert-comparison` implied by `-D warnings` Fixes: #9042 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-07 19:31:10 +08:00
Archana Shinde	d9ce88ada3	Merge pull request #8704 from amshinde/runtime-rs-clh-implement-persist runtime-rs: implement persist api for cloud-hypervisor	2024-02-07 02:29:33 -08:00
Dan Mihai	dd16bc393f	tests: k8s: k8s-attach-handlers generated policy Automatically generate the test policy for k8s-attach-handlers.bats, if AUTO_GENERATE_POLICY is enabled. Steps: - Create a temporary directory for the current test and copy the common genpolicy settings into this new directory. - Change genpolicy settings in the temp directory to allow the "kubectl exec" command that this test needs. (For CoCo, exec is blocked by the default policy settings) - Auto-generate the policy for the test YAML file. - Test as usual, using the YAML file. - Clean-up the temporary settings described above. Fixes: #8921 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:26:03 +00:00
Dan Mihai	0de407f8b7	tests: k8s: enable AUTO_GENERATE_POLICY Enable AUTO_GENERATE_POLICY for one of the Kata CI K8s test platforms. Additional platforms will be enabled after testing them. When AUTO_GENERATE_POLICY is enabled, create genpolicy settings that are common for all tests. Some of the tests will make temporary copies of these common settings and customize them as needed. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:25:54 +00:00
Dan Mihai	05b2e4f606	tests: k8s: install genpolicy Install the genpolicy app before starting test execution. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:25:42 +00:00
Dan Mihai	8aa8b70573	tests: k8s: add policy test utilities Add script functions useful for auto-generating and testing policy. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:24:06 +00:00
Dan Mihai	24a17a2e1b	tests: k8s: output the names of test files Output the names of test files, for easier search through logs. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:54 +00:00
Dan Mihai	bf533de31a	tests: k8s: add DEBUG support for test scripts Make these scripts easier to debug. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:46 +00:00
Dan Mihai	1b4ef672ef	tests: k8s: reduce namespace name duplication 1. Avoid repeating "kata-containers-k8s-tests". 2. Allow users to specify a different test namespace. 3. Introduce the TEST_CLUSTER_NAMESPACE variable, that will also be useful when auto-generating the Agent Policy for these tests. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:38 +00:00
Dan Mihai	8a5ba5fb34	tests: k8s: allow run_kubernetes_tests.sh exec Allow everyone to directly execute run_kubernetes_tests.sh, for easier local testing. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-07 02:23:30 +00:00
Fabiano Fidêncio	11ba90ebf2	Merge pull request #8958 from fidencio/topic/kata-manager-nerdctl-support kata-manager: Add support for nerdctl installation	2024-02-06 21:33:48 +01:00
GabyCT	d74b6e143f	Merge pull request #8951 from GabyCT/topic/udf metrics: Update packages for TensorFlow ResNet Int8 Dockerfile	2024-02-06 14:29:41 -06:00
GabyCT	6337f300a8	Merge pull request #8628 from GabyCT/topic/enablek8stclh tests: k8s: Enable tests for cloud hypervisor runtime-rs without devicemapper	2024-02-06 14:28:35 -06:00
Niteesh Dubey	3e383674f8	runtime: fix creation of SEV confidential container on SNP enabled host. This is needed to fix the bug which is not allowing to create SEV container on SNP enabled host anymore. This is a regression that was introduced as part of the following commit: `de39fb7d38` Fixes: #9036 Signed-off-by: Niteesh Dubey <niteesh@us.ibm.com>	2024-02-06 19:01:30 +00:00
Hyounggyu Choi	462afcf829	runtime-rs: Copy configuration for QEMU from runtime It makes sense to reuse a configuration template for runtime-golang as a base. This is simply to copy it into the config directory. Fixes: #8441 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-06 19:35:44 +01:00
Fabiano Fidêncio	058f068d67	Merge pull request #9020 from BbolroC/ok-to-test-static-checks-but-x86 gha: Run static-checks on self-hosted runners conditionally	2024-02-06 19:30:21 +01:00
Gabriela Cervantes	cf049fc718	k8s: Skip k8s tests that are not working This PR skips the k8s tests that are not working with cloud hypervisor runtime-rs with its proper issue. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-06 16:52:02 +00:00
Pavel Mores	f0256fded5	runtime-rs: remove validation of shim v2 -address value It appears that under the shim v2 protocol, a shim has no use of its own for the -address value, it just passes it back to container runtime's (mostly containerd or cri-o) event-publishing binary. Since the -address value only flows through the shim, being passed to the shim by a container runtime and then essentially passed back by shim to the container runtime, it seems inappropriate for a shim to validate the value that is fully owned and only used by the container runtime. This commit removes such validation from runtime-rs. Doing so, it solves (part of) an interoperability problem between runtime-rs and cri-o. cri-o seems to intentionally choose not to implement the event-publishing part of the shim v2 protocol and thus it has no value it could pass to runtime-rs for -address. As a result, it sends an empty string which has been failing the excessive validation performed by runtime-rs so far. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-02-06 13:43:09 +01:00
Wainer Moschetta	f1ca5d1563	Merge pull request #8953 from ChengyuZhu6/ci-guest-pull gha: Enable nydus snapshotter in CoCo ci tests	2024-02-06 09:36:59 -03:00
Fabiano Fidêncio	1ccb850ee7	Merge pull request #9027 from fidencio/topic/add-libattest-tdx-into-the-confidential-rootfs rootfs: Add libattest-tdx into the confidential rootfs	2024-02-06 12:52:13 +01:00
Fabiano Fidêncio	ce82b5e3f5	rootfs: Add libtdx-attest into the confidential rootfs This is required as the tdx-attest-rs crate, which is used as part of the guest components, has a runtime dependency on libattest-tdx. Fixes: #9021 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-06 09:13:49 +01:00
Xuewei Niu	67d9847fac	Merge pull request #9025 from wainersm/cri-containerd_fix_loop cri-containerd: fix loop in TestContainerMemoryUpdate()	2024-02-06 14:49:57 +08:00
Amulya Meka	354a3093fa	Merge pull request #9019 from Amulyam24/k8s-fix gha: add GOPATH env var to the ppc64le k8s workflow	2024-02-06 11:01:49 +05:30
Alex Lyn	1ab9a21492	Merge pull request #8552 from deagon/fix/missing-port-type runtime: missing port type in the DeviceInfo	2024-02-06 10:56:46 +08:00
Dan Mihai	473efc2149	genpolicy: mount source for non-confidential guest The emergent Kata CI tests for Policy use confidential_guest = false in genpolicy-settings.json. That value is inconsistent with the following mount settings: "emptyDir": { "mount_type": "local", "mount_source": "^$(cpath)/$(sandbox-id)/local/", "mount_point": "^$(cpath)/$(sandbox-id)/local/", "driver": "local", "source": "local", "fstype": "local", "options": [ "mode=0777" ] }, We need to keep those settings for confidential_guest = true, and change confidential_guest = false to use: "emptyDir": { "mount_type": "local", "mount_source": "^$(cpath)/$(sandbox-id)/rootfs/local/", "mount_point": "^$(cpath)/$(sandbox-id)/local/", "driver": "local", "source": "local", "fstype": "local", "options": [ "mode=0777" ] }, The value of the mount_source field is different. This change unblocks testing using Kata CI's pod-empty-dir.yaml: genpolicy -u -y pod-empty-dir.yaml kubectl apply -f pod-empty-dir.yaml k get pod sharevol-kata NAME READY STATUS RESTARTS AGE sharevol-kata 1/1 Running 0 53s Fixes: #8887 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-06 01:19:48 +00:00
Fabiano Fidêncio	ffa190831d	Merge pull request #9022 from fidencio/topic/add-guest-components-to-the-confidential-image-and-initrd rootfs: confidential: Install coco-guest-components	2024-02-05 18:56:48 +01:00
Hyounggyu Choi	40b2b2a43a	gha: Run static-checks on self-hosted runners conditionally Due to the restrictions on instance provisioning for self-hosted runners, performing static checks (36 jobs at the time of writing) on them each time a PR is updated could significantly burden them, consequently slowing down the entire CI system. To address this, the decision is to trigger these checks only when an 'ok-to-test' label is added. Meanwhile, the checks for x86_64, which are supported by GitHub-hosted runners, will remain unchanged. Fixes: #8998 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-05 15:24:21 +01:00
Wainer dos Santos Moschetta	106e1af497	cri-containerd: fix loop in TestContainerMemoryUpdate() The loop that generate test cases for virtio-mem enabled/disabled doesn't return the integers '1' and '0' as expected. Instead it returns the strings '{1,' and '0}'. Fixes #9024 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-05 10:59:39 -03:00
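The symptom described in the commit above (the strings '{1,' and '0}' instead of 1 and 0) matches how Bash handles a brace list that contains a space: the list is split into two words before brace expansion can apply, so the loop iterates over the literal words. A minimal sketch, assuming the loop was written in Bash; the function and variable names are hypothetical and not taken from the test script.

```bash
#!/usr/bin/env bash
# Hypothetical reproduction: the space after the comma splits the brace
# list into the two words "{1," and "0}", so no brace expansion happens.
broken_cases() {
    for virtio_mem in {1, 0}; do
        printf '%s\n' "${virtio_mem}"
    done
}

# Without the space, Bash expands the list to the words "1" and "0".
fixed_cases() {
    for virtio_mem in {1,0}; do
        printf '%s\n' "${virtio_mem}"
    done
}

broken_cases
fixed_cases
```

Listing the values directly (`for virtio_mem in 1 0`) avoids brace expansion altogether and also works in shells that do not support it.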
Fabiano Fidêncio	27e7974048	rootfs: confidential: Install coco-guest-components Let's install the coco-guest-components into the confidential rootfs image and initrd. Fixes: #9021 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:29 +01:00
Fabiano Fidêncio	f80dbcee0e	rootfs: Add logging about the coco guest components This will make our lives easier to figure out whether the components are being installed or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:29 +01:00
Fabiano Fidêncio	68b8186ec4	osbuilder: Expose COCOGUEST_COMPONENTS_TARBALL We need to pass this to the container where the rootfs is built, so it can actually be unpacked inside the rootfs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:41:28 +01:00
Lukáš Doktor	3b0049b2a4	tools.kata-webhook: Fix lib path When moving the webhook we skipped the common.bash as (close-enough) version is already in `/tests` but we forgot to update the source path, fixing it here. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-02-05 14:17:24 +01:00
Fabiano Fidêncio	64d09874c3	packaging: coco-guest-components: Pass DESTDIR to the build script As DESTDIR was not being passed, we've been installing the final binaries in a container path that was not exposed to the host, leading to creating an empty tarball with the guest components. Now, theoretically, guest-components should respect a PREFIX passed, but that's not the case and we're manually adding "/usr/local/bin" to the passed DESTDIR. Here's the result of the tarball: ```bash ⋊> kata-containers ≡ tar tf build/kata-static-coco-guest-components.tar.xz ./ ./usr/ ./usr/local/ ./usr/local/bin/ ./usr/local/bin/confidential-data-hub ./usr/local/bin/attestation-agent ./usr/local/bin/api-server-rest ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-05 14:07:10 +01:00
ChengyuZhu6	a214bd8d13	gha: Enable nydus snapshotter in CoCo ci tests This PR is a split of #8585. This commit makes the changes to the GitHub workflows and adds the skeleton of deploy_snapshotter() and cleanup_snapshotter() in tests/integration/kubernetes/gha-run.sh. After initially merging this patch to trigger CI jobs for CoCo, which will begin executing the dummy functions deploy_snapshotter() and cleanup_snapshotter(), the implementation details for these functions remain in #8585. Our subsequent step involves transferring this logic to the PR #8484, enabling the PR to undergo CI testing prior to its merge. Fixes: #8997 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-05 18:51:59 +08:00
Fabiano Fidêncio	1362918ff0	Merge pull request #9011 from fidencio/topic/switch-to-using-the-confidential-rootfs runtime: Replace TEE specific initrd / image for the confidential one	2024-02-05 10:43:12 +01:00
Guoqiang Ding	6068faf40b	runtime: failed to run in the case of ColdPlugVFIO Add the missing port type in the DeviceInfo. Fixes: #9014 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-05 17:30:11 +08:00
Fabiano Fidêncio	65013205ed	Merge pull request #9005 from ChengyuZhu6/clang static-checks: Install clang in the ci environments	2024-02-05 09:24:51 +01:00
Archana Shinde	b3c74411f6	runtime-rs: Add tests for persist api for clh Add tests to check clh struct is saved/restored correctly. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:57 -08:00
Archana Shinde	0b78296dca	runtime-rs: Store additional field for hypervisor state Implementing Persist API for cloud-hypervisor was done partially with initial support for cloud-hypervisor. Store and retrieve additional fields to/from the hypervisor state. Fixes: #6202 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:57 -08:00
Archana Shinde	a5f0b92bca	runtime-rs: Add guest protection to hypervisor state Store guest-protection used while storing the state of the hypervisor. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2024-02-04 22:03:54 -08:00
Alex Lyn	cf74166d75	Merge pull request #9015 from Apokleos/bugfix-exec-uds runtime: display accurate error msg to avoid misleading users.	2024-02-05 13:50:43 +08:00
Amulyam24	e59d005568	gha: add GOPATH env var to the ppc64le k8s workflow The filtering of testing cases installs/uses yq and expects GOPATH to be present. Hence, add it to the workflow. Fixes: #9018 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-02-05 10:30:10 +05:30
Alex Lyn	51a82bec3c	Merge pull request #9012 from deagon/fix/monitor-agent-url kata-monitor: fix agentUrl from containerd shim	2024-02-05 10:41:56 +08:00
ChengyuZhu6	f354beb253	static-checks: Install clang in the ci environments To test PR #8484, the compilation process for the kata-agent relies on clang. Failures have been encountered on the ARM, s390x, and ppc64le architectures: ppc64le: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484 s390x: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689401?pr=8484 arm: https://github.com/kata-containers/kata-containers/actions/runs/7754082828/job/21146689026?pr=8484 Fixes: #9004 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2024-02-04 17:00:19 +08:00
Alex Lyn	c6830ceb89	runtime: display accurate error msg to avoid misleading users. The original handling method does not reach user expectations. When the ClientSocketAddress method stats the corresponding path of runtime-rs and has not found it yet, we should return an error message here that includes the reason for the failure (which should be an error display indicating that both runtime-go and runtime-rs were not found). Instead of simply displaying the corresponding path of runtime-rs as the final error message to users. It is also necessary to return the error promptly to the caller for further error handling. Fixes: #8999 Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>	2024-02-04 16:45:59 +08:00
Xuewei Niu	fa01a86334	Merge pull request #9007 from wainersm/aks_delete_rg gha: delete azure RG only if it exists	2024-02-04 16:34:17 +08:00
Guoqiang Ding	7bf1ebe16d	kata-monitor: fix agentUrl from containerd shim Fix the missing leading slash. Fixes: #9013 Signed-off-by: Guoqiang Ding <dgq8211@gmail.com>	2024-02-04 16:24:13 +08:00
Fabiano Fidêncio	d4a9856a84	gha: Remove SEV / SNP / TDX images / initrds We can remove this now that we're relying on the confidential one. Fixes: #9010 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 13:22:07 +01:00
Fabiano Fidêncio	e4258d8694	runtime: Use confidential image / initrd instead of TEE specific ones Now that we have a confidential image / initrd being built, instead of a specific one for each TEE, let's use it everywhere possible. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 13:20:14 +01:00
Fabiano Fidêncio	e0bb632053	Merge pull request #8983 from fidencio/topic/add-confidential-image packaging: Add confidential image / initrd	2024-02-03 12:30:16 +01:00
Fabiano Fidêncio	a9f8888c15	packaging: Add confidential image / initrd Let's use a single rootfs image / initrd for confidential workloads, instead of having those split for different TEEs. We can easily do this now as the soon-to-be-added guest-components can be built in a generic way. Fixes: #8982 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-03 00:58:52 +01:00
Fabiano Fidêncio	7ddb2e5999	Merge pull request #8978 from fidencio/topic/use-the-kernel-confidential-when-possible runtime: packaging: Use confidential kernel instead of the TDX one	2024-02-03 00:29:43 +01:00
Fabiano Fidêncio	e9de0ef6b3	packaging: rootfs: Depend on kernel-confidential tarball Now that we're using the kernel-confidential, let the rootfs depend on it, instead of depending on the TEE specific ones. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:41 +01:00
Fabiano Fidêncio	b58cfc765c	packaging: Ensure rootfs is rebuilt in case kernel changes We need to do this in order to ensure that the measured boot will be taking the latest kernel bits, as needed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	4394dacb88	packaging: Build the confidential kernel with MEASURED_ROOTFS support This is already done for the TDX kernel, and should have been done also for the confidential one. This action requires us to bump the kernel version as the resulting kernel will be different from the cached one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	c7680839f9	packaging: Fix modules tarball for nvidia-gpu-confidential The modules dir has an extra "-nvidia-gpu-confidential" string in its name. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:13:06 +01:00
Fabiano Fidêncio	dc027e39d6	gha: Remove TEE specific kernel build targets We're using the confidential kernel instead from now on. Fixes: #8981 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:41 +01:00
Fabiano Fidêncio	3755c69165	runtime: makefile: remove SNP specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	57b132f94c	runtime: makefile: remove SEV specific kernel references As this is not used anymore, we can go ahead and just remove it Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:12:21 +01:00
Fabiano Fidêncio	2562d23242	runtime: makefile: remove TDX specific kernel references As this is not used anymore, we can go ahead and just remove it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:43 +01:00
Fabiano Fidêncio	f4e3c936d8	runtime: snp: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Fabiano Fidêncio	8731366d7b	runtime: sev: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 21:11:36 +01:00
Wainer dos Santos Moschetta	a04b215bcc	gha: delete azure RG only if it exists delete_cluster() has tried to delete the az resource group regardless of whether it exists. In some cases the failure of that operation (resource group not found) is ignored, but the log messages get a little dirty. Let's delete the RG only if it exists then. Fixes #8989 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-02-02 16:57:20 -03:00
Gabriela Cervantes	eb5b7d3bf8	tests: k8s: Enable tests for cloud hypervisor runtime-rs This PR enables the k8s tests for cloud hypervisor runtime-rs. Fixes #8627 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-02 17:58:58 +00:00
Fabiano Fidêncio	6cbdba7268	runtime: tdx: config: Use the confidential kernel As we're building a single confidential kernel, we should rely on it rather than keep using the specific ones for TDX / SEV / SNP. However, for debuggability's sake, let's do this change TEE by TEE. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:06 +01:00
Fabiano Fidêncio	a618461d3a	runtime: Add confidential kernel to the makefile With this we can properly generate and use the `-confidential` kernel, which supports SEV / SNP / TDX, as part of our configuration files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 17:13:05 +01:00
GabyCT	40d9a65601	Merge pull request #8996 from GabyCT/topic/addclhr gha: k8s: Add cloud-hypervisor (runtime-rs) support	2024-02-02 09:48:35 -06:00
Fabiano Fidêncio	741ed1c8bd	Merge pull request #9001 from fidencio/topic/fix-cache-for-confidential-kernel-part-III packaging: Don't build the confidential / sev kernel twice -- part III	2024-02-02 15:19:41 +01:00
Wainer Moschetta	424fbfe58f	Merge pull request #8654 from ldoktor/openshift-tests ci/openshift-ci: Move openshift-ci from the tests repo here	2024-02-02 10:40:30 -03:00
Fabiano Fidêncio	2ff3f0afc6	packaging: Remove trailing whitespace from extra_tarballs arg This was overlooked during the reviews. Fixes: #6415 -- part III Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	228bc48c73	packaging: Fix kernel confidential name It should be "kernel-confidential" instead of "kernel". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	31b21093b0	packaging: Pass the kernel flavour to get_kernel_modules_dir I made this a required argument during the series and ended up forgetting to add that while calling the function. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:42:02 +01:00
Fabiano Fidêncio	51b1df2333	packaging: Fix typo to get the extra_tarballs path It should've been "${m#*:}" instead of "${m#&:}". Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 12:41:54 +01:00
Fabiano Fidêncio	53e8461db2	Merge pull request #9000 from fidencio/topic/fix-pushing-artefacts-to-registry packaging: Fix pushing artefacts to the registry	2024-02-02 10:21:40 +01:00
Fabiano Fidêncio	0b221b5618	packaging: Fix pushing artefacts to the registry This issue was introduced due to a typo not caught during reviews on `e5bca90274`. Fixes: #6415 -- part II Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-02 10:13:11 +01:00
Wenyuan Liu	cb888516c1	Merge pull request #8760 from fadecoder/reduce_go_runtime_mounts runtime: Reduce the mount points with namespace isolation	2024-02-02 16:54:44 +08:00
Greg Kurz	d1a26ead94	Merge pull request #8454 from BbolroC/compile-with-qemu-s390x runtime-rs: make compilation for QEMU on s390x	2024-02-02 09:29:32 +01:00
Fabiano Fidêncio	0520b272a3	Merge pull request #8987 from fidencio/topic/fix-cache-for-confidential-kernel packaging: cache: Fix caching kernels which rely on extra modules	2024-02-02 09:10:52 +01:00
Amulya Meka	e4252a3fe2	Merge pull request #8957 from Amulyam24/add-k8s-test-ppc64le gha: add kubernetes tests workflow for ppc64le	2024-02-02 10:22:00 +05:30
Fabiano Fidêncio	b2f1235e3c	Merge pull request #8994 from sprt/sprt/switch-aks-eastus ci: aks: switch from eastus2 to eastus region	2024-02-02 00:09:40 +01:00
Hyounggyu Choi	bb6f5073aa	runtime-rs: Allow compilation for s390x Until now, runtime-rs couldn't be compiled on s390x. We need to lift those restrictions in Makefile first. Fixes: #8446 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 23:48:15 +01:00
Dan Mihai	6f1062b5d6	Merge pull request #8966 from microsoft/danmihai1/k8s-sandbox-vcpus-allocation genpolicy: ignore empty YAML as input	2024-02-01 13:51:02 -08:00
Dan Mihai	8f9c92c0ee	Merge pull request #8977 from microsoft/danmihai1/default-namespace genpolicy: support non-default namespace name	2024-02-01 13:50:33 -08:00
Gabriela Cervantes	6771ca463b	gha: k8s: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the kubernetes tests different with devmapper. Fixes #8995 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-02-01 21:22:56 +00:00
Aurélien Bombo	0ace31f041	ci: aks: switch from eastus2 to eastus region This addresses an internal AKS issue that intermittently prevents clusters from getting created. The fix has been rolled out to eastus but not yet eastus2, so we unblock the CI by switching. No downsides in general. This supersedes #8990. Fixes: #8989 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2024-02-01 19:22:42 +00:00
Hyounggyu Choi	8fcee6e6ec	runtime-rs: Use Persist::restore() of QEMU for VirtSandbox It fails to compile virt_container because Dragonball is only used in the implementation of the trait method Persist::restore(). As the hypervisor is not compiled on s390x and QEMU implements the trait method, this commit is to let the method use QEMU's. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 18:02:10 +01:00
Hyounggyu Choi	56aef3741d	runtime-rs: Exclude hypervisors plugins except QEMU for s390x Dragonball and cloud-hypervisor are not supported on s390x. We need to exclude the plugins for these hypervisors from compilation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-02-01 18:02:10 +01:00
Fabiano Fidêncio	5d2906c36a	packaging: Bump the kata config kernel version Just to make sure we won't use cached components. Fixes: #6415 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:57:15 +01:00
Fabiano Fidêncio	d2ea11dbff	packaging: Use the cached kernel modules Till now we didn't have a logic to consume the kernel modules cached tarball. Let's make sure those are consumed as it'll save us a reasonable amount of build time. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:57:15 +01:00
Fabiano Fidêncio	e5bca90274	packaging: Cache the kernel modules This will save us a lot of time, as right now the CI is rebuilding the kernel for absolutely no reason. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:21 +01:00
Fabiano Fidêncio	f481f58659	packaging: Create the tarball for the kernel modules Let's start doing this for the confidential kernels (and also for SEV, till it gets removed). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Fabiano Fidêncio	a58caca723	packaging: Take extra tarballs in install_cached_tarball_component() This allows us to add a map, in the format of: `"tarball1_name:tarball1_path tarball2_name:tarball2_path ..."` With this we have a base to start doing a better job when caching extra artefacts, like kernel modules. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Fabiano Fidêncio	33ac5468fe	packaging: Add function to get the kernel modules directory Right now this is just being added but not used yet. The idea is to use this to both cache and later on untar the kernel modules needed for some of the kernel targets we have (specifically looking at the confidential one). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 16:55:20 +01:00
Zhigang Wang	9317e23df1	mount: Reduce the mount points with namespace isolation This patch can reduce load on systemd process, and increase the k8s deployment density when using go runtime. Fixes: #8758 Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com> Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2024-02-01 18:34:24 +08:00
Fabiano Fidêncio	ed6816e29f	kata-manager: Add support for nerdctl installation As already done for docker, let's also add support for installing nerdctl + kata containers. At least for now, we are explicitly not allowing the combination of installing both docker and nerdctl in the same installation in order to reduce the script complexity. Also, nerdctl installation, for now, is limited to x86_64 and aarch64 as those are the only architectures that nerdctl releases a "full" package for. Fixes: #8358 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-02-01 09:19:35 +01:00
Xuewei Niu	2332552c8f	Merge pull request #7483 from frezcirno/passfd_io_feature runtime-rs: improving io performance using dragonball's vsock fd passthrough	2024-02-01 14:53:53 +08:00
Amulyam24	f8585db8d9	gha: add kubernetes tests workflow for ppc64le This PR adds workflow for running kubernetes test suite on ppc64le. It uses scripts to create and delete the cluster using kubeadm as none of the current cluster creation tools are supported on Power. Fixes: #7950 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-02-01 12:23:11 +05:30
Alex Lyn	cf26c16017	Merge pull request #8931 from yaoyinnan/8930/feat/merge-ValidCgroupPath runtime: merged ValidCgroupPath method	2024-02-01 12:53:55 +08:00
Alex Lyn	a157fc3b74	Merge pull request #8974 from yaoyinnan/5240/fix/cgroup-parallel runtime: add SingleContainer when obtaining OCI Spec	2024-02-01 11:43:02 +08:00
Alex Lyn	1b8f3ce28a	Merge pull request #8929 from yaoyinnan/8838/fix/error-message runtime-rs: report error on missing or empty fields in configuration	2024-02-01 11:02:30 +08:00
Dan Mihai	09ea0eed9d	genpolicy: ignore empty YAML as input Kata CI's pod-sandbox-vcpus-allocation.yaml ends with "---", so the empty YAML document following that line should be ignored. To test this fix: genpolicy -u -y pod-sandbox-vcpus-allocation.yaml Fixes: #8895 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-02-01 02:22:21 +00:00
Dan Mihai	befef119ff	Merge pull request #8941 from malt3/genpolicy-flags genpolicy: allow separate paths for rules and settings files	2024-01-31 18:14:12 -08:00
GabyCT	6db1cd5f65	Merge pull request #8964 from GabyCT/topic/fixnerdcltt tests: Re-arranged nerdctl tests	2024-01-31 15:02:54 -06:00
Dan Mihai	21125baec3	Merge pull request #8962 from microsoft/danmihai1/config-map-optional2 genpolicy: ignore volume configMap optional field	2024-01-31 12:29:30 -08:00
Fabiano Fidêncio	39a64d1447	Merge pull request #8269 from wainersm/kata-deploy_deprecated kata-deploy: fix deprecations on kustomization files	2024-01-31 20:02:01 +01:00
Hyounggyu Choi	9c0312d466	Merge pull request #8956 from BbolroC/agent-build-fix-s390x-ppc64le packaging: Use Ubuntu 20.04 for building an agent	2024-01-31 18:23:16 +01:00
Greg Kurz	8b1dc06971	Merge pull request #8938 from pmores/log-qemus-stderr-in-shim-log runtime-rs: Log qemu's stderr in shim log	2024-01-31 18:04:28 +01:00
Dan Mihai	f0339a79a6	genpolicy: support non-default namespace name Allow users to specify in genpolicy-settings.json a default cluster namespace other than "default". For example, Kata CI uses as default namespace: "kata-containers-k8s-tests". Fixes: #8976 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-31 15:47:01 +00:00
Zixuan Tan	222de4f684	agent: Fix a race condition in passfd_io.rs There is a race condition in agent HVSOCK_STREAMS hashmap, where a stream may be taken before it is inserted into the hashmap. This patch adds simple retry logic to the stream consumer to alleviate this issue. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	6e4d4c329a	agent,runtime-rs: Add license header to passfd_io.rs Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	1206de2c23	agent: Use pipes as stdout/stderr of container process Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>, making some images relying on the special file /dev/stdout(stderr), /proc/self/fd/1(2) fail to boot in passfd io mode, where the stdout/stderr of a container process is a vsock socket. For back compatibility, a pipe is introduced between the process and the socket, and its read end is set as stdout/stderr of the container process instead of the socket. The agent will do the forwarding between the pipe and the socket. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f6710610d1	agent,runtime-rs,runk: fix fmt and clippy warnings Fix rustfmt and clippy warnings detected by CI. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	89be42a177	runtime-rs: open stdout and stderr fifos NONBLOCK This patch adds the O_NONBLOCK flag when opening the stdout and stderr FIFOs to avoid blocking. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	3eb4bed957	agent: use biased select to avoid data loss This patch uses a biased select to avoid stdin data loss in case of CloseStdinRequest. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	7874ef5fd2	agent: set stdout/err vsock stream as blocking before passing to child In passfd io mode, when not using a terminal, the stdout/stderr vsock streams are directly used as the stdout/stderr of the child process. These streams are non-blocking by default. The stdout/stderr of the process should be blocking, otherwise the process may encounter EAGAIN error when writing to stdout/stderr. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Fupan Li	cfb262d02f	container: keep the io connection when pass fd to hybrid vsock We want the io connection to stay connected when containerd closes the io pipe, so that the io stream can be attached to again. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-01-31 21:07:48 +08:00
Fupan Li	4a762fcfdd	dbs: hybrid stream support keep the connection when local closed Support the hybrid fd passthrough mode with a passed pipe fd, which can specify that the connection is kept even when the pipe peer closes, and the connection can be regained by re-opening the pipe. Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	5536743361	agent,runtime-rs: fix container io detach and attach Partially fix some issues related to container io detach and attach. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	657b17a86f	runtime-rs: open stdin fifo with RDWR|NONBLOCK when pass vsock streams In linux, when a FIFO is opened and there are no writers, the reader will continuously receive the HUP event. This can be problematic when creating containers in detached mode, as the stdin FIFO writer is closed after the container is created, resulting in this situation. In passfd io mode, open stdin fifo with O_RDWR|O_NONBLOCK to avoid the HUP event. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	f1b33fd2e0	agent: clean up term master fd when container exits When container exits, the agent should clean up the term master fd, otherwise the fd will be leaked. Fixes: kata-containers#6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	b8632b4034	dragonball: vsock: properly handle EPOLLHUP/EPOLLERR events When one end of the connection closes, the epoll event will be triggered forever. We should close the connection and kill the connection. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	442df71fe5	agent,runtime-rs: refactor process io using vsock fd passthrough feature Currently in the kata container, every io read/write operation requires an RPC request from the runtime to the agent. This process involves data copying into/from an RPC request/response, which incurs high overhead. To solve this issue, this commit utilizes the vsock fd passthrough, a newly introduced feature in the Dragonball hypervisor. This feature allows other host programs to pass a file descriptor to the Dragonball process, directly as the backend of an ordinary hybrid vsock connection. The runtime-rs now utilizes this feature for container process io. It opens the stdin/stdout/stderr fifo from containerd, passes them to Dragonball, and then no longer handles process io, eliminating the need for an RPC for each io read/write operation. In passfd io mode, the agent uses the vsock connections as the child process's stdin/stdout/stderr, eliminating the need for a pipe to bump data (in non-tty mode). Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	eb6bb6fe0d	config: add two options to control vsock passthrough io feature Two toml options, `use_passfd_io` and `passfd_listener_port` are introduced to enable and configure dragonball's vsock fd passthrough io feature. This commit is a preparation for vsock fd passthrough io feature. Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Zixuan Tan	973b5ad1f4	runtime-rs: make Container::new async Fixes: #6714 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2024-01-31 21:07:48 +08:00
Xuewei Niu	5449173102	Merge pull request #8932 from kalil-pelissier/feature/issue-8586/fix-noop-method-call-warning dragonball: fix noop-method-call warning	2024-01-31 19:24:27 +08:00
Malte Poll	531a11159f	genpolicy: allow separate paths for rules and settings files Using custom input paths with -i is counter-intuitive. Simplify path handling with explicit flags for rules.rego and genpolicy-settings.json. Fixes: #8568 Signed-Off-By: Malte Poll <1780588+malt3@users.noreply.github.com>	2024-01-31 11:00:19 +01:00
Hyounggyu Choi	2e1d770fcf	packaging: Track files correctly when naming builder image for agent The necessary files for the agent builder image can be found in `tools/packaging/static-build/agent`, `ci/install_libseccomp.sh` and `tools/packaging/kata-deploy/local-build/kata-deploy-copy-libseccomp-installer.sh`. Identifying the correct files addresses the previously misreferenced path used to name the builder image. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-31 10:49:20 +01:00
yaoyinnan	9aa1ed805a	runtime: add SingleContainer when obtaining OCI Spec When creating a cgroup, add a SingleContainer when obtaining the OCI Spec to apply to ctr, podman, etc. Fixes: #5240 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 15:24:07 +08:00
yaoyinnan	b0b8523cea	runtime: modify ValidCgroupPath unit test Modify ValidCgroupPath unit test. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:17 +08:00
yaoyinnan	feed5c8ff9	runtime: merged ValidCgroupPath method Merged ValidCgroupPath method to handle cgroupv1 and cgroupv2. Fixes: #8930 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 14:37:13 +08:00
yaoyinnan	864389c524	runtime-rs: report error on missing or empty fields in configuration Removed the setting of default values for runtime fields. Added explicit checks for missing or empty fields, reporting errors with clear messages. Fixes: #8838 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-31 12:46:17 +08:00
Wainer dos Santos Moschetta	abc2fcd88f	kata-deploy: fix deprecations on kustomization files By running `kustomize edit fix` on those files they have changed deprecated instructions ('bases' and 'patchesStrategicMerge') as well as 'apiVersion' and 'kind' were added. Fixes #8268 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2024-01-30 18:41:03 -03:00
Lukáš Doktor	4876eadd2f	tools: Add reference to the kata webhook's README The newly added webhook is a new component and ought to be linked from the main README file. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:56 +01:00
Lukáš Doktor	b0b7748f30	ci/openshift-ci: Correct the lib location Correct the lib file locations after the move from tests->kata-containers repo and add a minimized version of the ".ci/lib.sh" library into the "ci/openshift-ci" as we don't really utilize all of the features. Fixes: #8653 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:56 +01:00
Lukáš Doktor	4c58478536	ci/openshift-ci: Move openshift-ci from the tests repo Move the f15be37d9bef58a0128bcba006f8abb3ea13e8da version of scripts required for openshift-ci from "kata-containers/tests/.ci/openshift-ci" into "kata-containers/kata-containers/ci/openshift-ci" and required webhook+libs into "kata-containers/kata-containers/tools/testing" as is to simplify verification, the different location handling will be added in following commit. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2024-01-30 19:05:55 +01:00
Kvlil	3fd5628771	dragonball: fix noop-method-call warning The `noop-method-call` is a rustc lint that has existed since v1.52.0. This lint has been moved to the warn by default lint level since v1.73.0. Therefore build is failing with this version and above. This commit removes the unnecessary call to `<&T as Deref>::deref` on `T: !Deref`. Fixes: #8586 Signed-off-by: Kvlil <kalil.pelissier@gmail.com>	2024-01-30 17:16:49 +00:00
Wainer Moschetta	bf54a02e16	Merge pull request #8924 from microsoft/danmihai1/pod-nested-configmap-secret genpolicy: fix ConfigMap volume mount paths	2024-01-30 14:09:41 -03:00
Gabriela Cervantes	78b517ccc8	tests: Re-arranged nerdctl tests This PR re-arranged the nerdctl tests to avoid random failures. This PR first runs the tests with RunC and then with the kata hypervisor. This PR tries to avoid the random failures that are happening with cloud-hypervisor and clh. Fixes #8963 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-30 16:07:12 +00:00
Dan Mihai	d12875ee66	genpolicy: ignore volume configMap optional field The auto-generated Policy already allows these volumes to be mounted, regardless if they are: - Present, or - Missing and optional Fixes: #8893 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-30 15:32:37 +00:00
Fabiano Fidêncio	7a83e6dc14	Merge pull request #8959 from fidencio/topic/crio-bump-runners-to-2204 gha: cri-o: Bump runners to 22.04	2024-01-30 14:27:40 +01:00
Fabiano Fidêncio	34d51b05f8	gha: cri-o: Bump runners to 22.04 This will not solve the CRI-O CI breakage but will give us an environment where we could get it to run locally. Fixes: #8935 -- part I Thanks to Julien Ropé for trying to reproduce the issues I faced on https://github.com/kata-containers/kata-containers/issues/8935 in an Ubuntu 22.04 system. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-30 14:17:06 +01:00
Xuewei Niu	7e10000b6f	Merge pull request #8928 from yaoyinnan/8927/fix/unused-DriverInfo runtime-rs: fix unused driverInfo error	2024-01-30 20:39:10 +08:00
Hyounggyu Choi	f3bc6e4155	packaging: Use Ubuntu 20.04 for building an agent This involves using Ubuntu 20.04 as a build environment for an agent to match with a runtime environment. Fixes: #8955 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-30 10:22:14 +01:00
Pavel Mores	d53edbd0a5	runtime-rs: collect qemu stderr and log it in shim log Qemu stderr monitoring runs in its own asynchronous green thread. For that, `stderr` is taken out of the Child representing the qemu child process to avoid partial move and make it possible for the main thread still to call functions on QemuInner::qemu_process (e.g. kill(), id()). Fixes #8937 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-30 09:09:05 +01:00
Pavel Mores	684d740122	runtime-rs: switch qemu child process management from std to tokio We'll want to capture qemu's stderr in parallel with normal runtime-rs execution. Tokio's primitives make this much easier than std's. This also makes child process management more consistent across runtime-rs (i.e. virtiofsd child process is already launched and managed using tokio). Some changes were necessary due to tokio functions being slightly different from their std counterparts. Child::kill() is now async and Child::id() now returns an Option. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-30 09:07:14 +01:00
Dan Mihai	6a8f46f3b8	Merge pull request #8918 from microsoft/danmihai1/metadata genpolicy: optional PodTemplateSpec metadata field	2024-01-29 12:36:30 -08:00
Dan Mihai	60ac3048e9	genpolicy: fix ConfigMap volume mount paths Allow Kata CI's pod-nested-configmap-secret.yaml to work with genpolicy and current cbl-mariner images: 1. Ignore the optional type field of Secret input YAML files. It's possible that CoCo will need a more sophisticated Policy for Secrets, but this change at least unblocks CI testing for already-existing genpolicy features. 2. Adapt the value of the settings field below to fit current CI images for testing on cbl-mariner Hosts: "kata_config": { "confidential_guest": false }, Switching this value from true to false instructs genpolicy to expect ConfigMap volume mounts similar to: "configMap": { "mount_type": "bind", "mount_source": "$(sfprefix)", "mount_point": "^$(cpath)/watchable/$(bundle-id)-[a-z0-9]{16}-", "driver": "watchable-bind", "fstype": "bind", "options": [ "rbind", "rprivate", "ro" ] }, instead of: "confidential_configMap": { "mount_type": "bind", "mount_source": "$(sfprefix)", "mount_point": "$(sfprefix)", "driver": "local", "fstype": "bind", "options": [ "rbind", "rprivate", "ro" ] } }, This settings change unblocks CI testing for ConfigMaps. Simple sanity testing for these changes: genpolicy -u -y pod-nested-configmap-secret.yaml kubectl apply -f pod-nested-configmap-secret.yaml kubectl get pods | grep config nested-configmap-secret-pod 1/1 Running 0 26s Fixes: #8892 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-29 16:13:47 +00:00
Gabriela Cervantes	31813cf8d8	metrics: Update packages for TensorFlow ResNet Int8 Dockerfile This PR updates the required packages for the TensorFlow ResNet50 Int8 Dockerfile. Fixes #8950 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-29 16:11:09 +00:00
Fabiano Fidêncio	087856f26c	Merge pull request #8934 from microsoft/danmihai1/nodeName genpolicy: ignore the nodeName field	2024-01-29 16:57:59 +01:00
Greg Kurz	d687b601f1	Merge pull request #8933 from fidencio/topic/package-coco-guest-components packaging: Build coco-guest-components	2024-01-29 16:34:06 +01:00
Zvonko Kaiser	a9348fa35b	Merge pull request #8375 from zvonkok/opa-binary-fix arm64: agent_policy build always pulls amd64 opa binary	2024-01-29 15:10:10 +01:00
Fabiano Fidêncio	5ea6a29c37	Merge pull request #8947 from fidencio/topic/gha-pass-down-AZ_SUBSCRIPTION_ID gha: azure: Set the correct subscription to the account	2024-01-29 15:07:06 +01:00
Fabiano Fidêncio	448c0aaecb	gha: azure: Set the correct subscription to the account Due to the changes done in the CI, we need to set the correct subscription to be used with the account from now on, otherwise we'd end up using CoCo subscription. Fixes: #8946 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-29 15:00:38 +01:00
Pavel Mores	b52a398469	runtime-rs: move creation of VM path from start_vm() to prepare_vm() This fixes a flaw pointed out in review of PR #8185. Creation of the directory semantically fits better into VM preparation than VM launch. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-27 13:46:35 +01:00
Fabiano Fidêncio	98dc2d4c52	rootfs: agent: Initialise AGENT_SOURCE_BIN & AGENT_TARBALL Otherwise those would be unbound if not passed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 19:58:41 +01:00
Fabiano Fidêncio	5e57e0235e	rootfs: agent: Fix build with AGENT_SOURCE_BIN We need to actually check that the env var is not empty. :-) This was introduced by `8307718842`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 19:58:20 +01:00
Fabiano Fidêncio	fbfc880eb6	rootfs: Add COCO_GUEST_COMPONENTS_TARBALL env var This env var will serve us to pass the Confidential Containers guest-components tarball to the rootfs builder, which will then just unpack the content into the rootfs. Fixes: #8848 -- part I Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-01-26 19:58:19 +01:00
Fabiano Fidêncio	644abde35c	packaging: coco-guest-components: Allow building the project The Confidential Containers guest-components will, in the very short future, be part of the Kata Containers rootfs that's used by the Confidential Containers usecase. This commit introduces the ability to, standalone, build the component locally and as part of our CI, and this can be done by calling: `make coco-guest-components-tarball` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Linda Yu <linda.yu@intel.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com> Co-authored-by: Wang, Arron <arron.wang@intel.com> Co-authored-by: zhouliang121 <liang.a.zhou@linux.alibaba.com> Co-authored-by: Alex Carter <alex.carter@ibm.com> Co-authored-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com> Co-authored-by: Xynnn007 <xynnn@linux.alibaba.com>	2024-01-26 19:36:01 +01:00
Hyounggyu Choi	ee072e8a06	Merge pull request #8926 from fidencio/topic/cache-the-agent-for-non-x86_64 gha: Cache the agent for non-x86_64 arches	2024-01-26 18:04:33 +01:00
Dan Mihai	076869aa39	genpolicy: ignore the nodeName field Validating the node name is currently outside the scope of the CoCo policy. This change unblocks testing using Kata CI's test-pod-file-volume.yaml and pv-pod.yaml. Fixes: #8888 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-26 16:30:55 +00:00
Dan Mihai	ef1ee81f81	Merge pull request #8909 from microsoft/danmihai1/main-shareProcessNamespace genpolicy: add shareProcessNamespace support	2024-01-26 05:49:19 -08:00
yaoyinnan	9b7c5c69cf	runtime-rs: fix unused driverInfo error Remove the unused DriverInfo declaration or integrate it into the codebase where applicable. Fixes: #8927 Signed-off-by: yaoyinnan <35447132+yaoyinnan@users.noreply.github.com>	2024-01-26 19:59:52 +08:00
Greg Kurz	f41fa7557a	Merge pull request #8914 from BbolroC/basic-e2e-ibm-se tests: Add IBM SE to the basic confidential test	2024-01-26 12:32:32 +01:00
Fabiano Fidêncio	08a082ca47	gha: Cache the agent for non-x86_64 arches Those are not yet being cached for no reason, and they better be as it'll allow us to save a considerable amount of time building the rootfs. Fixes: #8917 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 12:02:26 +01:00
Fabiano Fidêncio	a7c68225aa	Merge pull request #8916 from fidencio/topic/packaging-reuse-already-built-agent packaging: Don't always build the kata-agent	2024-01-26 12:00:55 +01:00
Fabiano Fidêncio	95c569b0a6	packaging: Add safe.directory to the git config Otherwise building as root will not work, as demonstrated by the arm64 CI. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-26 09:44:43 +01:00
Hyounggyu Choi	ab462a4b89	tests: Add IBM SE to the basic confidential test The existing confidential basic test titled `Test unencrypted confidential container launch success and verify that we are running in a secure enclave` has been updated to incorporate IBM Secure Execution (`qemu-se`). Previously, a secure image was absent from kata-deploy, hindering the inclusion of IBM SE in the test. Thanks to the #6755 update, it is now possible to test the TEE. This modification extends the existing test by introducing `qemu-se`. The specific changes are outlined below: - Add an additional test `cc-se-e2e-tests` to s390x nightly - Expansion of `REMOTE_COMMAND_PER_HYPERVISOR` for `qemu-se` - Temporary exclusion of two test cases currently incompatible with IBM SE (`cpu-ns` is a common issue across all TEEs, while `inotify` will be addressed in a subsequent pull request). Fixes: #8913 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-26 06:04:39 +01:00
GabyCT	c13a63c8ba	Merge pull request #8905 from zvonkok/enable-tpm qemu: enable TPM	2024-01-25 14:52:00 -06:00
GabyCT	aa958adf90	Merge pull request #8904 from GabyCT/topic/buildbq tools: Use defined variable in build base qemu script	2024-01-25 13:51:44 -06:00
GabyCT	36fc2fd83f	Merge pull request #8876 from GabyCT/topic/dockerrestfp metrics: Update packages needed for ResNet50 FP32 Dockerfile	2024-01-25 13:51:16 -06:00
Dan Mihai	8ad5459beb	genpolicy: optional PodTemplateSpec metadata field Add metadata containing the Policy annotation if the user didn't provide any metadata in the input yaml file. For a simple sanity test using a Kata CI YAML file: genpolicy -u -y job.yaml kubectl apply -f job.yaml kubectl get pods | grep job job-pi-test-64dxs 0/1 Completed 0 14s Fixes: #8891 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 19:06:59 +00:00
Fabiano Fidêncio	dd49479829	packaging: Don't build the agent if not needed Let's start relying on the already cached agent to be deployed inside the rootfs. By doing this we save a lot of time in our CI, and we have a better way, for developers, to play with changes in the agent. Fixes: #8915 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:33 +01:00
Fabiano Fidêncio	21fd7e6dfd	packaging: Fail in case oras can't find an artefact It just means the component is not cached, and that it must be built in the usual way. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	eb7a33ee71	rootfs: Always strip the agent binary Let's always do this, regardless of where the agent is coming from. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	f23451de01	rootfs: Add xz as a dep As we'll be untarring the agent tarball (and any other component that may be part of the rootfs) into the rootfs, we have to have xz installed. For debian and ubuntu the package is called xz-utils; for centos, alpine and cbl-mariner the package is called xz. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	8307718842	rootfs: Add AGENT_TARBALL env var This env var will serve us to pass the agent tarball to the rootfs builder, which will then just unpack the content into the rootfs instead of building the agent again. AGENT_TARBALL and AGENT_SOURCE_BIN should never be used together. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Fabiano Fidêncio	5b0d0687e5	packaging: agent: Allow building in all arches We're moving away from alpine and using ubuntu in order to be able to build the agent for all the architectures we need. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 19:41:32 +01:00
Dan Mihai	535cf04edb	genpolicy: add shareProcessNamespace support Validate the sandbox_pidns field value for CreateSandbox and CreateContainer. Fixes: #8868 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 16:48:57 +00:00
Dan Mihai	1e24581c07	Merge pull request #8908 from microsoft/danmihai1/genpolicy-permissions tools: allow all users to execute genpolicy	2024-01-25 08:42:24 -08:00
Dan Mihai	295494c7dc	Merge pull request #8898 from microsoft/danmihai1/show-output-of-passing-tests tests: k8s: bats --show-output-of-passing-tests	2024-01-25 06:22:50 -08:00
Fabiano Fidêncio	1039641ab8	packaging: agent: Add the arch to the builder container This has been missed during reviews and is already a problem as we're trying to build the agent outside of the rootfs for other architectures than x86_64. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 14:11:14 +01:00
Fabiano Fidêncio	58874f9c3e	packaging: tools: Add the arch to the builder container This has been missed during reviews and will become a problem when the tools start to be built in different architectures. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-25 14:10:22 +01:00
Zvonko Kaiser	76efe25aed	Merge pull request #8901 from zvonkok/remove-gha-action gpu: remove GHA target first then remove the obsoleted Makefile targets	2024-01-25 13:40:03 +01:00
Chelsea Mafrica	24b33ae35b	Merge pull request #8884 from GabyCT/topic/ulib versions: Update libseccomp to version v2.5.5	2024-01-24 23:55:32 -08:00
Dan Mihai	723c76d945	tools: allow all users to execute genpolicy This tool can be useful for any users. Fixes: #8907 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-25 00:40:53 +00:00
Zvonko Kaiser	19ecdbca3b	qemu: enable TPM Several use-cases need a vTPM, so let's enable it for QEMU; a follow-up patch will introduce the runtime config. Fixes: #8902 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-24 17:49:08 +00:00
Gabriela Cervantes	98b5a19b3a	tools: Use defined variable in build base qemu script This PR uses a variable that is already defined in the build base qemu script to have uniformity across the script as this variable is already used in the script. Fixes #8903 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-24 17:05:17 +00:00
Zvonko Kaiser	4b8d79c1f6	gpu: remove GHA target first then remove the obsoleted Makefile targets Let's remove the GHA target actions first so that the follow-up PR #8874 tests succeed. Fixes: #8900 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-24 11:43:39 +00:00
Dan Mihai	66c012d052	tests: k8s: bats --show-output-of-passing-tests Add --show-output-of-passing-tests to the k8s integration tests. The output of a passing test can be helpful when investigating a failure of the same test. Fixes: #8885 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-24 03:04:28 +00:00
Hyounggyu Choi	f4290688bb	Merge pull request #7146 from BbolroC/ibm-se-howto-doc docs: provide a guide for how to use IBM Secure Execution	2024-01-23 22:48:05 +01:00
Hyounggyu Choi	25ecca91c6	docs: provide a guide for how to use IBM Secure Execution This PR is to add a document for how to run kata containers under IBM Secure Execution environment. Fixes: #7025 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-23 18:58:27 +01:00
Greg Kurz	0f67a26751	Merge pull request #8812 from kalil-pelissier/feature/issue-7720/drop-dead-code runtime: remove SharedVersions field dead code	2024-01-23 17:46:41 +01:00
Gabriela Cervantes	1b0d12ab78	versions: Update libseccomp to version v2.5.5 This PR updates the libseccomp version to v2.5.5 which includes the following changes: - Update the syscall table for Linux - Fix minor issues with binary tree testing and with empty binary trees Fixes #8883 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-23 16:31:25 +00:00
Zvonko Kaiser	ab597a4d5b	opa: Improve the download logic The versions.yaml has a default for the amd64 binary, but there is no code to actually build the arm64 binary, which seems to be an oversight. Let's simplify the OPA logic by removing the direct link to the binary, and construct that link as part of the checks we do to decide whether we need to build OPA or not. Fixes: #8373 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-23 09:16:16 +00:00
Greg Kurz	4516f38165	Merge pull request #8872 from zvonkok/nvidia-gpu-confidential gpu: Add NVIDIA GPU Confidential kernel target	2024-01-23 09:22:27 +01:00
Dan Mihai	3d2ec5c919	Merge pull request #8857 from microsoft/danmihai1/k8s-gha gha: get ready to install genpolicy	2024-01-22 08:29:24 -08:00
Gabriela Cervantes	eb7e123de8	metrics: Update packages needed for ResNet50 FP32 Dockerfile This PR updates the packages necessary to build the ResNet50 fp32 Dockerfile to run properly the benchmark. Fixes #8875 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-22 16:15:36 +00:00
Zvonko Kaiser	4fc34323ae	gpu: Add NVIDIA GPU Confidential kernel target This is a follow up to the work of minimizing targets, unifying TDX,SNP builds for NVIDIA GPUs Fixes: #8828 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2024-01-22 14:58:57 +00:00
Kvlil	a4b208a712	runtime: remove SharedVersions field dead code The SharedVersions field adds a versiontable property that isn't supported by upstream QEMU. This is dead code since virtcontainers isn't setting SharedVersions to true. Fixes: #7720 Signed-off-by: Kvlil <kalil.pelissier@gmail.com>	2024-01-22 12:18:42 +00:00
Dan Mihai	ea9c659d36	gha: get ready to install genpolicy The changes to install and test genpolicy must come later, after CI picks up these gha changes. Fixes: #8856 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-19 23:37:49 +00:00
GabyCT	bb1ada1a8b	Merge pull request #8855 from GabyCT/topic/updatefc versions: Update firecracker version	2024-01-19 16:25:50 -06:00
Fabiano Fidêncio	1e30fde8fa	Merge pull request #8862 from microsoft/danmihai1/genpolicy-dns genpolicy: ignore pod DNS settings	2024-01-19 23:08:26 +01:00
Dan Mihai	ca03d47634	genpolicy: ignore pod DNS settings Ignore pod DNS settings because policing the network traffic is currently outside the scope of the Agent Policy. Example from Kata CI: pod-custom-dns.yaml Fixes: #8832 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-19 16:42:35 +00:00
Alex.Lyn	826c751bf3	Merge pull request #8185 from pmores/add-qemu-cmdline-generation-framework Add qemu cmdline generation framework	2024-01-19 21:42:49 +08:00
Greg Kurz	b7d6b18768	Merge pull request #8485 from BbolroC/add-unit-test-s390x GHA: Enable static check for s390x, aarch64 and ppc64le	2024-01-19 11:49:16 +01:00
Pavel Mores	25c8d5db5d	runtime-rs: use qemu cmdline generation framework to launch VM Deploy the framework added by the previous commit to generate qemu command line and launch the VM. We now properly store the child process object which allows us to implement remaining Hypervisor functions necessary for a simple but successful VM lifecycle, get_vmm_master_tid() and stop_vm(). Fixes #8184 Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-19 11:42:23 +01:00
Gabriela Cervantes	0696807384	versions: Update firecracker version This PR updates the firecracker version to v1.6.0 which includes the following features - Added support for per net device metrics. In addition to aggregate metrics net, each individual net device will emit metrics under the label "net_{iface_id}". E.g. the associated metrics for the endpoint "/network-interfaces/eth0" will be available under "net_eth0" in the metrics json object. - Added support for per block device metrics. In addition to aggregate metrics block, each individual block device will emit metrics under the label "block_{drive_id}". E.g. the associated metrics for the endpoint "/drives/{drive_id}" will be available under "block_drive_id" in the metrics json object. - Added a new vm-state subcommand to info-vmstate command in the snapshot-editor tool to print MicrovmState of vmstate snapshot file in a readable format. Also made the vcpu-states subcommand available on x86_64. - Added source-level instrumentation based tracing. See tracing for more details. - Added developer preview only (NOT for production use) support for vhost-user block devices. Firecracker implements a vhost-user frontend. Users are free to choose from existing open source backend solutions or their own implementation. Known limitation: snapshotting is not currently supported for microVMs containing vhost-user block devices. See the related doc page for details. The device emits metrics under the label "vhost_user_{device}_{drive_id}". Fixes #8854 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-18 15:50:30 +00:00
Amulyam24	f6fea5f2ca	agent: fix failing unit tests on ppc64le - test_volume_capacity_stats: verify the file block size against the fetched size via statfs() - test_reseed_rng: Correct the request codes for RNDADDTOENTCNT and RNDRESEEDCRNG when platform is ppc64le - test list_routes: Add the route only if destination is not empty - test_new_fs_manager: skip the test if cgroups v2 is used by default - skip test cases rpc::tests::test_do_write_stream, sandbox::tests::test_find_process, sandbox::tests::test_find_container_process and sandbox::tests::add_and_get_container on ppc64le as they are flaky Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:32:16 +01:00
Hyounggyu Choi	610f878894	dragonball: Fix compile error for aarch64 This is to fix a compile error raised for aarch64. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:32:15 +01:00
Amulyam24	376941cf69	kata-ctl: skip building kata-ctl on ppc64le kata-ctl currently fails to build on ppc64le. Skip it when running static checks; the failures will be tracked and fixed in a separate issue. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	4ecd82a5df	runk: skip the test_init_container_create_launcher if not root on ppc64le This is to skip the test_init_container_create_launcher if not root on ppc64le. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	a4b5447924	tools: fix makefile spacing This minor PR removes the extra space in the makefiles. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	394777291d	runtime: fix failing unit tests on ppc64le A few CPU related test cases were failing as the version was being verified against Power8 while the CI machine is Power9. Fixes: #5531 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	486b8a0538	dragonball: skip running static-checks for ppc64le Since dragonball is not currently supported on ppc64le, skip running the targets for static-checks. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Amulyam24	14934c7b0d	github: run static checks on ppc64le This PR adds ppc64le runner to the static-checks workflow. Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	8061a49ca5	kata-ctl: Clean up a test leftover file explicitly It was observed that a temporary file `/tmp/kata_hybrid_vsock02.hvsock` for test_setup_hvsock_failed() is occasionally not removed. This leads to a failure of the same test on the next run due to the file permission on a self-hosted runner. This commit explicitly deletes the file before the check starts. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	290ecf4c46	Static-check: Exclude s390x from dragonball and runtime-rs At the moment, the `dragonball` and `runtime-rs` projects do not support s390x. During the enablement, some errors due to the misconfiguration of Makefile for `make check` and `make vendor` were identified. This is to skip the build for the affected targets of the projects. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	c0f57c9e0a	Lint: Fix `cargo clippy` errors for s390x Some linting errors were identified during the enablement of `make check`. These have not been found by the Jenkins CI job because only `make test` was triggered. The errors for the `agent` occur in the s390x-specific tests, while the ones for `kata-ctl` are in the architecture-specific code. This commit is to fix those errors. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	a1f288e5d3	CI: Use sudo if yq_path is not writable by USER If `yq_path` is set to `/usr/local/bin/yq`, there could be a situation where the `yq` cannot be installed without `sudo`. This commit handles the situation by putting `sudo` in front of `curl` and `chmod`, respectively. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Hyounggyu Choi	354cbede9c	GHA: Enable static check for s390x As part of the CI migration from Jenkins to GitHub Action, a CI job named `kata-containers-2.0-ubuntu-s390x-unit-PR` is covered by the static check. This commit is to enable the check for s390x by incorporating a runner `s390x` with the corresponding workflow. Fixes: #8482 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2024-01-18 16:31:13 +01:00
Jianyong Wu	ba74a624a8	runtime-rs: use pathBuf only for x86 PathBuf here is only used for x86. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-18 16:31:13 +01:00
Jianyong Wu	a10779bf0b	GHA: enable static check on arm64 This is to add a runner for arm64 to the workflow. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-18 16:31:11 +01:00
Dan Mihai	eeba459a6b	Merge pull request #8845 from microsoft/danmihai1/genpolicy-defaults tools: install genpolicy settings files	2024-01-17 15:08:49 -08:00
Chelsea Mafrica	32ad465663	Merge pull request #8710 from jodh-intel/runtime-rs-ch-get-thread-ids runtime-rs: ch: Implement minimal implementation for missing thread/pid APIs	2024-01-17 14:51:44 -08:00
Fabiano Fidêncio	147d5fd752	Merge pull request #8836 from microsoft/danmihai1/test-with-cbl-mariner genpolicy: use root path from cbl-mariner Guest VM	2024-01-17 17:51:44 +01:00
Pavel Mores	f550d9a325	runtime-rs: add basic implementation of qemu command line generation This current framework is enough to launch a VM with a simple container in it (e.g. busybox). Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-17 12:55:00 +01:00
Pavel Mores	e8e13044da	runtime-rs: add simple impls to some of Qemu's Hypervisor functions The idea of most of these is just to prevent running into todo!()s where we can at the moment, while implementing the fundamental functionality of VM launch. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-17 12:55:00 +01:00
Dan Mihai	febabef08c	tools: install genpolicy settings files Install the default genpolicy OPA rules and settings JSON files, in addition to the genpolicy binary. Fixes: #8844 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-16 23:59:59 +00:00
David Esparza	e11c520ffa	Merge pull request #8808 from kata-containers/memory_usage_test_skip_virtiofs_when_req tests: Ignore virtiofs contribution to memory usage when it is disabled.	2024-01-16 16:50:06 -06:00
Dan Mihai	69557e5ad6	Merge pull request #8814 from microsoft/danmihai1/genpolicy-kata-deploy tools: genpolicy static checks	2024-01-16 07:33:42 -08:00
Dan Mihai	13f2398fe8	Merge pull request #8837 from microsoft/danmihai1/allow_storages genpolicy: temporarily disable allow_storages()	2024-01-16 07:10:49 -08:00
Alex.Lyn	17719f1ac5	Merge pull request #8708 from Apokleos/directvol-bugfix-blk-pci runtime-rs: bugfix for DirectVolume/rawblock when driver is blk	2024-01-16 14:25:16 +08:00
alex.lyn	99717371c1	runtime-rs: bugfix for DirectVolume/rawblock when driver is blk DirectVolume/Rawblock doesn't work well when device's block driver is virtio-blk-pci and the storage handler is DRIVER_BLK_PCI_TYPE. Fixes: #8707 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-16 10:35:08 +08:00
Dan Mihai	205dafd323	genpolicy: temporarily disable allow_storages() Temporarily disable the allow_storages() rules, because they are based on the tarfs snapshotter + container image integrity information that are not available yet in the main branch - see #8833. Fixes: #8834 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 23:55:27 +00:00
Dan Mihai	f4106a6107	genpolicy: use root path from cbl-mariner Guest VM Adjust genpolicy-settings.json to match the container root path from the main branch + cbl-mariner Guest VMs. This configuration might have to be adjusted again when other types of Guest VMs will be tested during CI using genpolicy, in the future. Also, improve logging from allow_root_path(), to easier debug these issues in the future. Fixes: #8835 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 23:33:28 +00:00
GabyCT	37a4049d0f	Merge pull request #8830 from GabyCT/topic/removeprotocol metrics: Remove iperf3 server protocol	2024-01-15 14:44:39 -06:00
Dan Mihai	201eec628a	tools: genpolicy static checks Package genpolicy and enable static checks for it. Fixes: #8813 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-15 16:49:58 +00:00
David Esparza	4b772d2480	tests: Ignore virtiofs contribution to memory usage when it is disabled. This PR removes the references to virtiofs from memory average calculation when the container uses a shared file system other than virtiofs. Fixes: #8807 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2024-01-15 08:07:06 -08:00
Gabriela Cervantes	dff800a8ff	metrics: Remove iperf3 server protocol This PR removes the iperf3 server protocol as this server definition is also used for the UDP iperf3 benchmarks to avoid duplication of the same yaml files. Fixes #8829 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-15 15:44:24 +00:00
Fabiano Fidêncio	0dc00ae373	Merge pull request #8822 from microsoft/danmihai1/cargo-clippy genpolicy: cargo clippy fixes	2024-01-15 14:59:04 +01:00
Fabiano Fidêncio	73cf31bd9e	Merge pull request #8827 from microsoft/danmihai1/disable-k8s-oom tests: cbl-mariner: disable k8s-oom.bats	2024-01-15 14:40:16 +01:00
Xuewei Niu	923bd65dff	Merge pull request #8819 from justxuewei/rm-protocol-backend dragonball: Remove unused definition	2024-01-15 10:09:46 +08:00
Dan Mihai	b7c31e3b98	tests: cbl-mariner: disable k8s-oom.bats Disable k8s-oom.bats on cbl-mariner until it passes more often. Fixes: #8824 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-14 17:39:25 +00:00
Dan Mihai	681cb1626a	genpolicy: cargo clippy fixes Clean up cargo clippy errors. Fixes: #8818 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-14 01:23:46 +00:00
Dan Mihai	3af713acd4	Merge pull request #8817 from microsoft/danmihai1/cargo-fmt genpolicy: "cargo fmt -- --check" clean-up	2024-01-13 16:22:27 -08:00
Xuewei Niu	f1fda3d6b0	dragonball: Remove unused definition `EndpointProtocolFlags::ProtocolBackend` is removed due to no reference. Fixes: #8745 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-13 13:25:11 +08:00
Dan Mihai	dcaae54cf6	genpolicy: "cargo fmt -- --check" clean-up Also, update Cargo.lock Fixes: #8816 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-13 01:57:00 +00:00
GabyCT	a7114a35a8	Merge pull request #8792 from GabyCT/topic/updatenhwc metrics: Use a specific python version to run tensorflow benchmark	2024-01-12 11:24:54 -06:00
Alex.Lyn	ffcd95b6b4	Merge pull request #8737 from Apokleos/test-ci-dgb-cri-containerd ci: enable test dragonball stability and cri-containerd	2024-01-12 11:56:22 +08:00
Fabiano Fidêncio	a606401722	Merge pull request #8803 from jodh-intel/issues-8784-runtime-rs-ch-rm-todo-to-unbreak runtime-rs: ch: Unbreak CH driver	2024-01-11 19:37:13 -03:00
Gabriela Cervantes	12a41f89b1	metrics: Use a specific python version to run tensorflow benchmark This PR uses a specific python version to run tensorflow benchmark as it needs python 3.8 to run correctly and avoid failures. Fixes #8791 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-11 22:15:31 +00:00
GabyCT	2ffb161958	Merge pull request #8763 from stevenhorsman/fix-backport-check-hub Fix backport check hub	2024-01-11 15:15:12 -06:00
Fabiano Fidêncio	86a6d133e4	Merge pull request #8248 from microsoft/danmihai1/genpolicy-main tools: add policy generation tool	2024-01-11 17:02:54 -03:00
GabyCT	69be050ff9	Merge pull request #8657 from WenyuanLau/8656/Fix_StratoVirt_on_gha_metrics gha: Fix the failure of gha metrics for StratoVirt	2024-01-11 11:41:25 -06:00
James O. D. Hunt	29e0de4e4a	runtime-rs: ch: Implement minimal memory hotplug APIs Replace the `todo!()` calls with a minimal NOP implementation to return the CH driver to working order since the `todo!()`'s forcibly crash the driver at runtime. Full implementations for these APIs will be added on issues #8800, #8801, and #8802. Fixes: #8784. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-01-11 14:11:31 +00:00
James O. D. Hunt	1c0df670af	runtime-rs: ch: Add minimal implementation of hypervisor metrics method Remove the `todo!()` macro which would cause a runtime crash and replace with an implementation that returns an error as a stop-gap until #8800 is implemented. Fixes: #8785. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2024-01-11 14:11:01 +00:00
alex.lyn	b97efc3139	CI: enable test container memory update for dragonball Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 19:07:33 +08:00
alex.lyn	6c85e95c34	CI: bugfix for dragonball when CI running with cri-containerd A wrong setting in the containerd runtime options caused it to fail. Correct it as below: ... [plugins.cri.containerd.runtimes.${runtime}.options] ConfigPath= "${KATA_CONFIG_PATH}" ... Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 17:35:33 +08:00
alex.lyn	cd59d31a15	CI: make CI work for dragonball to test stability and cri-containerd It needs to remove the skip setting, and make it work for dragonball. Fixes: #8746 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-11 17:35:13 +08:00
Hyounggyu Choi	f62ec0a7f5	Merge pull request #8693 from BbolroC/ibm-se-config-validation-fix runtime: Allow no initrd path for IBM Z Secure Execution	2024-01-11 09:53:51 +01:00
Xuewei Niu	70305fefc5	Merge pull request #8780 from justxuewei/containerd-events runtime-rs: Forward events to containerd via ttrpc	2024-01-11 14:58:14 +08:00
Xuewei Niu	6fd49f7604	runtime-rs: Forward events to containerd via ttrpc Forwarding events via the containerd CLI is somewhat heavy for runtime-rs compared to the ttrpc way. Plus, for runtimes that don't have this mechanism, e.g. CRI-O, we can't get those events anywhere. This patch introduces two types of forwarders: - `ContainerdForwarder`: Acquire ttrpc address from environment variables and forward events via ttrpc connection. - `LogForwarder`: Write event info into logs. Fixes: #7881 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-11 10:32:50 +08:00
GabyCT	a8be3d0450	Merge pull request #8796 from GabyCT/topic/uruncv versions: Update runc version	2024-01-10 14:16:20 -06:00
Gabriela Cervantes	e69f7c07a7	versions: Update runc version This PR updates the runc version to 1.1.11 which includes the following improvements - Fix several issues with userns path handling. - Support memory.peak and memory.swap.peak in cgroups v2. Add swapOnlyUsage in MemoryStats. This field reports swap-only usage. For cgroupv1, Usage and Failcnt are set by subtracting memory usage from memory+swap usage. For cgroupv2, Usage, Limit, and MaxUsage are set. - build(deps): bump github.com/cyphar/filepath-securejoin. Fixes #8795 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-10 16:46:08 +00:00
Greg Kurz	0c37aec7dc	Merge pull request #8753 from fidencio/topic/add-confidential-artefacts TEEs: Introduce kernel-confidential	2024-01-10 16:59:57 +01:00
Alex.Lyn	695440a431	Merge pull request #8749 from Apokleos/fixup-dragonball-vfio runtime-rs: fixup vfio device in runtime-rs/dragonball	2024-01-10 15:20:34 +08:00
Dan Mihai	de61b4d4e2	Merge pull request #8772 from microsoft/danmihai1/wait-for-delete tests: list the current k8s pods	2024-01-09 13:45:55 -08:00
Fabiano Fidêncio	c3f6eaa267	build-kernel: Fix typo 'terball' -> 'tarball' SSIA. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	8b2f43a2c2	build: Add "confidential" kernel We're using a Kernel based on v6.7, which should include all the patches needed for SEV / SNP / TDX. By doing this, later on, we'll be able to stop building the specific kernel for each one of the targets we have for the TEEs. Let's note that we've introduced the "confidential" target for the kernel builder script, while the TEE specific builds are being kept as they are -- at least for now. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Jianyong Wu	379e2f3da2	kernel: update some configs based on kernel 6.5 and 6.6 There are lots of configs removed from latest kernel. Update them here for convenience of next kernel upgrade. Remove CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE [1] Remove CONFIG_IP_NF_TARGET_CLUSTERIP [2] Remove CONFIG_NET_SCH_CBQ [3] Remove CONFIG_AUTOFS4_FS [4] Remove CONFIG_EMBEDDED [5] [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=a7e4676e8e2cb158a4d24123de778087955e1b36 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=9db5d918e2c07fa09fab18bc7addf3408da0c76f [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=051d442098421c28c7951625652f61b1e15c4bd5 [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=1f2190d6b7112d22d3f8dfeca16a2f6a2f51444e [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.6&id=ef815d2cba782e96b9aad9483523d474ed41c62a Fixes: #8408 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	cf4835e3ae	packaging: qemu: Simplify "--disable-virtiofsd" logic As all the supported architectures are disabling the virtiofsd build, there's no need to keep the switch statement there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
Fabiano Fidêncio	bfc6fc7a85	build: Get rid of QEMU experimental We've not been building QEMU experimental for a very long time, and the entry there has only been serving the purpose to clutter the versions.yaml (in the best case scenario) or even confuse new contributors to the project. Mind that the machinery to build the QEMU experimental is not touched, and that's used to build the TEEs capable artefacts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2024-01-09 14:35:45 -03:00
GabyCT	4ac5f13722	Merge pull request #8789 from GabyCT/topic/installimagestress tests: Add check images as part of install dependencies	2024-01-09 09:28:13 -06:00
GabyCT	393edf380a	Merge pull request #8778 from GabyCT/topic/fixin packaging: Fix indentation of build static stratovirt	2024-01-09 09:27:52 -06:00
Greg Kurz	e3611cf27d	Merge pull request #8326 from cheriL/8325/fix_method_param agent: use method params instead of const params in functions	2024-01-09 07:35:19 +01:00
Gabriela Cervantes	24fab19f6f	tests: Remove check images function from stressng test This PR removes the check images function from stressng test as it will now be part of the install dependencies function from gha-run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-08 17:40:39 +00:00
Gabriela Cervantes	aceba94d95	tests: Add check images as part of install dependencies To avoid random failures while trying to build and install the stressng image, this PR moves that step as part of the install dependencies in order to move the stability tests and avoid timeouts. Fixes #8787 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-08 17:38:14 +00:00
Pavel Mores	0cfb2d2570	runtime-rs: add simple Persist implementation for Qemu This is not necessarily meant to work, just to stub out unimplemented functionality while focusing on more fundamental things. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-08 13:12:39 +01:00
Pavel Mores	45862aeec0	runtime-rs: add default rootfs type for qemu Make sure that rootfs type is known early on even if it's not set in configuration.toml. Signed-off-by: Pavel Mores <pmores@redhat.com>	2024-01-08 13:12:39 +01:00
Gabriela Cervantes	7d41c97f60	packaging: Fix indentation of build static stratovirt This PR fixes the indentation of the build static stratovirt script for kata containers. Fixes #8777 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-05 18:06:08 +00:00
Dan Mihai	90c782f928	tests: list the current k8s pods Log the list of the current pods between tests because these pods might be related to cluster nodes occasionally running out of memory. Fixes: #8769 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-05 16:41:43 +00:00
Xuewei Niu	192c6ee9c3	Merge pull request #8773 from justxuewei/dbs-k8s-fragile	2024-01-05 12:54:32 +08:00
Xuewei Niu	0e9d73fe30	agent: Fix an issue reporting OOM events by mistake The agent registers an event fd in `memory.oom_control`. An OOM event is forwarded to containerd when the event is emitted, regardless of the content in that file. I observed content indicating that events should not be forwarded, as shown below. When `oom_kill` is set to 0, it means no OOM has occurred. Therefore, it is important to check the content to avoid mistakenly forwarding OOM events. ``` oom_kill_disable 0 under_oom 0 oom_kill 0 ``` Fixes: #8715 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-05 11:06:37 +08:00
Dan Mihai	b18f269ccf	Merge pull request #8735 from microsoft/danmihai1/set-policy agent: hold lock while setting new policy	2024-01-04 13:28:21 -08:00
GabyCT	5ea07c2b3e	Merge pull request #8776 from GabyCT/topic/addextraqemu tests: Add hypervisor component to kill kata components function	2024-01-04 14:29:52 -06:00
Gabriela Cervantes	4ad1971a0a	tests: Add hypervisor component to kill kata components function This PR adds the qemu-experimental hypervisor in the function to kill kata components. Fixes #8775 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-04 17:07:12 +00:00
stevenhorsman	6bac3323be	workflows: Update backport-label to use gh-utils.sh - hub is deprecated, so use the new gh-utils.sh script that wraps the github cli instead Fixes: #8125 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-01-04 16:48:34 +00:00
stevenhorsman	0d5d1c8c36	ci: Add gh-util.sh script - The hub tool is now deprecated, so introduce a new alternative to `hub-util.sh` https://github.com/kata-containers/.github/blob/main/scripts/hub-util.sh that works with it. Initially I've only started with the couple of commands that we use regularly, but we can extend it in future. - Expects jq to be installed and `gh` to be installed and set up (see [1]) - Now we don't have lots of repos, I've moved it into `kata-containers` rather than `.github`, so it is more visible. Fixes: #8125 [1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2024-01-04 16:48:34 +00:00
Dan Mihai	7d5336aca3	agent: hold lock while setting new policy Don't release the lock between is_allowed and set_policy calls, because the policy might change in between these calls. Also, move more policy code into policy.rs. Fixes: #8734 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2024-01-04 16:45:30 +00:00
GabyCT	f056ffe5ef	Merge pull request #8759 from fadecoder/update_docs_for_stratoVirt_VMM docs: Update docs for new StratoVirt VMM introduction	2024-01-04 10:39:37 -06:00
GabyCT	4f9ee7b31c	Merge pull request #8766 from GabyCT/topic/improvedeleteion metrics: Improve iperf3 cleanup	2024-01-04 10:38:33 -06:00
Xuewei Niu	b5a6e74cdf	Merge pull request #8744 from justxuewei/vhu-net-compile dragonball: Fix compilation issue without all net features	2024-01-04 19:02:55 +08:00
Xuewei Niu	db948f685d	Merge pull request #8757 from justxuewei/upgrade-containerd-shim-protos runtime-rs\|agent\|protocols\|agent-ctl: Bump ttrpc and containerd-shim-protos versions	2024-01-04 19:02:42 +08:00
soup	7c176a62fe	agent: use method params instead of const params in functions Fixes: #8325 Signed-off-by: soup <lqh348659137@outlook.com>	2024-01-04 09:29:29 +01:00
Xuewei Niu	f97f16a44a	agent-ctl: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8757 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	bf59c7b3d4	runtime-rs: Bump ttrpc and containerd-shim-protos versions - `ttrpc` from `0.7.1` to `0.8`. - `containerd-shim-protos` from `0.3.0` to `0.6.0`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	cf9a0e21a1	protocols: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Xuewei Niu	91360e7ddb	agent: Bump ttrpc version - `ttrpc` from `0.7.1` to `0.8`. Fixes: #8756 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2024-01-04 15:58:34 +08:00
Chao Wu	0f532175fe	Merge pull request #8771 from openanolis/chao/fix_ut dbs-pci: introduce Cargo.lock to prevent the influence from upstream	2024-01-04 15:14:22 +08:00
Zhigang Wang	44b5b88f4c	docs: Update docs for new StratoVirt VMM introduction As the StratoVirt VMM has been added, we can update the docs and add an introduction to StratoVirt, so that users can learn more about the hypervisor choices. Fixes: #8645 Signed-off-by: Zhigang Wang <wangzhigang17@huawei.com> Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2024-01-04 14:26:48 +08:00
Chao Wu	f1235ddba3	dbs_virtio_devices: add Cargo.lock To prevent upstream rust-vmm changes from breaking Dragonball compilation, we introduce Cargo.lock to the dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:23:30 +08:00
Chao Wu	02cd726bfc	dbs-utils: add Cargo.lock To prevent upstream rust-vmm changes from breaking Dragonball compilation, we introduce Cargo.lock to the dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:17:45 +08:00
Chao Wu	97bdc1529b	dbs-pci: introduce Cargo.lock As reported in #8767, the root cause is that rust-vmm's vmm-sys-utils introduced a new release, 0.12.1, and dbs-pci relies on rust-vmm's vfio-ioctls, which uses >= to declare vmm-sys-utils, so vmm-sys-utils is automatically upgraded to 0.12.1. That is how two different versions of vmm-sys-utils get introduced, which breaks the compilation. To fix this and also avoid future problems, we introduce a Cargo.lock file to the dbs crates. fixes: #8770 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2024-01-04 11:11:56 +08:00
Gabriela Cervantes	4bc67dba08	metrics: Improve iperf3 cleanup This PR improves the iperf3 cleanup to ensure all the components are being deleted properly to avoid the random failures of leaving the iperf3 clients on the kata metrics CI. Fixes #8765 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2024-01-03 17:14:38 +00:00
alex.lyn	d2080fd221	runtime-rs: refactor getting the vfio device guest pci path Fixes: #8748 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-02 14:28:34 +08:00
alex.lyn	d795fcfc2f	runtime-rs: bridge the vfio device between runtime-rs and dragonball Previously, Dragonball did not support PCI device hot-plugging or VFIO device passthrough. Therefore, the runtime-rs support for Dragonball was incomplete. It is time to complete it so that users can use Dragonball's PCI hot-plugging and VFIO passthrough capabilities. Fixes: #8748 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2024-01-02 14:28:10 +08:00
Chao Wu	67b91c1eb3	Merge pull request #8740 from openanolis/upstream/pci-6-final Dragonball: add pci vfio passthrough, hot(un)plug support	2023-12-29 01:58:32 +08:00
Chao Wu	71c322c293	runtime-rs: fix ci complaints The vfio commits introduce quite a lot of changes in runtime-rs; this commit covers all the changes related to ci, including compilation errors and so on. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 23:34:41 +08:00
Chao Wu	f9e0a4bd7e	upcall: introduce pci device add & del kernel patch Add the pci add and del guest kernel patch as an extension on the upcall device manager server side. Also, bump config version to 120 since we need to add config for dragonball pci in upcall. fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:21:30 +08:00
Chao Wu	a3f7601f5a	dragonball: add pci hotplug / hot-unplug support Introduce two new vmm actions to implement pci hotplug and pci hot-unplug: PrepareRemoveHostDevice and RemoveHostDevice. PrepareRemoveHostDevice calls upcall to unregister the pci device in the guest kernel. RemoveHostDevice should be called after PrepareRemoveHostDevice; it is used to clean up the PCI resources on the Dragonball side. fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:08:31 +08:00
Chao Wu	0f402a14f9	dragonball: add InsertHostDevice vmm action Introduce a new vmm action InsertHostDevice to passthrough host pci devices like NIC or GPU devices into guest so that users could have high performance usage of those devices. fixes: #8741 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-28 16:04:22 +08:00
Xuewei Niu	4c023e341c	dragonball: Fix compilation issue without all net features Combinations of network features were tested: - None - virtio-net - vhost-net - vhost-user-net - virtio-net,vhost-net - vhost-net,vhost-user-net - virtio-net,vhost-user-net - virtio-net,vhost-net,vhost-user-net Fixes: #8742 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-28 11:37:26 +08:00
Alex.Lyn	990a3adf39	Merge pull request #8618 from Apokleos/csi-for-directvol runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata	2023-12-27 21:27:29 +08:00
Chao Wu	cbd4481bc1	Merge pull request #7489 from Apokleos/pci_path runtime-rs: add pci topology for pci devices	2023-12-27 18:52:06 +08:00
alex.lyn	ea69c17008	runtime-rs: initialize pcie topology in Device Manager Add a pcie_topology field to DeviceManager and initialize pcie_topology when ResourceManager calls DeviceManager's new() with TopologyConfigInfo. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:57:23 +08:00
alex.lyn	b42548b8e1	runtime-rs: do unregister device in Trait Device/detach Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:53:18 +08:00
alex.lyn	0f0b6d13c9	runtime-rs: do register/update device in Trait Device/attach Before calling the device driver to attach a device, register the device to the PCIe topology and allocate a PciPath for it. However, for some hypervisors such as CLH, the allocation is invalid when plugging devices into the VM, because they are able to return a DeviceInfo containing the PciPath. In that case, the PciPath in the PCIe topology is updated with the returned pci path, to prevent the inferred pcipath from differing from the actual value returned. The update is not executed if the pcipath value doesn't change. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:49:18 +08:00
alex.lyn	ce7d363695	runtime-rs: Introduce helper macros to simplify PCIe device ops Introduce helper macros to simplify PCIe device register/unregister and update, which provides a convenient way to handle devices in topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:43:58 +08:00
alex.lyn	0d4992b24d	runtime-rs: add one more argument in Device attach/detach Add one more argument with type &mut Option<&mut PCIeTopology> in attach and detach to introduce methods within PCIe Topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:40:01 +08:00
alex.lyn	b425de6105	runtime-rs: implement Trait PCIeDevice for pcie/pci device Implement Trait PCIeDevice register/unregister for pcie/pci device, such as vfio device which needs set/get device's pci path for kata agent's device handler. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:33:08 +08:00
alex.lyn	87e39cd1f6	runtime-rs: introduce Trait PCIeDevice to do [un]register device Introduce Trait PCIeDevice with register/unregister, which are used to register or unregister pcie device within the PCIe topology. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:29:35 +08:00
alex.lyn	6ebc4884fa	runtime-rs: introduce PCIe Topology framework for pcie/pci devices Because different VMMs handle PCI devices in different ways, we expect to provide a general PCIe topology processing framework that is as compatible as possible with VMMs such as dragonball, qemu, and clh (though clh has its own management method, there is no conflict). Currently, it's mainly developed for the kinds of PCIe/PCI devices in dragonball/clh which are attached to the pci/pcie root bus directly. More will be added when Qemu is ready in runtime-rs. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:29:25 +08:00
alex.lyn	88839026b9	runtime-rs: introduce TopologyConfigInfo to initialize pcie topology A TopologyConfigInfo is added to store device config info for PCIe/PCI devices in the VM from Hypervisor DeviceInfo. TopologyConfigInfo::new will be the entry point to initialize the PCIe Topology for each VM. Fixes: #7218 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-27 15:21:53 +08:00
Fabiano Fidêncio	35f88dfc93	Merge pull request #8733 from fidencio/topic/fix-shim-check-for-snapshotter-configration kata-deploy: Fix shim check for snapshotter configuration	2023-12-27 03:30:53 -03:00
Chao Wu	8895cb82df	Merge pull request #8724 from openanolis/chao/add_vfio dragonball: introduce vfio support	2023-12-27 11:40:53 +08:00
Xuewei Niu	43a627c96f	Merge pull request #8632 from adamqqqplay/support-vhost-user-blk dragonball: introduce vhost-user-blk device	2023-12-27 09:54:21 +08:00
Chao Wu	2f797a6eb7	pci: rename 2 parameters to follow rust naming convention PciCapabilityID -> PciCapabilityId PciBarRegionType::IORegion -> PciBarRegionType::IoRegion Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-26 23:28:47 +08:00
Chao Wu	9c13b2c990	dragonball: introduce vfio support vfio mod collects lots of information related to the vfio operations, including VfioMsi and VfioMsix capability & state, vfio interrupt info, pci region info and vfio pci device info & state. fixes: #8722 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-26 23:28:43 +08:00
alex.lyn	8779fe7dd5	runtime-rs: create a reference that directs users to kata csi doc Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	ba5437382a	runtime-rs: add examples about Kata pod with directvol by CSI. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	c6d2a32146	runtime-rs: add support for directvol csi deploy scripts. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:34 +08:00
alex.lyn	25d8e83e43	runtime-rs: Add dedicated CSI driver for DirectVolume support in Kata Bridge the gap between user requirements for direct block device access and the DirectVolume capabilities provided by Kata runtimes (kata-runtime/runtime-rs), and facilitate seamless integration with CSI to improve user experience. It aims to integrate DirectVolume CSI support into Kata, enabling users to benefit from its performance and flexibility advantages. Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 20:36:22 +08:00
Fabiano Fidêncio	6ee7fb5402	kata-deploy: Double quote the snapshotter name Otherwise `jq` will complain about: ```sh jq: error: nydus/0 is not defined at <top-level>, line 1: .plugins."io.containerd.grpc.v1.cri".containerd.runtimes."kata-clh".snapshotter=nydus jq: 1 compile error ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 09:14:36 -03:00
Qinqi Qu	81ab174c16	dragonball: support vhost-user-blk in device manager This patch introduces a feature of supporting vhost-user-blk device. Fixes: #8631 Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-26 20:02:38 +08:00
Qinqi Qu	ef8dc3b0ce	dragonball: support vhost-user-blk This patch introduces a feature of supporting vhost-user-blk device. This device needs to be defined before the VM instance is started, which can be done through the dbs-cli tool with --virblks option: --virblks '{ "drive_id": "8623", "device_type": "Spdk", "path_on_host": "spdk:///var/tmp/vhost.sock", "is_root_device": false, "is_read_only": false, "is_direct": false, "no_drop": false, "num_queues": 1, "queue_size": 256 }' Fixes: #8631 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: fupan <fupan.lfp@antgroup.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-26 20:02:32 +08:00
Fabiano Fidêncio	8332f3c684	kata-deploy: Fix the snapshotter config placement In the way the script is without this patch, we're trying to set ```toml [`$shim`] snapshotter = $snapshotter ``` However, what we actually want to set is the full runtime table instead of shim. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 08:26:38 -03:00
Fabiano Fidêncio	907f1ddb9e	kata-deploy: Fix shim check for snapshotter configuration We want to check whether the shim is part of the "plain text" shims passed to the daemonset (meaning, checking against `$SHIMS`). Before this fix we were checking against `$shims`, which is an array of shims instead of a string, resulting in a broken check. Fixes: #8732 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-26 07:42:36 -03:00
Tim Zhang	a4ad12a3d1	Merge pull request #8729 from liubin/fix/package-kata-monitor kata-monitor: fix Dockerfile to build image	2023-12-26 18:30:15 +08:00
alex.lyn	3b317e69e2	runtime-rs: add README and user guide to deploy directvol CSI Driver Fixes: #8602 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-26 18:00:35 +08:00
Bin Liu	23eb3042c7	kata-monitor: fix Dockerfile to build image move `SKIP_GO_VERSION_CHECK` after `make` command to skip checking golang version. And also upgrade golang to 1.19. Fixes: #8728 Signed-off-by: Bin Liu <bin@hyper.sh>	2023-12-26 15:11:13 +08:00
Xuewei Niu	1065ca6fa7	Merge pull request #8626 from justxuewei/vhost-user-endpoint	2023-12-26 12:52:21 +08:00
Xuewei Niu	36a4cbccf6	runtime-rs: Expand all DeviceType in match arms The compiler will give a warning if a developer forgets to add an arm for a newly defined variant. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	f2d08bc00f	runtime-rs: Remove unused index from Endpoints The affected `Endpoint`s are `VhostUserEndpoint` and `TapEndpoint`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	60a42351e2	runtime-rs: DAN supports vhost-user-net device DAN reads vhost-user-net device from JSON config. It only supports VMM running as server right now. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	693a0cfbfd	dragonball: Make vhost-user-net ready for VhostUserEndpoint The changes involve: - Expose VhostUserConfig struct to runtime-rs. - Set a default value while num_queues or queue_size are 0. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:59 +08:00
Xuewei Niu	54df832407	runtime-rs: Support VhostUserEndpoint This commit introduces VhostUserEndpoint and the related support for vhost-user-net devices in the device manager. For now, Dragonball is able to attach vhost-user-net devices. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-26 10:18:50 +08:00
Xuewei Niu	374c2f01aa	runtime-rs: Simplify VhostUserType enum Remove unused string parameter from each item. Fixes: #8625 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-25 16:15:57 +08:00
Xuewei Niu	38eb4077a6	Merge pull request #8503 from justxuewei/vhost-user-net dragonball: Support vhost-user-net device	2023-12-25 13:47:51 +08:00
Xuewei Niu	4c5de72863	dragonball: Wrap config space into `set_config_space` The config space of network devices is shared and accords with the virtio 1.1 spec. It is a good idea to abstract the common part into one function, which `set_config_space()` implements. In addition, this patch removes `vq_pairs` from vhost-net devices, since there is a possibility of data inconsistency. For example, some places read it from `self.vq_pairs`, while others read it from `queue_sizes.len() / 2`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-25 10:47:34 +08:00
Alex.Lyn	3a3f39aa2d	Merge pull request #8668 from Apokleos/pci-path-refactor runtime-rs: Refactor the code related to PCI paths and VFIO device driver initialize in DM.	2023-12-23 21:44:07 +08:00
Steve Horsman	1afce09858	Merge pull request #8721 from stevenhorsman/kata-deploy-typos kata-deploy: snapshotter typo fixes	2023-12-22 21:26:03 +00:00
stevenhorsman	4a95c0d07f	kata-deploy: snapshotter typo fixes - Add spaces so that the if statements are valid Fixes: #8720 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-22 16:32:02 +00:00
Dan Mihai	080541a0f2	genpolicy: add SPDX license header Add SPDX license header to rules.rego. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Saul Paredes	7f126be67e	genpolicy: Update oci_distribution to 0.10.0 Also support alternative media type and update samples Signed-off-by: Saul Paredes <saulparedes@microsoft.com> Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	9eb6fd4c24	docs: add agent policy and genpolicy docs Add docs for the Agent Policy and for the genpolicy tool. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	57f93195ef	genpolicy: add support for StatefulSet YAML input Generate policy for K8s StatefulSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	35958ec9cc	genpolicy: add support for ReplicationController Generate policy for K8s ReplicationController YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	7da17099f2	genpolicy: add support for ReplicaSet YAML input Generate policy for K8s ReplicaSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	d84300f1ee	genpolicy: add support for List YAML input Generate policy for K8s List YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	a03452637b	genpolicy: add support for Job YAML input Generate policy for K8s Job YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	2dbd01c80b	genpolicy: add support for Deployment YAML input Generate policy for K8s Deployment YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	a40a6003d0	genpolicy: add support for DaemonSet YAML input Generate policy for K8s DaemonSet YAML. Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Dan Mihai	48829120b6	policy: initial genpolicy commit Add application that infers K8s user's intentions based on user's K8s YAML file, and generates a Rego/OPA based policy for that YAML. Just Pod YAML files are supported as input using this initial source code. Support for other types of YAML files will come with upcoming commits. Fixes: #7673 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-22 15:35:05 +00:00
Chao Wu	555136c1a5	Merge pull request #8662 from openanolis/pci/4-upstream dragonball: introduce pci msi/msix interrupt	2023-12-22 18:08:31 +08:00
Steve Horsman	c5f939cdc1	Merge pull request #8655 from fidencio/topic/kata-deploy-add-snapshotter-support kata-deploy: Allow setting up snapshotters per runtime handler	2023-12-22 09:16:07 +00:00
Chao Wu	8cf3bcefd8	dragonball: introduce pci msi/msix interrupt introduce msi/msix mod to maintain information for PCI Message Signalled Interrupt Extended Capability. It will be initialized when parsing pci configuration space and used when getting interrupt capabilities. fixes: #8661 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-22 16:28:22 +08:00
Xuewei Niu	beadce54c5	dragonball: Support vhost-user-net devices This PR introduces vhost-user-net devices to Dragonball. The devices are allowed to run as server on the VMM side. Fixes: #8502 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 14:53:18 +08:00
Xuewei Niu	1f21d3cb2c	dragonball: Introduce address space for MmioV2DeviceState Vhost-user-net has a dependency on address space from `MmioV2DeviceState`. The addition of the address space is introduced in this patch. Plus, it makes sure all unit tests have the according parameter as well. Fixes: #8502 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 14:53:18 +08:00
Fupan Li	dc9a0ac8ce	Merge pull request #8718 from justxuewei/enable-vhost tests: Load vhost modules explicitly while Kata installing	2023-12-22 14:52:49 +08:00
Xuewei Niu	206ed6d77d	tests: Load vhost modules explicitly while Kata installing The default network backend of runtime-rs with Dragonball is vhost-net after #8609 was merged. The tests might fail if vhost modules are not loaded. Fixes: #8717 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-22 11:07:37 +08:00
alex.lyn	94c83cea84	runtime-rs: Refactor vfio driver implementation It's important to ensure that the tasks which set up vfio devices are completed before add_device. So move the vfio device setup code to a dedicated method called at device building time, which does not affect the behavior of other code. This change makes it easier to understand the difference between create and attach, and also makes the boundaries clearer. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:37:40 +08:00
alex.lyn	82d3cfdeda	runtime-rs: Make VhostUserConfig's field pci_path type more specific Make VhostUserConfig pci_path's type more specific, change it from Option<String> to Option<PciPath>. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:35:38 +08:00
alex.lyn	5cc2890a10	runtime-rs: refactor and re-implement pci path. Do refactor and re-implement to make the pci path more "rusty". Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-22 10:34:41 +08:00
Fabiano Fidêncio	32e1ba2525	Merge pull request #8714 from cmaf/libsh-update-loc tests: Use function from Kata repo	2023-12-21 12:30:31 -03:00
Fabiano Fidêncio	6cc6ca5a7f	kata-deploy: Allow setting up snapshotters per runtime handler Since containerd 1.7.0 we can easily set a specific snapshotter to be used with a runtime handler, and we should take advantage of this, mostly as it'll help setting up any runtime using devmapper or nydus snapshotters. This implementation here has a few caveats: * The format expected for the SNAPSHOTTER_HANDLER_MAPPING is: `shim:snapshotter,shim:snapshotter,...` * It only works with containerd 1.7 or newer * We never change the default containerd snapshotter * We don't do any check on our side to verify whether the snapshotter required is properly deployed * Users will have to add an annotation to their pods, in order to use the snapshotter set up per runtime handler * Example: ``` metadata: ... annotations: io.containerd.cri.runtime-handler: kata-fc ``` Fixes: #8615 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-21 07:20:10 -03:00
alex.lyn	1b5758c1f2	runtime-rs: Move the PciPath-related code to a dedicated file Move the pciPath code to a new file pci_path.rs and update the references. Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-21 11:35:18 +08:00
alex.lyn	275de453d5	runtime-rs: remove useless get_host_guest_map and its test case Fixes: #8665 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-21 11:07:56 +08:00
Chelsea Mafrica	9f394f6e18	tests: Use function from Kata repo Switch to use function from Kata repo in common.bash to reduce dependency on the tests repo. Fixes #8713 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-20 16:45:06 -08:00
Dan Mihai	d916da15dd	Merge pull request #8688 from microsoft/danmihai1/k8s-confidential tests: retry connection to pod SSH server	2023-12-20 15:01:26 -08:00
Fabiano Fidêncio	3482256340	Merge pull request #8709 from fidencio/topic/update-jq-for-kata-deploy kata-deploy: Update `jq` as part of the kata-deploy daemonset	2023-12-20 16:48:07 -03:00
James O. D. Hunt	7da6d0a845	runtime-rs: ch: Implement missing thread/pid APIs Add implementations for the following `Hypervisor` trait methods which simply return the same details as the `get_vmm_master_tid()` method: - `get_thread_ids()` - `get_pids()` Fixes: #6438. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-20 17:58:40 +00:00
Fabiano Fidêncio	c9e631dc0c	kata-deploy: Reapply "kata-deploy: Use tomlq to configure containerd" This reverts commit `ee5fa08a27`. This is perfectly fine to do as we narrowed down the issue to be on the version of `jq` provided by alpine, and we've already updated it in the previous commit (in this very same series). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-20 12:52:41 -03:00
Fabiano Fidêncio	41320c586e	kata-deploy: Install jq from GitHub `jq` coming from alpine is in its 1.6 version, and that has a bug that hits us quite hard, as it changes a float to an int whenever the number is in the `x.0` format. One example is: ```bash / # jq --version jq-1.6 / # echo '{"foo": 1.0}' | jq .foo 1 ``` With this in mind, let's switch, at least for now, to using the `jq` released directly on github, as it does address the issue we've been hitting. ```bash ⋊> Downloads ./jq-linux-amd64 --version jq-1.7 ⋊> Downloads echo '{"foo": 1.0}' | jq .foo 1.0 ``` Fixes: #8678 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-20 12:52:41 -03:00
Greg Kurz	ce094ecdc2	Merge pull request #8679 from stevenhorsman/kata-deploy-containerd-config-fix gha: kata-deploy: Revert containerd config break	2023-12-20 12:58:56 +01:00
stevenhorsman	ee5fa08a27	Revert "kata-deploy: Use tomlq to configure containerd" This reverts commit `dd9f5b07b9`. Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-20 09:10:43 +00:00
stevenhorsman	9e718b4e23	gha: kata-deploy: Add containerd status check After kata-deploy has installed, check that the worker nodes are still in Ready state and don't have a containerd://Unknown container runtime version, which would indicate that containerd isn't working, to ensure that we didn't corrupt the containerd config during kata-deploy's edits. Fixes: #8678 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-12-20 09:10:43 +00:00
Archana Shinde	7e5868a55f	Merge pull request #8588 from amshinde/runtime-rs-update-readme runtime-rs: Update readme to indicate cloud-hypervisor support	2023-12-19 22:09:14 -08:00
Dan Mihai	8aa390279e	tests: retry connection to pod SSH server To become more resilient against these kinds of errors: deployment.apps/confidential-unencrypted created pod/confidential-unencrypted-c5fdd6964-rrb6q condition met ssh: connect to host 10.42.0.109 port 22: Connection refused Fixes: #8687 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-20 02:48:05 +00:00
GabyCT	5504176e9a	Merge pull request #8699 from GabyCT/topic/fixconfidentialscript tests: k8s: Fix indentation in confidential common script	2023-12-19 16:01:28 -06:00
Dan Mihai	6cea8a5f2a	Merge pull request #8697 from microsoft/danmihai1/runk tests: additional run-runk logging	2023-12-19 11:27:29 -08:00
Dan Mihai	551a50cd72	tests: additional run-runk logging Add logging to run-runk, for debugging possible failures. Fixes: #8696 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-12-19 14:08:01 +00:00
Hyounggyu Choi	540a2a7fb1	runtime: Allow no initrd path for IBM Z Secure Execution This is to reintroduce a configuration rule for IBM Z Secure Execution, where no initrd path should be configured. For the TEE of interest, only a kernel image should be specified with `confidential_guest=true`. Fixes: #8692 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-19 11:21:16 +01:00
Xuewei Niu	ec30d5a9a8	Merge pull request #8700 from justxuewei/dbs-ut dragonball: Trigger unit tests of dbs_* subcrates by `make test`	2023-12-19 17:51:20 +08:00
Xuewei Niu	039fe7f391	dragonball: Trigger unit tests of dbs_* subcrates by `make test` `make SUPPORT_VIRTUALIZATION=1 test` iterates through all subcrates and runs their tests. In addition, this patch fixes some unit test issues: - Too many parameters were passed to `I8042Device::new()`. - Virtqueue checks have been introduced since `virtio-queue v0.7.0`. - GHA might have no access to the `/var/tmp` dir on the runner. Fixes: #8690 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-19 16:22:37 +08:00
Hyounggyu Choi	ceea8882db	Merge pull request #8672 from BbolroC/introduce-vsock-device-init runtime-rs: Separate init_config() from new() for struct VsockDevice	2023-12-18 22:04:37 +01:00
Gabriela Cervantes	1469a5efca	tests: k8s: Fix indentation in confidential common script This PR fixes the indentation of the confidential common script for kubernetes tests. Fixes #8698 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-18 20:25:06 +00:00
Chelsea Mafrica	312475508a	Merge pull request #8682 from cmaf/static-checks-update-loc ci: Use static checks from kata repo for lib functions	2023-12-18 09:53:01 -08:00
Hyounggyu Choi	3cd0cc1388	runtime-rs: Separate init_config() from new() for struct VsockDevice As a follow-up for #8516, guest_cid and vhost_fd are not necessarily initialised via new(). Instead, the fields should be initialised later when they are really used to construct hypervisor's parameters. This commit is to separate init_config() from new() to initialise guest_cid and vhost_fd and leave only the assignment of id for the existing function. Fixes: #8671 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-18 16:36:09 +01:00
Greg Kurz	2987d3eeb5	Merge pull request #8341 from jongwu/fix_cpushares agent: correct CPUShares and CPUWeight value	2023-12-18 15:40:04 +01:00
James O. D. Hunt	3c49120d2f	Merge pull request #8641 from jodh-intel/kata-ctl-add-cfg-file-cli-option kata-ctl: Add option to dump config files	2023-12-18 11:54:19 +00:00
Greg Kurz	1cfcc80018	Merge pull request #8664 from amshinde/remove-ignore-paths-ga github-actions: Remove ignore paths for required CI checks	2023-12-18 12:49:21 +01:00
Chelsea Mafrica	b785ef96ec	docs: Change location of static checks script We now use the static checks script from the main kata containers repo and not the tests repo; update documentation to reflect this. Fixes #8681 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-15 17:13:02 -08:00
Chelsea Mafrica	bfb756199f	ci: Use static checks from kata repo for lib functions Change the two functions in lib.sh to use the static checks script from the kata containers repo instead of tests. Remove cloning the repo from these functions since we don't need it anymore. Leave these two functions because the document checking one may be used locally and the static checks one is called from the virtcontainers Makefile. Fixes #8681 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-15 17:08:33 -08:00
Archana Shinde	510bc36a77	github-actions: Remove ignore paths for required CI checks If a PR contains files from the ignore-paths, these actions do not run, as intended. However, the actions are marked as required, and there does not seem to be a way to mark them as non-required in that case. As a result, a PR containing files from the ignore-paths remains stalled. Hence remove the ignore-paths until github provides a way to mark actions that are skipped due to ignore-paths as non-required/passed. Fixes: #8663 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-15 15:12:20 -08:00
Liu Wenyuan	61fe20cf9a	gha: Fix some of gha metrics failure for StratoVirt Update the Speed & Density metric tests baseline for StratoVirt and re-enable them, and skip other metric tests temporarily. Fixes: #8656 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-12-15 17:45:01 +08:00
Zhongtao Hu	0f80dc636c	Merge pull request #6876 from openanolis/memory_hotlug runtime-rs: support Memory hotplug	2023-12-15 14:28:35 +08:00
Zhongtao Hu	9a37e77f2a	runtime-rs: check the update memory size Check whether the update memory size is greater than the default max memory size. Fixes: #6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 11:25:34 +08:00
Zhongtao Hu	6039417104	runtime-rs: add default_maxmemory in config file add default_maxmemory in config file Fixes: #6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:20 +08:00
Zhongtao Hu	8d9fd9c067	runtime-rs: support memory resize Fixes: #6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:13 +08:00
Zhongtao Hu	81e55c424a	runtime-rs: add resize_memory trait for hypervisor Fixes: #6875 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:25:03 +08:00
Zhongtao Hu	d428a3f9b9	runtime-rs: get guest memory details get memory block size and guest mem hotplug probe Fixes: #6356 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-12-15 10:22:37 +08:00
GabyCT	4a49dd73db	Merge pull request #8676 from GabyCT/topic/fixins tests: k8s: Fix indentation in setup script	2023-12-14 13:57:47 -06:00
GabyCT	7a606a19c4	Merge pull request #8659 from GabyCT/topic/improvecleanuplatency metrics: Improve latency network cleanup	2023-12-14 13:57:28 -06:00
GabyCT	0831529279	Merge pull request #8644 from GabyCT/topic/updadockerresint metrics: Update TensorFlow ResNet50 Int8 Dockerfile	2023-12-14 13:56:41 -06:00
Jianyong Wu	58e88d9469	agent: correct CPUShares and CPUWeight value If cgroup driver is systemd, CPUShares, for cgroup v1, should be at least 2 [1] and CPUWeight for cgroup v2, should be at least 1 [2]. Fixes: #8340 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> [1] `d19434fbf8/src/basic/cgroup-util.h (L122)` [2] `d19434fbf8/src/basic/cgroup-util.h (L91)`	2023-12-15 02:04:31 +08:00
Steve Horsman	04de6eb4fd	Merge pull request #8674 from ChengyuZhu6/fix_statis_check static-checks: Add some dependencies to static checks for CoCo features	2023-12-14 16:47:01 +00:00
Greg Kurz	1bd9c1b4de	Merge pull request #8589 from wvell/patch-1 Remove warning for cgroupsv2 only operating systems	2023-12-14 17:37:59 +01:00
Gabriela Cervantes	c92b14da97	tests: k8s: Fix indentation in setup script This PR fixes the indentation of the kubernetes setup script. Fixes #8675 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-14 16:26:22 +00:00
Amulya Meka	ac7b3d4735	Merge pull request #8667 from Amulyam24/workflow gha: add a post cleanup script for cri-containerd ppc64le workflow	2023-12-14 21:52:54 +05:30
Alex.Lyn	c7c7632203	Merge pull request #8620 from Apokleos/enhance-directv-using-csi runtime-rs: Enhancement of DirectVolume when using a dedicated CSI	2023-12-14 22:59:09 +08:00
ChengyuZhu6	dfad0e6622	.github: fix the failure without devicemapper for host sharing fix error when running checks and tests: error: failed to run custom build command for `devicemapper-sys v0.1.5` fatal error: 'libdevmapper.h' file not found thread 'main' panicked at 'Could not generate dm.h bindings: ClangDiagnostic("dm.h:2:10: fatal error: 'libdevmapper.h' file not found\n")', /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/devicemapper-sys-0.1.5/build.rs:24:10 stack backtrace: 0: rust_begin_unwind at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5 1: core::panicking::panic_fmt at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14 2: core::result::unwrap_failed at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/result.rs:1651:5 3: core::result::Result<T,E>::expect 4: build_script_build::main 5: core::ops::function::FnOnce::call_once note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. warning: build failed, waiting for other jobs to finish... make: *** [../../utils.mk:177: standard_rust_check] Error 101 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-12-14 20:47:47 +08:00
ChengyuZhu6	983479748f	.github: fix error when making checks for CoCo guest pull Fix error when making checks: ``` error: failed to run custom build command for `image-rs v0.1.0 (https://github.com/confidential-containers/guest-components?tag=v0.8.0#e849dc89)` Caused by: process didn't exit successfully: `/home/runner/work/kata-containers/kata-containers/src/ agent/target/release/build/image-rs-fd932206d09362b7/build-script-build` (exit status: 101) --- stdout cargo:rerun-if-changed=./protos/getresource.proto cargo:rerun-if-changed=./protos --- stderr thread 'main' panicked at 'Could not find `protoc` installation and this build crate cannot proceed without this knowledge. If `protoc` is installed and this crate had trouble finding it, you can set the `PROTOC` environment variable with the specific path to your installed `protoc` binary.If you're on debian, try `apt-get install protobuf-compiler` or download it from https://github.com/protocolbuffers/protobuf/releases ``` Fixes #8673 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-12-14 20:47:42 +08:00
alex.lyn	aa42f0a03f	runtime-rs: Enhancement of DirectVolume when using CSI. We use a matching direct-volume path to determine whether an OCI mount is a DirectVolume. However, we should handle the case where no match is found appropriately: such a mount is treated as a non-DirectVolume type rather than causing a failure. Fixes: #8619 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-14 18:19:03 +08:00
alex.lyn	80d631ee84	runtime-rs: Add attribute serde rename to each field of DirectVolume. The DirectVolume structure in runtime-rs differs from the one in kata-runtime, so the two have no unified way to handle DirectVolumeMountInfo and MountInfo. We should align them by simply adding the attribute #[serde(rename="x")] to each field in DirectVolumeMountInfo. Fixes: #8619 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-14 18:18:40 +08:00
Xuewei Niu	7f611dfe84	Merge pull request #8609 from justxuewei/runtime-rs-vhost-net dragonball: Use vhost-net device by default	2023-12-14 16:33:29 +08:00
Amulyam24	0db820fa01	gha: add a post cleanup script for cri-containerd ppc64le workflow This PR identifies and adds an action to cleanup the ppc64le self hosted runner. Fixes: #8666 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-12-14 13:46:47 +05:30
Hyounggyu Choi	fbc04460f6	Merge pull request #8649 from BbolroC/put-pre-action-gha-s390x GHA: Put all the preliminary steps into pre-action for s390x	2023-12-14 07:16:17 +01:00
Xuewei Niu	82fde4431e	dragonball: Set default queue config for vhost-net device Dragonball sets a default queue config in the case of `None`. The queue_size and num_queues of vhost-net are set to `Some(0)` by default. Therefore, we might get an invalid queue config. This patch fixes this issue. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-14 11:18:33 +08:00
Xuewei Niu	c11b066728	runtime-rs: Use vhost-net device by default This patch sets vhost-net as the default networking backend. It allows users to set `disable_vhost_net` to `true` to re-enable the virtio-net backend. In addition, which backend to use is a matter for the hypervisor, so runtime-rs no longer needs to know that. Fixes: #8608 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-14 11:18:26 +08:00
Chelsea Mafrica	6c2e2a9120	Merge pull request #8635 from cmaf/migrate-static-checks-gha static-checks: Direct Makefile to use new static checks	2023-12-13 16:00:16 -08:00
Gabriela Cervantes	8151117f73	metrics: Improve latency network cleanup This PR improves the latency network cleanup by removing the pods even if the test fails. Fixes #8658 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-13 17:56:01 +00:00
Fabiano Fidêncio	a998e89bcf	Merge pull request #8639 from fidencio/topic/kata-deploy-use-tomlq-to-configure-containerd kata-deploy: Use `tomlq` to configure containerd	2023-12-13 14:11:45 +01:00
Hyounggyu Choi	05e278de5b	GHA: Put all the preliminary steps into pre-action for s390x This is to introduce a pre-action to all the workflows for building artifacts. The action could take care of tasks such as cleaning up files and reinstalling packages, which prevents a workflow from getting affected by the environment. This also includes the removal of the step `Adjust a permission for repo`, because it could be incorporated into the action. Fixes: #8648 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-13 13:24:40 +01:00
Chao Wu	dfaf006fcc	Merge pull request #8564 from openanolis/chao/add_pci_root_bus_device dragonball: add pci root bus and root device	2023-12-13 17:57:16 +08:00
Fabiano Fidêncio	7ad873cf29	kata-deploy: Simplify shim configuration We never have to add a configuration for the "default" case, as we're already creating the runtime class pointing to what should be the "default" handler. This helps to simplify the logic by quite a lot. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:52:54 +01:00
Fabiano Fidêncio	e618949937	kata-deploy: Remove useless comment from CRI-O drop-in The comment adds absolutely nothing to the runtime handler added, and it'd make our life slightly harder to properly say which VMM is being used when setting the default `kata` handler. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:52 +01:00
Fabiano Fidêncio	dd9f5b07b9	kata-deploy: Use tomlq to configure containerd This saves us a lot of trouble on properly sed'ing content that may or may not be in the containerd configuration file. Fixes: #8638 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:49 +01:00
Fabiano Fidêncio	4f01f294bb	kata-deploy: Install `tomlq` to the base image This will help us to have an easier time playing with the containerd configuration, instead of having to sed the **** out of it, which is super error prone. `tomlq` is a tool that comes from https://github.com/kislyuk/yq, and that depends on `jq` to do the toml parsing / editing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-13 10:49:07 +01:00
James O. D. Hunt	d7c6219dfe	Merge pull request #8630 from jodh-intel/runtime-rs-ch-set-state-on-vm-stop runtime-rs: ch: Change state when VM stopped	2023-12-13 09:26:30 +00:00
Xuewei Niu	855adbc63b	Merge pull request #8634 from justxuewei/disable-packed-vq dragonball: Disable packed virtqueue for vhost-user devices	2023-12-13 17:03:05 +08:00
wvell	af4622fcc1	docs: Remove warning for cgroupsv2 only operating systems Removes warning for cgroupsv2 as it is not needed anymore according to #6259. Fixes #8650 Signed-off-by: wvell <w.vellema@slash2.nl>	2023-12-13 09:18:39 +01:00
Chelsea Mafrica	b46cb22270	static-checks: Direct Makefile to use new static checks Direct the Makefile to use the static checks script in the tests directory of the main Kata Containers repo so it is run in GHA. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:43:35 -08:00
Chelsea Mafrica	63636b869c	static-checks: Update copyright dates Some copyright dates were not updated with the most recent changes to code; update them. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:34:06 -08:00
Chelsea Mafrica	b11c772865	static-checks: Change dir for building tools Change directory for running make due to local errors when building with make -C. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 16:34:06 -08:00
James O. D. Hunt	2a518f0898	runtime-rs: ch: Change state when VM stopped Make the CH (Cloud Hypervisor) `stop_vm()` method check the VM state before attempting to stop the VM, and update the state once the VM has stopped. This avoids the method failing if called multiple times which will happen if the workload exits before the container manager requests that the container stop. This change ensures the CH driver finishes cleanly. Fixes: #8629. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 18:25:20 +00:00
Fabiano Fidêncio	39f5cea3b1	kata-deploy: Fix k0s cri notation comment We can safely assume we're using the newer notation, not the older one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-12 18:20:18 +01:00
Gabriela Cervantes	23f76653e5	metrics: Update command to run the tensorflow int8 benchmark This PR updates the command to run the tensorflow resnet50 int8 benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-12 16:24:09 +00:00
Gabriela Cervantes	8fd5ef7fb7	metrics: Update TensorFlow ResNet50 Int8 Dockerfile This PR updates the TensorFlow ResNet50 Int8 Dockerfile to use the proper python version for kata metrics. Fixes #8643 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-12 16:20:56 +00:00
James O. D. Hunt	1195692d3c	runtime-rs: ch: Move state handling to top-level APIs Move the state setting to the `Hypervisor` trait calls. This makes the code clearer. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 15:25:27 +00:00
James O. D. Hunt	5637f11a8c	kata-ctl: Add option to dump config files Add a `--show-default-config-paths` command line option for parity with `kata-runtime`. Note that this requires the `KataCtlCli.command` to be optional so that the user can run simply: ```bash $ kata-ctl --show-default-config-paths ``` ... without also specifying a (sub-)command. Fixes: #8640. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-12 14:20:04 +00:00
Chelsea Mafrica	a9d360728e	static-checks: Fix directory for github labels Fix paths for yqdir (where the install_yq.sh script currently is) so that static checks can run without error. Fixes #8595 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-12-12 02:16:35 -08:00
Xuewei Niu	86918e91b3	dragonball: Disable packed virtqueue for vhost-user devices The layout of packed virtqueue isn't supported by `Endpoint::negotiate()`. Communication between device and driver will fail because the virtqueue cannot be parsed if we don't disable the packed feature. This patch fixes this issue. Fixes: #8633 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-12-12 17:24:20 +08:00
Chao Wu	b079e1aabc	dragonball: add pci root bus and root device In order to follow up the PCI implementation in Dragonball, we need to add PCI root device and root bus support. The root device is a pseudo PCI root device that manages access to the PCI configuration space. The root bus mainly emulates the PCI root bridge and also creates the PCI root bus with the given bus ID with the PCI root bridge. fixes: #8563 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-12 11:43:14 +08:00
GabyCT	ee74fca92c	Merge pull request #8617 from GabyCT/topic/enabletestnerdctl tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs	2023-12-11 14:09:58 -06:00
David Esparza	584a26dab0	Merge pull request #8542 from dborquez/metrics_fix_deployment_cleaning metrics: cleans k8s iperf deployment when the test finishes.	2023-12-11 13:14:39 -06:00
Chao Wu	198e4adcb1	Merge pull request #8599 from openanolis/chao/fix_cargo_fmt dragonball: add --all for fmt ci	2023-12-12 00:20:21 +08:00
GabyCT	43410e1918	Merge pull request #8560 from GabyCT/topic/enablek8srs gha: k8s: Add cloud-hypervisor (runtime-rs) support	2023-12-11 09:42:49 -06:00
Hyounggyu Choi	ea2a0dc69d	Merge pull request #7769 from BbolroC/opa-multiarch rootfs: build OPA binary from source for ppc64le and s390x	2023-12-11 15:25:33 +01:00
Chao Wu	52f7a40e4e	dragonball: add --all for fmt ci Right now, the cargo fmt check in Dragonball only tests with the default features but not all features. This leaves some code unchecked by the fmt tool. This PR adds the --all option for the Dragonball CI and also fixes some code that was missed by cargo fmt --all. fixes: #8598 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-12-11 20:54:25 +08:00
Hyounggyu Choi	375c787e09	rootfs: build OPA binary from source for ppc64le and s390x This PR is to build a binary for OPA from source code for ppc64le and s390x. Fixes: #7616 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 12:59:48 +01:00
Hyounggyu Choi	16e2a50d17	Merge pull request #8624 from BbolroC/fix-runtime-class-check-qemu-se GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se	2023-12-11 12:58:00 +01:00
James O. D. Hunt	2a35541af7	Merge pull request #8592 from jodh-intel/static-checks-try-multiple-user-agents CI: static-checks: Try multiple user agents	2023-12-11 11:52:29 +00:00
Hyounggyu Choi	28c3e0e5f0	GHA: Fix kata-deploy-runtime-classes-check for kata-qemu-se This is to fix an error on kata-deploy-runtime-classes-check for kata-qemu-se. Fixes: #8623 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 10:30:00 +01:00
Hyounggyu Choi	b469dbf92f	Merge pull request #8622 from BbolroC/hotfix-k3s-kubectl-version GHA: Use --client=true for k3s kubectl version	2023-12-11 10:00:16 +01:00
Hyounggyu Choi	40f0c8fbb7	GHA: Use --client=true for k3s kubectl version This is to fix a broken usage for `k3s kubectl version` by switching an option `--short` to `--client=true`. Fixes: #8621 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-11 08:26:39 +01:00
Chao Wu	df7f416cb8	Merge pull request #8566 from liubogithub/liubo/dev/panic_fix runtime-rs: fix panic when hypervisor mismatches with configuration	2023-12-10 21:33:59 +08:00
Gabriela Cervantes	1662a3e859	common: Add cloud hypervisor in enabling hypervisor function This PR adds the cloud hypervisor in the enabling hypervisor function. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-08 21:32:00 +00:00
Chelsea Mafrica	1c42d94550	Merge pull request #6826 from gabevenberg/log-parser-rs kata-ctl: Moved log-parser-rs into kata-ctl	2023-12-08 11:33:09 -08:00
James O. D. Hunt	5d085a3042	CI: static-checks: Try multiple user agents Make the URL checker cycle through a list of user agent values until we hit one the remote server is happy with. This is required since, unfortunately, we really, really want to check these URLs, but some sites block clients based on their `User-Agent` (UA) request header value. And of course, each site is different and can change its behaviour at any time. Our strategy therefore is to try various UA's until we find one the server accepts: - No explicit UA (use `curl`'s default) - Explicitly no UA. - A blank UA. - Partial UA values for various CLI tools. - Partial UA values for various console web browsers. - Partial UA for Emacs's built-in browser. - The existing UA which is used as a "last ditch" attempt where the UA implies multiple platforms and browser. > Notes: > > - The "partial UA" values specify the UA "product" but not the > UA "product version" (we specify `foo` and not `foo/1.2.3`). We do > this since most sites tested appear to not care about the version. > This is as expected given that the version is strictly optional (see `[]`). > > - We now log all errors and display an error summary if none of the UAs > worked, in addition to the simple list of the URLs we believe to be > invalid. This should make future debugging simpler. `[]` - https://www.rfc-editor.org/rfc/rfc9110#section-10.1.5 Fixes: #8553. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 18:02:41 +00:00
James O. D. Hunt	3174c18772	docs: Remove problematic URL Removed the Azure Portal URL (https://portal.azure.com) since this causes problems with our static checks script: that URL returns HTTP 403 ("Forbidden") when queried using command-line tools like `curl(1)`, which is used by the static check script. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	3779261a99	docs: Fix whitespace Remove some extraneous whitespace. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	613def0328	CI: static-checks: Move curl to a separate function Split the call to `curl` in the URL checker out into a new `run_url_check_cmd()` function to make `check_url()` slightly clearer. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	6d859f97ee	CI: static-checks: Lint fixes Declare and then define a couple of variables separately. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	efa8e6547c	CI: static-checks: Check params have a value Check that the `check_url()` parameters have a value. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	563ea020b0	CI: static-checks: Fold long line Break up a long line a little to make it easier to read. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
James O. D. Hunt	3ad43df946	CI: static-checks: Improve markdown checker test Only attempt to build the markdown checker if it doesn't already exist. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-08 17:11:20 +00:00
Liu Bo	bf97051f11	runtime-rs: fix panic when hypervisor mismatches with configuration If a wrong configuration.toml file is used accidentally, the runtime-rs binary could panic because of unwrap(). This fixes the panic by returning errors instead of calling unwrap(). fixes: #8565 Signed-off-by: Liu Bo <liub.liubo@gmail.com>	2023-12-08 08:56:23 -08:00
Zvonko Kaiser	9d38f01c2f	Merge pull request #8612 from BbolroC/introduce-secret-inheritance-s390x GHA: make secrets inherited for build-kata-static-tarball-s390x	2023-12-08 17:32:47 +01:00
Gabriela Cervantes	f3eeab10ab	tests: nerdctl: Enable nerdctl tests for cloud hypervisor runtime-rs This PR enables the nerdctl tests for cloud hypervisor runtime-rs. Fixes #8616 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-08 16:12:36 +00:00
Hyounggyu Choi	636eef8907	GHA: make secrets inherited for build-kata-static-tarball-s390x This is to make GHA secrets inherited for the workflow titled `build-kata-static-tarball-s390x` to configure an environment variable `CI_HKD_PATH` for a `build-asset-boot-image-se` step. Fixes: #8611 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-08 13:55:45 +01:00
Chao Wu	5054e59ccb	Merge pull request #8429 from adamqqqplay/support-vhost-user-fs dragonball: introduce vhost-user-fs device	2023-12-08 17:20:52 +08:00
Hyounggyu Choi	588f639a69	Merge pull request #6755 from BbolroC/add-se-artifacts-to-main packaging: Add IBM Z SE artifacts to main	2023-12-08 05:17:38 +01:00
Gabe Venberg	69fdd05ce5	kata-ctl: Moved log-parser-rs into kata-ctl Log-parser-rs was always intended to become a sub-functionality of kata-ctl, but it was useful to develop it and initially merge it as a standalone program, and migrate it to a subcommand later. Fixes #6797 Signed-off-by: Gabe Venberg <gabevenberg@gmail.com>	2023-12-07 21:35:28 -06:00
David Esparza	b2577000e7	metrics: Expose iperf3 pods over a k8s network. A prerequisite for measuring kata network bandwidth is to run the iperf3 tool at the transport layer provided by a k8s service, which exposes a network that clients inside the cluster can use to contact Pods in the service. Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-12-07 18:07:05 -06:00
David Esparza	a062ba166b	metrics: cleans k8s iperf deployment when the test finishes. This PR fixes small issues like: 1. Cleaning up the k8s environment by removing the iperf test implementation even when the test fails. 2. Checks if the workload returned a result before generating an empty results json file as it was being done. 3. Removes the redundancy of calls to functions that process subtests and should compose the results json file only when all results are ready and not before. 4. The tcp service manifest was added to the server deployment which targets TCP port 5201. Fixes: #8534 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-12-07 18:02:39 -06:00
Archana Shinde	a5105b4227	Merge pull request #8582 from amshinde/runtime-rs-tryfrom-blkconfig Implement and use try_from for DiskConfig	2023-12-07 15:02:00 -08:00
Archana Shinde	458e91b289	runtime-rs: Update readme to indicate cloud-hypervisor support Since cloud-hypervisor is no longer built as an optional feature, lets mention cloud-hypervisor in the list of hypervisors supported by runtime-rs. Fixes: #8587 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-07 14:59:43 -08:00
GabyCT	0e0a7d9410	Merge pull request #8604 from GabyCT/topic/enablenerdctlrs gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI	2023-12-07 14:35:26 -06:00
Hyounggyu Choi	3fab1690a4	local-build: make strip support for cross-compilation This is to adjust a name of the binary `strip` to a target architecture for cross-compilation. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	f38c7f14c5	gha: remove build redundancy of kernel and rootfs-initrd It is to remove the build redundancy of `kernel` and `rootfs-initrd` by making `boot-image-se` built based on them at the second build stage. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	31db56207b	local-build: add support for key verification for IBM Secure Execution This is to make `build_se_image.sh` incorporate the key verification originally supported by `genprotimg`. It can be achieved by specifying two environment variables called `SIGNING_KEY_CERT_PATH` and `INTERMEDIATE_CA_CERT_PATH`. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	52bdc87fe9	local-build: make kernel parameters configurable This is to make kernel parameters configurable during the secure image build by adding an environment variable SE_KERNEL_PARAMS. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
Hyounggyu Choi	9ceb2c27e0	local-build: consider cross-compilation env This is to make a base builder image build genprotimg without a package manager under the cross-compilation environment. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 20:05:40 +01:00
David Esparza	298be4aa1c	Merge pull request #8594 from GabyCT/topic/updatedockerfilet metrics: Update TensorFlow ResNet FP32 dockerfile	2023-12-07 11:14:48 -06:00
Gabriela Cervantes	ce694b905b	tests: Fix indentation of gha-run script This PR fixes the indentation of gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:56:19 +00:00
Gabriela Cervantes	33b300431e	tests: Enable but do not run k8s tests for cloud hypervisor This PR enables but does not run k8s tests for cloud hypervisor for runtime-rs. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:39:15 +00:00
Gabriela Cervantes	acee3d8438	gha: k8s: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the kubernetes tests. Fixes #8559 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:33:59 +00:00
Gabriela Cervantes	50a5fa9a65	tests: Enable but do not run the nerdctl tests for cloud hypervisor This PR enables but does not run the nerdctl tests for cloud hypervisor runtime-rs until we find out how stable they are. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:29:51 +00:00
Gabriela Cervantes	e70b2ea95d	gha: nerdctl: Enable cloud hypervisor runtime-rs for nerdctl CI This PR enables the cloud hypervisor runtime-rs for the nerdctl gha CI. Fixes #8603 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-07 16:24:36 +00:00
Hyounggyu Choi	ad6aab9918	Merge pull request #8601 from BbolroC/conflict-handling-for-self-hosted-runners GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict	2023-12-07 12:17:31 +01:00
Hyounggyu Choi	0d5a970e54	GHA: remove GITHUB_WORKSPACE when workflow fails due to merge conflict It is to remove a GITHUB_WORKSPACE directory for self-hosted runners when a workflow fails due to the merge conflict. This will prevent the subsequent workflows from getting stuck in the same situation. Fixes: #8600 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-07 10:25:57 +01:00
Greg Kurz	501910d743	Merge pull request #8509 from zvonkok/stable-overlay deployment: Add stable overlay for kata-deploy.yaml	2023-12-07 09:43:41 +01:00
Huang Jianan	5629b7454f	dragonball: support vhost-user-fs in device manager This patch implements the virtio-fs device used for filesystem sharing and heavily based on the vhost-user protocol. Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com>	2023-12-07 11:59:07 +08:00
Archana Shinde	a661ac3a0e	runtime-rs: Implement and use try_from for DiskConfig Implement try_from trait function to convert runtime-rs BlockConfig to cloud-hypervisor DiskConfig. This can allow for code reuse in the future. Fixes: #8581 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-06 12:10:34 -08:00
Fabiano Fidêncio	c14e3096c8	Merge pull request #8580 from amshinde/runtime-rs-clh-network-hotplug runtime-rs: add network hotplug for clh	2023-12-06 20:50:04 +01:00
Gabriela Cervantes	56dddab04f	metrics: Update command to run tensorflow resnet fp32 benchmark This PR updates the command needed to run the tensorflow benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-06 17:02:10 +00:00
Gabriela Cervantes	62fdebeeb5	metrics: Update TensorFlow ResNet FP32 dockerfile This PR updates the python version for the TensorFlow ResNet FP32 dockerfile so the benchmark can run without issues. Fixes #8593 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-06 16:53:21 +00:00
GabyCT	3d149d3455	Merge pull request #8578 from GabyCT/topic/fixlinkconfig docs: Update config containerd url link	2023-12-06 10:40:29 -06:00
Zvonko Kaiser	16380558e0	deployment: Create a stable overlay for kata-deploy Fixes: #8508 Create a stable overlay for kata-deploy.yaml so we do not have to maintain two files, only one. Single source for both. This is also preparation for the helm-overlay Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-12-06 14:23:22 +00:00
Huang Jianan	2a1fc29e84	dragonball: add unit test for vhost-user-fs Add some test cases for vhost-user-fs function. Signed-off-by: Beiyue <beiyue@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-12-06 10:43:24 +08:00
Huang Jianan	d6cfbe9436	dragonball: support vhost-user-fs This patch implements the virtio-fs device used for filesystem sharing and heavily based on the vhost-user protocol. This vhost-user-fs device defines 5 parameters: - path: vhost-user socket path - tag: mount tag used from the guest to mount the filesystem - req_num_queues: number of request virtqueues - queue_size: depth of each virtqueue - cache_size: cache window size for dax This device needs to be defined before the VM instance is started, which can be done through the dbs-cli tool with --fs option: --fs '{ "sock_path":"/path/to/virtiofs.socket", "tag":"myfs", "num_queues":1, "queue_size":1024, "cache_size":0, "thread_pool_size":1, "cache_policy":"auto", "writeback_cache":true, "no_open":true, "xattr":true, "drop_sys_resource":false, "mode":"vhostuser", "fuse_killpriv_v2":true, "no_readdir":false, }' Fixes: #8428 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-12-06 10:43:17 +08:00
Archana Shinde	955dec06da	runtime-rs: add network hotplug for clh This is required for clh to work with nerdctl and docker. This fixes the issues seen with nerdctl while starting a container. However, container exit with docker is still broken due to an unrelated issue. Fixes: #8579 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-05 15:29:53 -08:00
Fabiano Fidêncio	b056683b7a	Merge pull request #8436 from Lu-Biao/main image-builder: bugfix incorrect partition location	2023-12-06 00:10:06 +01:00
Fabiano Fidêncio	2cd003156e	Merge pull request #8573 from fidencio/topic/gha-add-a-timeout-for-tests gha: basic-ci: Add a timeout for the tests	2023-12-05 22:20:49 +01:00
Fabiano Fidêncio	d149b9f9ca	Merge pull request #7231 from wainersm/measured_rootfs-improvements Build for measured rootfs improvements	2023-12-05 22:20:33 +01:00
Fabiano Fidêncio	f75f17c4ff	Merge pull request #8570 from fidencio/topic/gha-dragonball-enable-some-tests-but-do-not-run-them-yet gha: dragonball: Enable, but do not run, cri-containerd, stability, and devmapper tests	2023-12-05 20:00:24 +01:00
Jeremi Piotrowski	e2c6b8ae6e	Merge pull request #4743 from yuchen0cc/main mount: support checking multiple kinds of block device driver	2023-12-05 18:04:51 +01:00
Gabriela Cervantes	61b868692b	docs: Update config containerd url link This PR updates the config containerd url link in the containerd kata documentation. Fixes #8577 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-05 16:35:21 +00:00
Fabiano Fidêncio	05ce52d746	devmapper: dragonball: Enable, but do not run, the tests This will make life easier for dragonball developers to properly enable the tests once the tests are ready. Fixes: #8569 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
Fabiano Fidêncio	a8a156b1af	stability: dragonball: Enable, but do not run, the tests This will make life easier for dragonball developers to properly enable the tests once the tests are ready. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
Fabiano Fidêncio	16ad721eda	cri-containerd: dragonball: Enable, but do not run, the tests This will make life easier for dragonball developers to properly enable the tests once the tests are ready. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 15:29:23 +01:00
James O. D. Hunt	d9daadf15c	Merge pull request #8558 from jodh-intel/load-config-improvement runtime-rs: Show config files attempted on config load failure	2023-12-05 11:48:42 +00:00
Greg Kurz	1650d02b91	Merge pull request #8516 from Apokleos/vsock-dev move vsock device into device manager	2023-12-05 11:28:37 +01:00
James O. D. Hunt	93c0fc2ad3	Merge pull request #8551 from amshinde/runtime-rs-setns-clh runtime-rs: Launch cloud-hypervisor in given netns	2023-12-05 10:18:34 +00:00
James O. D. Hunt	d627893975	runtime-rs: Show config files attempted on config load failure PR #8483 changed the location of the rust runtime config files to `/etc/kata-containers/runtime-rs/`. However, if you haven't updated your system to create that directory, attempting to create a container using the rust runtime was giving the following cryptic message (formatted for easier reading): ``` failed to handler message try init runtime instance Caused by: 0: load config 1: load toml config 2: entity not found ``` Now, the message is as follows (again, reformatted for easier reading): ``` failed to handle message try init runtime instance Caused by: 0: load config 1: load TOML config failed (tried [ \"/etc/kata-containers/runtime-rs/configuration.toml\", \"/usr/share/defaults/kata-containers/runtime-rs/configuration.toml\", \"/opt/kata/share/defaults/kata-containers/runtime-rs/configuration.toml\" ]) ``` Fixes: #8557. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-05 09:10:18 +00:00
James O. D. Hunt	45c0364d4c	runtime-rs: Fix typo in task service "failed to handler message" -> "failed to handle message". Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-05 09:10:18 +00:00
Fabiano Fidêncio	a14f2fc180	gha: runk: Fix typo in the test name tracing -> runk Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 09:44:42 +01:00
Fabiano Fidêncio	1a74142a16	gha: basic-ci: Add a timeout for the tests This will ensure no job will be stuck forever, as we've noticed with a few jobs already. Fixes: #8572 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-05 09:42:46 +01:00
GabyCT	e8b28fed2a	Merge pull request #8540 from GabyCT/topic/fixctrdoc docs: Update cri installation url link	2023-12-04 17:36:33 -06:00
Archana Shinde	2df8144cfe	runtime-rs: Launch cloud-hypervisor in given netns Launch cloud-hypervisor binary in the netns provided at the prepare_vm stage. Fixes: #6441 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-12-04 13:02:43 -08:00
Hyounggyu Choi	511dd5feac	local-build: add support to build IBM Z SE image This is to add an artifact for IBM Z SE(TEE) to main. Fixes: #6754 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	4de8ef3d18	local-build: add build target boot-image-se This is to add a build target boot-image-se for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	a63a6959d1	local-build: install s390-tools in Dockerfile This is to install s390-tools including genprotimg during the docker build. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	6d0dabd81e	gha: build secure image for s390x release This is to add a build target boot-image-se with a host-key-document config for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:51 +01:00
Hyounggyu Choi	bb1d4adaa9	config: add SE configuration This is to add SE configuration which is used by kata runtime. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:08:49 +01:00
Gabriela Cervantes	2b05029347	docs: Update cri installation url link This PR updates the cri installation url link for the containerd documentation. Fixes #8539 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-04 20:07:49 +00:00
Hyounggyu Choi	8de4241d3b	kata-deploy: add kata-qemu-se runtimeclass This is to increase resources for relaxing the limitation of hotplug for SE. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:06:53 +01:00
Hyounggyu Choi	9ede2bcd95	local-build: differentiate build targets based on architecture This is to rule out unnecessary build targets for s390x. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-12-04 21:06:53 +01:00
GabyCT	1c00a9a6a9	Merge pull request #8524 from GabyCT/topic/addiperfinfo docs: Update iperf3 network documentation	2023-12-04 14:03:30 -06:00
GabyCT	1b204cc3cb	Merge pull request #8550 from GabyCT/topic/enableclhstability gha: Add cloud runtime rs as part of the stability tests	2023-12-04 11:37:58 -06:00
Gabriela Cervantes	dfc07d1c72	gha: stability: Add cloud-hypervisor (runtime-rs) support This PR adds the Cloud Hypervisor driver, integrated with the runtime-rs, as part of the stability tests. Fixes #8462 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-12-04 15:32:29 +00:00
Fabiano Fidêncio	8d7e0f7721	Merge pull request #8556 from fidencio/topic/kernel-add-tdx-guest-driver kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf	2023-12-04 15:13:57 +01:00
James O. D. Hunt	e4aebb4560	Merge pull request #8549 from jodh-intel/tdx-no-root libs: protection: x86_64: drop root requirement for querying	2023-12-04 13:03:10 +00:00
Chao Wu	1550ee6767	Merge pull request #8480 from openanolis/chao/add_dbs_pci dragonball: init dbs-pci lib with pci bus & pci conf	2023-12-04 18:08:40 +08:00
Fabiano Fidêncio	03c3f4275e	kernel: Add CONFIG_TDX_GUEST_DRIVER to the tdx.conf The driver enables the userspace interface to communicate with the TDX module to request the TDX guest details, like the attestation report. Fixes: #8555 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-12-04 10:25:59 +01:00
Biao Lu	b816dca3ed	image-builder: fix incorrect part start position The 'part_start' of image and dax_image should exactly specify the same location, according to the parted documentation, to exactly specify the location, the units of start and end should use MiB. https://www.gnu.org/software/parted/manual/parted.html#IEC-binary-units Fixes: #8435 Signed-off-by: Biao Lu <biao.lu@intel.com>	2023-12-04 17:20:26 +08:00
Chao Wu	52fd57e49a	Merge pull request #8301 from Apokleos/do-direct-volume runtime-rs: Enhancing DirectVolMount Handling with Patching Support	2023-12-04 16:49:46 +08:00
James O. D. Hunt	7beab11d9e	Merge pull request #8547 from jodh-intel/unbreak-logger libs:logging: Fix logger	2023-12-04 08:38:03 +00:00
alex.lyn	0fabfa336d	runtime-rs: bring support for legacy vsock device. Bring support for legacy vsock and add Vsock to the ResourceConfig enum type, and add the processing flow of the Vsock device to the prepare_before_start_vm function. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:51 +08:00
alex.lyn	6c08cf35d5	runtime-rs: Introduce prepare_vm_socket_config to VirtSandbox. Introduce prepare_vm_socket_config to VirtSandbox for vm socket config, including Vsock and Hybrid Vsock. Use the capabilities() trait of the hypervisor to get the vm socket supported in VMM. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:50 +08:00
alex.lyn	60f88da5e1	runtime-rs: add Capability of HybridVsockSupport for Hypervisor. Add Cap of HybridVsockSupport for hypervisors CLH and Dragonball which use hybrid-vsock, default for Qemu, which uses legacy vsock. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:50 +08:00
alex.lyn	c5178dd258	runtime-rs: Introduce Capability of HybridVsockSupport. Introduce HybridVsock Cap to judge which kind of vm socket will be supported by the Hypervisor. Use `is_hybrid_vsock_supported` to tell if a hypervisor supports hybrid-vsock; if not, it supports legacy vsock. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-12-04 15:54:29 +08:00
James O. D. Hunt	e1caca3e41	kata-ctl: Remove root requirement for "env" Remove the redundant `kata-ctl` `root` check when running the `env` command. This check duplicated the `GuestProtection` check, and that check is now no longer necessary anyway. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 15:55:45 +00:00
James O. D. Hunt	f05ada592f	libs: protection: x86_64: drop root requirement for querying It is no longer necessary to be `root` to query the guest protection (TDX) on `x86_64` systems, so drop the requirement. > Note: > > This change drops the `nix` `Uid` import required for the `root` check. > But at the same time it adds it for PPC64le since that implementation of > `available_guest_protection()` needs it and it was previously missing. Fixes: #8548. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 15:55:21 +00:00
Fabiano Fidêncio	852021e416	Merge pull request #8483 from fidencio/topic/move-rust-config-files-to-subdir-based-on-jodh-approach build/kata-deploy: Move rust runtime config files to runtime-rs directory -- based on #8445	2023-12-01 16:22:51 +01:00
James O. D. Hunt	f9f1d3a071	libs:logging: Fix logger PR #8311 inadvertently broke the logging since no log messages below the `Info` level are logged now, regardless of the requested log level. Resolve the issue by storing the requested log level in the `RuntimeComponentLevelFilter` and using that level in the `log()` function, rather than hard-coding `Info` as the default where no entry is found in the `FILTER_RULE` hashmap. Fixes: #8546. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-12-01 12:21:20 +00:00
yuchen.cc	1cd1558a92	mount: support checking multiple kinds of block device driver Device mapper is the only supported block device driver so far, which seems limiting. Kata Containers can work well with other block devices. It is necessary to enhance support for multiple kinds of host block devices. Fixes #4714 Signed-off-by: yuchen.cc <yuchen.cc@alibaba-inc.com>	2023-12-01 11:59:30 +08:00
Chelsea Mafrica	818b8f93b1	Merge pull request #8288 from cmaf/migrate-static-checks Migrate static checks	2023-11-30 17:44:16 -08:00
Chelsea Mafrica	207a7fef90	Merge pull request #7815 from cmaf/runtime-rs-ch-vsock runtime-rs: Add Hybrid VSOCK device handling for CH	2023-11-30 12:22:36 -08:00
GabyCT	2bd21f7831	Merge pull request #8531 from GabyCT/topic/fixiperfli metrics: Fix iperf parallel bandwidth limit	2023-11-30 13:47:00 -06:00
Chao Wu	b3da71f21e	dragonball: init dbs-pci lib with pci bus & pci conf This commit inits dbs-pci lib for Dragonball to use. It contains several implementations now: 1. PCI configuration space 2. PCI bus More info on the design & behavior of those two features can be found in the README of dbs-pci. fixes: #8479 Signed-off-by: Gerry Liu <gerry@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: Shifang Feng <fengshifang@linux.alibaba.com> Signed-off-by: Yang Su <yang.su@linux.alibaba.com> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Signed-off-by: Xin Lin <jingshan@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-30 23:40:26 +08:00
Dan Mihai	38f24c41c0	Merge pull request #8271 from microsoft/danmihai1/exec-test-failure tests: more k8s-exec-rejected debug output	2023-11-30 07:11:01 -08:00
Greg Kurz	48e5596186	Merge pull request #8456 from cheriL/8447/alpine_bash osbuilder: add pkg bash for alpine	2023-11-30 13:43:48 +01:00
Steve Horsman	c6110284d5	Merge pull request #8520 from stevenhorsman/hypervisor-ttrpc runtime: Update hypervisor generated code	2023-11-30 10:01:56 +00:00
Amulya Meka	3d5db65b2e	Merge pull request #8526 from Amulyam24/workflow-ppc gha: fix artefacts build on ppc64le	2023-11-30 15:00:06 +05:30
Fabiano Fidêncio	80fcc56cef	Merge pull request #8528 from fidencio/topic/stop-building-and-shipping-log-parser-rs tools: Stop building / shipping log-parser-rs	2023-11-30 09:14:10 +01:00
Fabiano Fidêncio	9b30d97885	Merge pull request #8533 from fidencio/topic/fix-invalid-cpu-topology-for-tdx Revert "runtime: confidential: Do not set the max_vcpu to cpu"	2023-11-30 09:06:45 +01:00
Amulyam24	6a922f0e37	gha: fix artefacts build on ppc64le Add step in the right place to prepare the runner for the builds/tests. Fixes: #8525 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-30 09:50:47 +05:30
soup	811ec07359	osbuilder: add pkg bash for alpine The bash component is required in the guest for debug console to work properly. Fixes: #8447 Signed-off-by: soup <lqh348659137@outlook.com>	2023-11-30 09:42:39 +08:00
Fabiano Fidêncio	f15e16b692	Revert "runtime: confidential: Do not set the max_vcpu to cpu" This reverts commit `b0157ad73a`. ``` commit `b0157ad73a` Refs: 3.3.0-alpha0-124-gb0157ad73 Author: Fabiano Fidêncio <fabiano.fidencio@intel.com> AuthorDate: Fri Aug 11 14:55:11 2023 +0200 Commit: Fabiano Fidêncio <fabiano.fidencio@intel.com> CommitDate: Fri Nov 10 12:58:20 2023 +0100 runtime: confidential: Do not set the max_vcpu to cpu We don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> ``` This commit was removing a requirement that was made previously, but due to the SMP issue we're facing with the QEMU used for TDX (see commit d1b54ede290e95762099fff4e0bcdad10f816126), QEMU will fail to start due to: ``` Invalid CPU topology: product of the hierarchy must match maxcpus: sockets (1) dies (1) * cores (1) * threads (1) != maxcpus (240)" ``` This has no effect on the SEV / SNP workflow and hopefully we'll be able to re-revert this soon enough, when this gets solved on the QEMU side. Last but not least, this is not a "clean" revert as we're using conf.NumVCPUs() instead of conf.NumVCPUs, to ensure we're dealing with uint32. Fixes: #8532 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-30 00:41:27 +01:00
Fabiano Fidêncio	1284b4e80d	tools: Stop building / shipping log-parser-rs This is a commit that's a pre-req for #6826, as that PR will merge log-parser-rs into kata-ctl, but that will result in a CI breakage. So, let's deal with the CI changes here, thanks to GHA and our favourite `pull_request_target` event, unblocking that PR to be merged. Fixes: #6797 (not really, but related). Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-30 00:32:10 +01:00
Gabriela Cervantes	37633d3cc2	metrics: Fix iperf parallel bandwidth limit This PR fixes the iperf parallel bandwidth limit for the kata metrics CI. Fixes #8530 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-29 19:59:45 +00:00
Dan Mihai	96deea52f2	tests: more k8s-exec-rejected debug output Print more information useful for debugging. Also, use a separate YAML file for this test, instead of reusing someone else's file. Fixes: #8270 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-11-29 18:05:15 +00:00
stevenhorsman	47b8c3181f	runtime: remote hypervisor updates to ttrpc - Update the remote hypervisor code to match the re-genned code for the ttrpc Hypervisor Service Fixes: #8519 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-29 18:04:40 +00:00
stevenhorsman	613c75ba8c	runtime: Update hypervisor generated code Update to use ttrpc_out instead of grpc_out Fixes: #8519 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-29 18:04:40 +00:00
GabyCT	1f1e5377e5	Merge pull request #8497 from GabyCT/topic/removemetricsstratovirt gha: Disable stratovirt for gha metrics	2023-11-29 11:16:53 -06:00
Fabiano Fidêncio	8fd39d11c4	tests: Adapt `enable_hypervisor` to the runtime-rs config location change As the configuration files for the runtime-rs based drivers are now placed in a different location than the golang ones, we should adapt this script accordingly. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:35 +01:00
Fabiano Fidêncio	38183acbcb	tests: Use `kata-ctl` instead of `kata-runtime` for runtime-rs `kata-ctl` is the tool for runtime-rs, and it should be used instead of `kata-runtime`. `kata-ctl` requires sudo, and that's the reason it's also been added as part of the calls. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:35 +01:00
Fabiano Fidêncio	a5a73a11cb	tests: Replace `kata-runtime kata-env` by `kata-runtime env` `kata-runtime env` is an alias for `kata-runtime kata-env`, and calling it with the `env` parameter allows us to easily extend the scripts to use `kata-ctl` instead of `kata-runtime` when dealing with runtime-rs. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-29 14:51:31 +01:00
Chelsea Mafrica	05efb23261	tests: update go.mod and go.sum Generate a go.sum file for tests. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-28 17:40:41 -08:00
Fabiano Fidêncio	30acb5a0c0	tests: nydus: Adapt the default config file for runtime-rs based drivers As we've done some changes in the runtime-rs based drivers to install their configuration into a different location, this should also be reflected as part of this test. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 20:37:59 +01:00
Chelsea Mafrica	6d9cb9325d	tests: update scripts for static checks migration Updates to scripts for static-checks.sh functionality, including common functions location, the move of several common functions to the existing common.bash, adding hadolint and xurls to the versions file, and changes to static checks for running in the main kata containers repo. The changes to the vendor check include searching for existing go.mod files but no other changes to expand the test. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	66f3944b52	tests: move github-labels to main repo Move tool as part of static checks migration. Fixes #8187 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Marco Vedovati <mvedovati@suse.com> Signed-off-by: Peng Tao <bergwolf@hyper.sh> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	7f3c12f1dd	tests: move spell check tool to main repo Move tool as part of static checks migration. Fixes #8187 Signed-off-by: Bo Chen <chen.bo@intel.com> Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Dan Middleton <dan.middleton@intel.com> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Eric Ernst <eric.ernst@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: Hui Zhu <teawater@antfin.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Jimmy Xu <xjmmyshcn@gmail.com> Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	8ad433d4ad	tests: move markdown check tool to main repo Move the tool as a dependency for static checks migration. Fixes #8187 Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Julio Montes <julio.montes@intel.com>	2023-11-28 11:13:55 -08:00
Chelsea Mafrica	eaa6b1b274	tests: move static checks and dependencies from tests Move static checks scripts and dependencies from tests to kata-containers repo. Fixes #8187 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: Bin Liu <bin@hyper.sh> Signed-off-by: Carlos Venegas <jos.c.venegas.munoz@intel.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Dan Middleton <dan.middleton@intel.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Derek Lee <derlee@redhat.com> Signed-off-by: Dov Murik <dovmurik@linux.ibm.com> Signed-off-by: Eric Ernst <eric_ernst@apple.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Fupan Li <fupan.lfp@antgroup.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com> Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com> Signed-off-by: Jon Olson <jonolson@google.com> Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com> Signed-off-by: Julio Montes <julio.montes@intel.com> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com> Signed-off-by: Marco Vedovati <mvedovati@suse.com> Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Snir Sheriber <ssheribe@redhat.com> Signed-off-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: Xu Wang <xu@hyper.sh> Signed-off-by: Yang Bo <bo@hyper.sh> Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-11-28 11:13:55 -08:00
Fabiano Fidêncio	61aa84b158	Revert "tests: k8s: Allow passing rust-runtime env var to kata-deploy" This reverts commit `44899d4cdf`, as we've decided to keep both golang and rust runtime installable and usable at the same time. The decision of having both runtimes installable and usable will help users to test and easily catch any possible differences between those runtimes, helping us to get on par with both implementations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:02:07 +01:00
James O. D. Hunt	158ca17ae7	kata-deploy: Add cloud-hypervisor Now that we have a separate Cloud Hypervisor configuration file for the rust runtime, add it to the kata-deploy. See: https://github.com/kata-containers/kata-containers/pull/8250 Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:02:06 +01:00
Fabiano Fidêncio	d4e00238ab	kata-deploy: Improve the logic for linking to the rust runtime This change for now doesn't do much, apart from making it easier to expand which runtimes should be linked to the runtime-rs containerd shim binary. Also, this matches the logic used for the config files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 18:01:27 +01:00
James O. D. Hunt	fc28deee0e	kata-deploy: Use rust runtime config files in runtime-rs directory Update `kata-deploy` to modify the rust runtime configuration files in their new `runtime-rs/` directory. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 18:01:25 +01:00
Gabriela Cervantes	9166d0aabb	docs: Update iperf3 network documentation This PR updates the iperf3 network documentation to include the parallel bandwidth. Fixes #8523 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-28 15:59:38 +00:00
Wainer dos Santos Moschetta	48bdca4c49	tests/k8s: add k8s-measured-rootfs.bats Implements the following test case: Scenario: Check incorrect hash fails Given I have a version of kata installed that has a kernel with the initramfs built and config with rootfs_verity.scheme=dm-verity rootfs_verity.hash=<incorrect hash of rootfs> set in the kernel_params When I try and create a container a basic pod Then The pod doesn't run And Ideally we'd get a helpful message to indicate why Currently on CI only qemu-tdx is built with measured rootfs support in the kernel, so the test is restricted to that runtimeclass. Fixes #7415 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:54 -03:00
Wainer dos Santos Moschetta	1eae657b91	tests/k8s: add set_node() to lib.sh Use this new function to set the node where the pod should be scheduled to. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	c6075c8627	tests/k8s: add setup common Bring the setup_common() from CCv0 branch test's integration/kubernetes/confidential/tests_common.sh. It should be used to reduce boilerplate in the setup() of the tests. Unlike the original code, this won't export the `test_start_time` variable as it wouldn't be accurate to grab logs from the worker nodes due to date/time mismatch between the running tests machine and the worker node. The function exports the `node` variable which holds the name of a random node which has kata installed. Apart from that, it exports the `node_start_time` which captures the date/time when the test started, relative to the `node`. Tests that should inspect the logs can schedule pods/resources to the `node` and use `node_start_time` as the value reference to grep the logs. Fixes #7590 Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	220a2d9a15	tests/k8s: add assert_logs_contain() to lib.sh Bring the assert_logs_contain() from CCv0 branch tests' integration/kubernetes/confidential/lib.sh. Introduced the print_node_journal() which uses `kubectl debug` to print the systemd's journal of a k8s's node. Fixes #7590 Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	9a9c7a5c6f	tests/k8s: add set_metadata_annotation() to lib.sh This new function allows adding annotations to the metadata section in a yaml configuration file. Co-authored-by: Ryan Savino <ryan.savino@amd.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	a13eecf7f3	runtime(-rs): add clean-generated-files target The new clean-generated-files make target allows for removing the generated files (including the configuration.toml files). The tools/packaging/static-build/shim-v2/build.sh script now uses that target to always force the re-generation of those files. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	36ea1b8ee7	tests/k8s: add new_pod_config() to lib.sh Copied the new_pod_config() and pod-config.yaml.in from CCv0 branch tests' integration/kubernetes/confidential/tests_common.sh and fixtures. Unlike the original version, new_pod_config() now gets the runtimeclass by parameter as the RUNTIMECLASS environment variable seems not broadly used on main branch's CI. The pod-config.yaml.in was changed as the diff shows below. In particular the imagePullSecrets was removed to avoid it throwing a warning on the pod's log. ``` --- a/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in +++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in @@ -5,12 +5,10 @@ apiVersion: v1 kind: Pod metadata: - name: busybox-cc + name: test-e2e spec: runtimeClassName: $RUNTIMECLASS containers: - - name: nginx + - name: test_container image: $IMAGE - imagePullPolicy: Always - imagePullSecrets: - - name: cococred \ No newline at end of file + imagePullPolicy: Always \ No newline at end of file ``` Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	428daf9ebc	tests/k8s: add utilities functions for the tests The following functions were copied from CCv0's branch test's integration/kubernetes/confidential/lib.sh. I made only small refactorings (shortened their names and delinted shellcheck warnings): - k8s_delete_all_pods_if_any_exists() - k8s_wait_pod_be_ready() - k8s_create_pod() - assert_pod_fail() Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com> Co-authored-by: Jordan Jackson <jordan.jackson@ibm.com> Co-authored-by: Megan Wright <Megan.Wright@ibm.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Co-authored-by: Wang, Arron <arron.wang@intel.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	ba4f806c30	initramfs: re-wrote devices checking on init.sh Re-wrote the logic of init.sh to follow the rules: * the root device MUST exist always because it will be either mounted or verified (then mounted) * if rootfs verifier is enabled then the hash device MUST exist. Avoid the case where dm-verity is set but the hash device does not exist and so the verification is silently skipped Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	72ef82368c	shim-v2: ensure root hash exist when measured rootfs When measured rootfs is enabled then the shim-v2 build should find the guest rootfs hash file, otherwise it might (silently) generate configuration files with empty hash. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	1465e58854	kernel: ensure initramfs exist when measured rootfs The KATA_BUILD_CC variable plus the existence (or not) of the initramfs were used to determine whether to build the kernel for measured rootfs or not. Currently the variable MEASURED_ROOTFS has been used to trigger the feature build and when it is activated it should expect the initramfs to exist. In other words, this changed the kernel build so that if `MEASURED_ROOTFS=yes` then the initramfs file must exist and be found. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	4dbba5215f	shim-v2: moved measured rootfs logic to its builder Moved the measured rootfs logic from kata-deploy-binaries.sh to the shim-v2's builder script so that the former gets less bloated with component-specific code. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	34be78df19	kernel: moved measured rootfs logic to its builder Moved the measured rootfs logic from kata-deploy-binaries.sh to the kernel's builder script so that the former gets less bloated with component-specific code. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:53 -03:00
Wainer dos Santos Moschetta	3f16d29593	kernel: measured rootfs as argument to build-kernel.sh By convention the caller of tools/packaging/kernel/build-kernel.sh changes the script behavior by passing arguments, whereas, for measured rootfs it has used an environment variable (MEASURED_ROOTFS). This refactors the script so that the caller now must pass the "-m" argument to enable the build of the kernel with measured rootfs support. Fixes #6674 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-11-28 11:21:51 -03:00
Fabiano Fidêncio	80860478bf	runtime-rs: Remove the golang config paths As the configuration files are different, we can safely remove those as any new installation of the binary should also bring in the new configurations. This makes things less error-prone in the future, as we're ensuring that the rust runtime will only be reading the rust configuration files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-28 15:16:53 +01:00
James O. D. Hunt	b86ab5aa21	runtime-rs: Update list of config paths to check Update the `DEFAULT_RUNTIME_CONFIGURATIONS` list to include a number of rust runtime specific paths to try to load before checking the "traditional" (golang) runtime configuration paths. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 15:16:53 +01:00
James O. D. Hunt	89ef464b7c	build: Install rust config files to runtime-rs directory Install the rust runtime configuration files to a `runtime-rs/` directory to distinguish them from the golang config files (which may have a different syntax). The default values mean that the rust config files are now installed to `/opt/kata/share/defaults/kata-containers/runtime-rs/` rather than `/opt/kata/share/defaults/kata-containers/`. See: https://github.com/kata-containers/kata-containers/issues/6020 Fixes: #8444. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-28 15:16:53 +01:00
alex.lyn	fe68f25bea	runtime-rs: enhancement of vfio volume. Reimplement vfio volume into direct_volume and do alignment of rawblock/spdk volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	e3fd403126	runtime-rs: enhancement of spdk volume. (1) Add enum DirectVolumeType for direct volumes. (2) Reimplement spdk volume into direct_volume and do alignment of rawblock volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	f973729029	runtime-rs: Enhancing DirectVolMount Handling for current Infra. The current infra (K8S, CSI, CRI, Containerd) for Kata containers is unable to properly handle direct volumes, resulting in the need for workarounds like searching/comparison and then patching up the volume type. In this commit, the handling method is reimplemented to support raw block volumes whose backends may be rawdisk or other format files. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:08:05 +08:00
alex.lyn	e3becea566	runtime-rs: add support kata/multi-containers sharing one vfio volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-28 10:07:23 +08:00
Steve Horsman	891f488ee3	Merge pull request #8501 from Amulyam24/containerd-tests gha: add cri-containerd workflow for ppc64le	2023-11-27 17:22:59 +00:00
James O. D. Hunt	45cc417a4e	Merge pull request #8461 from jodh-intel/update-codeowners CODEOWNERS: Expand scope	2023-11-27 15:38:39 +00:00
Fabiano Fidêncio	bb4c51a5e0	Merge pull request #8494 from ChengyuZhu6/kata_virtual_volume runtime: Pass `KataVirtualVolume` to the guest as devices in go runtime	2023-11-27 16:02:28 +01:00
Steve Horsman	bee6fba5c7	Merge pull request #8459 from Amulyam24/workflow-1 github: add workflows for building and publishing kata artefacts on ppc64le	2023-11-27 14:31:20 +00:00
Amulyam24	754aec02c3	gha: add cri-containerd workflow for ppc64le This PR adds workflow to run containerd tests on Power as a part of CI migration. Fixes: #8500 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-27 17:58:58 +05:30
alex.lyn	6af0592274	runtime-rs: Add vsock device in device manager. (1) Implement Device Trait for vsock device. (2) add vsock device in device manager. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:23:18 +08:00
alex.lyn	1a6b45d3b7	runtime-rs: Reintroduce Vsock and add it to the DeviceType enum As vsock device will be used in Qemu or other VMMs, the Vsock is reintroduced to DeviceType enum. Fixes: #8474 Signed-off-by: Pavel Mores <pmores@redhat.com> Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:12:44 +08:00
alex.lyn	e31dbc94a5	runtime-rs: remove vhost_fd from VsockConfig and make it cloneable. Currently encounters difficulty in utilizing the clone operation on VsockConfig due to the implicit management of the vhost fd within the runtime-rs. This responsibility should be delegated to the VMM(especially QEMU) child process, as it's not runtime-rs core responsibilities. We'll remove the member vhost_fd from VsockConfig and make the VsockConfig/VsockDevice Cloneable. Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:11:21 +08:00
alex.lyn	eb90962b27	runtime-rs: introduce a new function generate_vhost_vsock_cid. Introduce a new function generate_vhost_vsock_cid to generate a guest CID and set guest CID for vsock fd. This commit doesn't introduce any functional change; it's just split from the previous VsockDevice::new(). Fixes: #8474 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-27 15:06:58 +08:00
alex.lyn	b952c5c5ce	runtime-rs: add support kata/multi-containers sharing one spdk volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 21:13:03 +08:00
alex.lyn	17d2d465d1	runtime-rs: re-organize the volumes with adding new direct_volumes. Add a new dir direct_volumes containing spdk, rawblock and vfio volume. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 21:04:55 +08:00
alex.lyn	6731466b13	runtime-rs: set a standard NotFound when direct volume path not found. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-25 19:51:12 +08:00
alex.lyn	d23867273f	runtime-rs: split the block volume into block and rawblock volume (1) rawblock volume is directvol mount type. (2) block volume is based on the bind mount type. Fixes: #8300 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-24 23:30:30 +08:00
Amulyam24	ae2c0c5696	github: add workflows for building and publishing kata artifacts on ppc64le Adds workflows for building kata static tarball and releasing it. Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-24 15:53:38 +05:30
ChengyuZhu6	5318afe273	runtime: support to create VirtualVolume rootfs storages 1) Creating storage for all `io.katacontainers.volume=` messages in rootFs.Options, and then aggregates all storages into `containerStorages`. 2) Creating storage for other data volumes and push them into `volumeStorages`. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:22:55 +08:00
ChengyuZhu6	0b4f7c2ee7	runtime: redefine and add functions to handle VirtualVolume to storage 1) Extract function `handleBlockVolume` to create Storage only. 2) Add functions to handle KataVirtualVolume device and construct corresponding storages. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:07:32 +08:00
ChengyuZhu6	bd099fbda9	runtime: extend SharedFile to support multiple storage devices To enhance the construction and administration of `Katavirtualvolume` storages, this commit expands the 'sharedFile' structure to manage both rootfs storages(`containerStorages`) including `Katavirtualvolume` and other data volumes storages(`volumeStorages`). NOTE: `volumeStorages` is intended for future extensions to support Kubernetes data volumes. Currently, `KataVirtualVolume` is exclusively employed for container rootfs, hence only `containerStorages` is actively utilized. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-23 23:05:14 +08:00
ChengyuZhu6	e4f33ac141	runtime: add functions to create devices in KataVirtualVolume The snapshotter will place `KataVirtualVolume` information into 'rootfs.options' and commence with the prefix 'io.katacontainers.volume='. The purpose of this commit is to transform the encapsulated KataVirtualVolume data into device information. Fixes: #8495 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Feng Wang <feng.wang@databricks.com> Co-authored-by: Samuel Ortiz <sameo@linux.intel.com> Co-authored-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-11-23 23:05:13 +08:00
Dan Mihai	756022787c	Merge pull request #8239 from Sumynwa/sumsharma/fix_configmap_update_propagation runtime: Fix configmap/secrets updates with FS sharing disabled	2023-11-23 06:50:53 -08:00
Chelsea Mafrica	98aa291c9e	runtime-rs: Add Hybrid VSOCK device handling for CH Update cloud hypervisor implementation to allow hybrid vsock device to be handled. Fixes #6692 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-11-22 14:42:09 -08:00
Gabriela Cervantes	8839ca93ba	gha: Disable stratovirt for gha metrics This PR disables the stratovirt for gha metrics. Fixes #8496 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-22 16:17:31 +00:00
briwan01	231b9dfd9d	runtime-rs/clh: Fix unable to boot container In the case of Cloud Hypervisor running on arm64 architecture, only arm AMBA UART (pl011) is supported as the TTY. Consequently, when enabling Hypervisor debug mode, it's essential to configure the console as "ttyAMA0" rather than "ttyS0". Fixes: #8381 Signed-off-by: briwan01 <brian.wang@arm.com>	2023-11-22 17:52:11 +08:00
GabyCT	358f32e8bb	Merge pull request #8467 from GabyCT/topic/fixresult metrics: Fix result finding in tensorflow benchmark	2023-11-21 13:41:46 -06:00
Fabiano Fidêncio	45a41c3431	Merge pull request #8481 from ChengyuZhu6/guest-kernel kernel: backport erofs patch to 6.1.52 guest kernel	2023-11-21 12:22:24 +01:00
Fabiano Fidêncio	8425c78c91	Merge pull request #8476 from fidencio/topic/gha-pass-rust-runtime-to-kata-deploy tests: k8s: Allow passing rust-runtime env var to kata-deploy	2023-11-21 11:09:01 +01:00
Chao Wu	6a6c3c53b5	Merge pull request #8450 from adamqqqplay/vhost-user-general dragonball: add vhost-user connection management logic	2023-11-21 16:05:17 +08:00
ChengyuZhu6	6de01eacfd	kernel: backport erofs patch to 6.1.52 guest kernel Backport the erofs patch from linux kernel to solve the error #8083 Fixes: #8083 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Co-authored-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2023-11-21 15:22:40 +08:00
Amulyam24	d8a8cc4491	tools: install oras from source on ppc64le Since the release is not yet out for ppc64le, build oras from source and use it. Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-21 11:38:20 +05:30
Amulyam24	08f3603123	tools: fix static build of qemu and shimv2 on ppc64le - statically linked qemu requires slof.bin to run, hence remove it from blacklist - By default, initrd is used for Power, modify the configuration.toml accordingly Fixes: #8458 Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>	2023-11-21 11:38:20 +05:30
Alex.Lyn	4fd2914a33	Merge pull request #7932 from Apokleos/wrap-virtiofs-in-dm runtime-rs: bringing virtio-fs device in device-manager	2023-11-21 13:48:15 +08:00
Huang Jianan	a9571398a6	dragonball: add test utils for vhost-user The test utils will be used by the upcoming feature tests: vhost-user-net, vhost-user-blk and vhost-user-fs. Signed-off-by: Beiyue <beiyue@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-11-21 09:51:56 +08:00
Qinqi Qu	a6a399d5bc	dragonball: add vhost-user connection management logic The vhost-user connection management logic will be used by the upcoming features: vhost-user-net, vhost-user-blk and vhost-user-fs. Fixes: #8448 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: Qinqi Qu <quqinqi@linux.alibaba.com> Signed-off-by: Huang Jianan <jnhuang@linux.alibaba.com>	2023-11-21 09:51:48 +08:00
Fabiano Fidêncio	9445a967b6	Merge pull request #8471 from ChengyuZhu6/kata-virtual-volume runtime: Introduce `KataVirtualVolume` structure into go runtime	2023-11-20 21:58:27 +01:00
Fabiano Fidêncio	8002de895a	Merge pull request #8439 from fidencio/topic/kata-manager-install-a-given-kata-tarball utils: kata-manager: Allow installing kata from a given tarball	2023-11-20 20:02:25 +01:00
Wainer Moschetta	728565d1e4	Merge pull request #7046 from stevenhorsman/remote-hypervisor-cherry-picks CC: Remote hypervisor merge to main	2023-11-20 15:22:37 -03:00
Chao Wu	5ee8829700	Merge pull request #8451 from openanolis/chao/pci	2023-11-21 00:29:22 +08:00
Fabiano Fidêncio	41f3f6f93e	Merge pull request #8465 from justxuewei/rename-virtio dragonball: Uniform the spelling of Virtio	2023-11-20 16:31:33 +01:00
Hyounggyu Choi	506b127df8	Merge pull request #8478 from BbolroC/set-default-allowed_hypervisor_annotations kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS	2023-11-20 15:39:56 +01:00
alex.lyn	fe62e656a7	runtime-rs: Name the ShareFs Mount Option type more accurately Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-20 20:05:50 +08:00
alex.lyn	856315ff87	runtime-rs: bringing virtio-fs device in device-manager It mainly focus on the two parts: (1) redesign the ShareFsConfig with ShareFsMountConfig The device mount operation must depend on the fact that sharefs device exists, and re-design the structure of SharesFsConfig and move the ShareFsMountConfig into it with Option type, which is to describe the relation between ShareFsConfig and ShareFsMountConfig. (2) move virtiofs into device manager Currently, virtio-fs is still outside of the device manager. To do Enhancement of device manager, it will bring virtio-fs device in device-manager for unified management Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-20 20:04:47 +08:00
Chao Wu	b3318e59eb	Merge pull request #8332 from Apokleos/bugfix-directvol-multicontainers runtime-rs/bugfix: kata pod with multi-containers sharing one direct volume	2023-11-20 19:37:58 +08:00
Hyounggyu Choi	c489f1f504	kata-deploy: Set a default value for ALLOWED_HYPERVISOR_ANNOTATIONS As a follow-up PR for #8404, this is to set a default value for an environment variable `ALLOWED_HYPERVISOR_ANNOTATIONS`. This will prevent a pod launching without an explicit configuration for the variable from getting into a `CrashLoop` state. Fixes: #8477 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-20 12:33:34 +01:00
Chao Wu	ee55897827	fmt: refactor in pci & balloon 1. merge hashmap get logic according to Xuewei suggestion. 2. do cargo fmt Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-20 17:53:51 +08:00
Chao Wu	baf3db9e6e	Dragonball: add PCI bus and PCI interrupt support in mptable Spec In order to support PCI VFIO functionality in Dragonball, we should first add PCI bus and PCI device Interrupt information in Dragonball mptable setup process. This patch adds: 1. pci_legacy_irqs transferred to setup_mptable function. 2. pci bus support in mptable mem 3. pci interrupt support in mptable mem fixes: #8449 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-20 17:53:51 +08:00
Xuewei Niu	c305634b4e	dragonball: Uniform the spelling of Virtio The changes are: - VirtIoError -> VirtioError - VirtIoResult -> VirtioResult - VirtIoDevice -> VirtioDevice Fixes: #8464 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-20 17:00:58 +08:00
Fabiano Fidêncio	44899d4cdf	tests: k8s: Allow passing rust-runtime env var to kata-deploy This will be used for selecting the correct runtimes and runtimeclasses to be deployed with kata-deploy. Fixes: #8475 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-20 09:13:05 +01:00
ChengyuZhu6	1353b14e6c	runtime: Add KataVirtualVolume struct in runtime Add the corresponding data structure in the runtime part according to kata-containers/kata-containers/pull/7698. Fixes: #8472 Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-11-19 13:30:32 +08:00
Greg Kurz	110574353d	Merge pull request #8345 from beraldoleal/issues/8343 Fixes make check errors	2023-11-17 17:38:29 +01:00
Gabriela Cervantes	37916e7a58	metrics: Fix result finding This PR fixes the result finding for the general throughput for the tensorflow benchmark. Fixes #8466 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-17 15:59:51 +00:00
stevenhorsman	ebf9d2725a	kata-deploy: Add remote shim - Add remote to the list of shims in kata-deploy and kata-cleanup Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-11-17 13:38:49 +00:00
Fabiano Fidêncio	d5cf169adf	kata-deploy: Add missing kata-remote runtimeclass It's CCv0 specific for now, and it's needed as the Operator is now delegating the runtimeclass creation to the kata-deploy daemonset. Fixes: #7550 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> (cherry picked from commit `2df6cb7609`)	2023-11-17 13:34:40 +00:00
Pradipta Banerjee	39e8c84269	runtime: Add support for key annotations to remote hyp In order to support different pod VM instance type via remote hypervisor implementation (cloud-api-adaptor), we need to pass machine_type, default_vcpus and default_memory annotations to cloud-api-adaptor. The cloud-api-adaptor then uses these annotations to spin up the appropriate cloud instance. Reference PR for cloud-api-adaptor https://github.com/confidential-containers/cloud-api-adaptor/pull/1088 Fixes: #7140 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> (based on commit `004f07f076`)	2023-11-17 13:33:27 +00:00
Yohei Ueda	2910e333a8	runtime: Use static resource in remote hypervisor This patch updates the template configuration file for the remote hypervisor to set static_sandbox_resource_mgmt to be true. The remote hypervisor uses the peer pod config to determine the sandbox size, so requires this to be set to true by default. Fixes: #6616 Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (based on commit `938447803b`)	2023-11-17 13:33:27 +00:00
stevenhorsman	26d56678a9	config: Add initial remote hypervisor config - Remote hypervisor template config - Add annotation enablement for machine_type, default_memory and default_vcpus for flexible instance types Fixes: #6349 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (based on commits `7c9a791d67` and `335a456425`)	2023-11-17 13:33:24 +00:00
stevenhorsman	ad63439a3e	runtime: Update the remote hypervisor config Add the SELinux setting to ensure it is passed through to the remote hypervisor Fixes: #5936 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (based on commit `3ef2fd1784`)	2023-11-17 13:32:52 +00:00
Lei Li	50e0d43dad	runtime: Support privileged containers in peer pod VM This patch fixes the issue of running containers with privileged as true. See the discussion at this URL for the details. https://github.com/confidential-containers/cloud-api-adaptor/issues/111 Signed-off-by: Lei Li <cdlleili@cn.ibm.com> (based on commit `c3e6b66051`)	2023-11-17 13:32:52 +00:00
Yohei Ueda	57d4dd8e57	runtime: Support the remote hypervisor type This patch adds the support of the remote hypervisor type. Shim opens a Unix domain socket specified in the config file, and sends TTRPC requests to an external process to control sandbox VMs. Fixes #4482 Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (based on commit `f9278f22c3`)	2023-11-17 13:32:49 +00:00
Yohei Ueda	8ac9a22097	runtime: Add hypervisor proto to support peer pod VMs This patch adds a protobuf definition of the remote hypervisor type. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> (based on commit `150e8aba6d`)	2023-11-17 13:31:09 +00:00
Fabiano Fidêncio	f8322ffad2	Merge pull request #7796 from WenyuanLau/7794/StratoVirt_VMM_support StratoVirt: add support for a lightweight VMM StratoVirt in Kata	2023-11-17 10:53:17 +01:00
Fabiano Fidêncio	d6d9b45007	Merge pull request #7931 from BbolroC/migrate-to-gha-s390x tests\|gha: add containerd and k8s tests for s390x	2023-11-17 10:24:14 +01:00
Sumedh Alok Sharma	4aaf54bdad	runtime: Fix configmap/secrets update propagation with FS sharing disabled This PR fixes k8s configmap/secrets etc update propagation when filesystem sharing is disabled. The commit introduces the changes below, with some limitations: - creates new timestamped directory in guest - updates the '..data' symlink - creates user visible symlinks to newly created secrets. - Limitation: The older timestamped directory and stale user visible symlinks exist in guest due to missing DELETE api in agent. Fixes: #7398 Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>	2023-11-17 13:01:23 +05:30
Hyounggyu Choi	0c7aa1f307	gha: Set nightly test for s390x to 5 UTC This is to push back the time for the s390x nightly test to 5 a.m. UTC. Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-17 05:47:44 +01:00
Hyounggyu Choi	ffe1ea52cf	tests\|gha: add containerd and k8s tests for s390x As part of the CI migration, this PR is to add workflows for containerd and k8s for s390x. Fixes: #7930 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-16 18:14:26 +01:00
GabyCT	8586308dcd	Merge pull request #8453 from GabyCT/topic/udpreadme metrics: Add iperf udp information to README	2023-11-16 10:38:56 -06:00
GabyCT	494174a98e	Merge pull request #8421 from GabyCT/topic/enablestressng tests: Enable stressng scalability test	2023-11-16 10:25:05 -06:00
James O. D. Hunt	4a4fc9c648	CODEOWNERS: Expand scope Improve the `CODEOWNERS` file by specifying more groups. Since GitHub automatically checks the `CODEOWNERS` file when a PR is created and adds all matching groups as reviewers for the PR, this may help reduce the PR backlog since the right people will be alerted and requested to review the PR. That should improve the quality of reviews (and thus the quality of the landed code). It may also have a positive effect on PR velocity. > Note: > > This PR combines the other `CODEOWNERS` files so we have > a single, visible, top-level file. See: https://github.com/kata-containers/community/issues/253 Fixes: #3804. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-16 16:09:20 +00:00
Fabiano Fidêncio	10996f3bbb	Merge pull request #8460 from ldoktor/artifacts gha: Keep kata tarballs for 15 days	2023-11-16 13:56:25 +01:00
Liu Wenyuan	c77e990c3e	tests: Enable tests for StratoVirt hypervisor This commit enables StratoVirt hypervisor to be tested in kata GHA, including k8s, metrics, cri-containerd, nydus and so on. Meanwhile, adding some unit tests for StratoVirt to make sure it works. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	14d8790d83	kata-deploy: Add StratoVirt support to deploy process Allow kata-deploy process to pull StratoVirt from release binaries, and add them as a part of kata release. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	9542211e71	configuration: add configuration for StratoVirt hypervisor. Add configuration-stratovirt.toml.in to generate the StratoVirt configuration, and parser to deliver config to StratoVirt. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	561c85be54	build: Makefile for StratoVirt hypervisor Add support for building StratoVirt hypervisor, including x86_64 and arm64. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:26 +08:00
Liu Wenyuan	26966c8469	virtcontainers: Add StratoVirt as a supported hypervisor Initial support of the MicroVM machine type of StratoVirt hypervisor for the kata go runtime. Fixes: #7794 Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>	2023-11-16 20:47:24 +08:00
Fabiano Fidêncio	edb791315e	Merge pull request #7987 from BbolroC/nightly-ci-s390x tests\|gha: add nightly tests for s390x	2023-11-16 11:45:32 +01:00
Lukáš Doktor	8959e3ca05	gha: Keep kata tarballs for 15 days these tarballs are useful for debugging and re-running jobs, keep them for 15 days. Fixes: #8000 Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>	2023-11-16 10:35:20 +01:00
Gabriela Cervantes	9cc6908b09	stability: Update stressng to run on the gha This PR updates the stressng test to run on the gha for kata CI. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 19:34:36 +00:00
Gabriela Cervantes	9d8eb298c3	metrics: Add iperf udp information to README This PR adds the iperf udp information to the network README for the kata metrics CI. Fixes #8452 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 15:22:06 +00:00
Gabriela Cervantes	4b7854b668	stability: Add missing dependencies This PR adds missing dependencies to run stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 14:51:14 +00:00
Gabriela Cervantes	79177bb9cb	tests: Enable stressng scalability test This PR enables the stressng scalability test for kata CI. Fixes #8420 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-11-15 14:51:14 +00:00
Xuewei Niu	f18794d880	Merge pull request #8426 from justxuewei/vhost-rm-virtio-net dragonball: Remove vhost-net dependency on virtio-net	2023-11-15 10:39:27 +08:00
alex.lyn	ba632ba825	runtime-rs: kata with multi-containers sharing one direct volume When multiple containers in a kata pod share one direct volume, it's important to make sure that the corresponding block device is only mounted once in the guest. This means that there should be only one mount entry for the device in the mount information. Fixes: #8328 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-15 10:37:01 +08:00
alex.lyn	d7594d830c	runtime-rs: correct the path from cid to device_id. When a direct volume is used by multiple containers in Kata, generating many shared paths with cids will cause an IO error because the direct volume is mounted more than once. To correct it, use the device_id instead of cid, which ensures that the guest only mounts the FS once. Fixes: #8328 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-15 10:30:39 +08:00
Fabiano Fidêncio	906f6b7380	Merge pull request #8431 from UiPath/fix-vsock-packets-drop kernel: Fix vsock packets drop when the driver initializes	2023-11-14 18:52:53 +01:00
Fabiano Fidêncio	1699b84f13	utils: kata-manager: Remove $enable_debug from the install_kata call This was added as part of `d4d65bed38`, but install_kata has never actually used the passed enable_debug var. With this in mind, let's just remove it. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-14 17:34:03 +01:00
Fabiano Fidêncio	38d2edd83b	utils: kata-manager: Allow installing kata from a given tarball With this change, we give the users the chance to try kata-containers with their own pre-built tarball. This will become very useful in the CI context, as we won't be downloading a specific version of kata-containers, but rather installing whatever was built in previous steps of the CI pipeline. Fixes: #8438 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-14 17:34:01 +01:00
Fabiano Fidêncio	fd9b6d6837	Merge pull request #7623 from fidencio/topic/runtime-improve-vcpu-allocation-on-host-side runtime: Improve vCPU allocation for the VMMs	2023-11-14 14:10:54 +01:00
Alexandru Matei	bfd1ce30e1	kernel: Fix vsock packets drop when the vsock driver starts The virtio vsock driver has a small window during initialization where it can silently drop replies to connection requests. Because no reply is sent, kata waits for 10 seconds and in the end it generates a connection timeout error in HybridVSockDialer. Fixes: #8291 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-11-14 11:02:52 +02:00
Xuewei Niu	49c2e6e23c	dragonball: Remove vhost-net dependency on virtio-net This patch is to remove vhost-net dependency on virtio-net for dbs-virtio-devices crate. Then, the vhost-net feature can be enabled without enabling virtio-net device, error, etc. Fixes: #8423 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-14 15:35:10 +08:00
Fabiano Fidêncio	dffc6f611c	Merge pull request #8432 from justxuewei/rm-ci-docker-and-nerdctl gha: Remove docker and nerdctl tests from ci.yaml	2023-11-14 08:34:18 +01:00
alex.lyn	4d65c2e8a2	runtime-rs: introduce `update_device` in trait Hypervisor Introduce the `update_device` trait in Hypervisor to enable device updates for VMMs. This trait will initially be utilized for virtiofs Mount operations. Fixes: #7915 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-11-14 11:56:36 +08:00
Xuewei Niu	481486c6d5	gha: Remove docker and nerdctl tests from CI Two workflows, run-nerdctl-tests-on-garm.yaml and run-docker-tests-on-garm.yaml, are removed from commit `b481d39`. However, they are referenced by CI workflow. It leads to the CI not working properly. This patch is to remove those files from ci.yaml. Fixes: #8433 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-14 10:44:14 +08:00
Fabiano Fidêncio	c858ea1460	Merge pull request #8174 from fidencio/topic/re-revert-8115 ci: Re-add tracing tests and move docker/nerdctl to the basic-ci-amd64.yaml file	2023-11-13 18:19:40 +01:00
James O. D. Hunt	a781ce33b0	Merge pull request #8383 from jodh-intel/kata-manager-add-list-option utils: kata-manager: Add option to list versions	2023-11-13 16:18:36 +00:00
David Esparza	98ec34b04c	Merge pull request #8338 from dborquez/improve_metrics_init_environment metrics: Fix function that completely stops kata containers before running a test	2023-11-13 09:35:27 -06:00
Fabiano Fidêncio	b481d396fc	gha: Move docker / nerdctl content to the basic-ci-amd64 file There's no need to keep those as separate files, and having those in the basic-ci-amd64.yaml file actually helps us to avoid the undocumented GHA limitation about the number of files imported. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-13 15:34:00 +01:00
Fabiano Fidêncio	3c735c236d	ci: tracing: Adapt to basic-ci-amd64.yaml Peng Tao made this move as part of `1280f85343`, and here we're simply adjusting to the move. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-13 15:27:39 +01:00
Fabiano Fidêncio	ee17fe9d20	Revert "gha: ci: Revert tracing test PR to unbreak CI" This reverts commit `e9bd852113`.	2023-11-13 15:27:39 +01:00
James O. D. Hunt	4d5b23b73a	Merge pull request #8419 from jodh-intel/2023-11-10-fix-tdx runtime-rs: ch: Fix TDX	2023-11-13 11:58:16 +00:00
James O. D. Hunt	7f666f783d	runtime-rs: ch: Fix TDX PR #8311 inadvertently broke the runtime-rs / Cloud Hypervisor TDX handling. It also introduced unrecoverable failure scenarios. Hence, replace slow, fallible regex matching in logging fast path with single pass non-failing multi-string log level matching. Also, added a unit test for `parse_ch_log_level()`. Fixes: #8418. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-13 08:49:47 +00:00
Xuewei Niu	0a9125e629	Merge pull request #7675 from justxuewei/vhost-net	2023-11-12 20:38:18 +08:00
Xuewei Niu	d1deaf0538	dragonball: Minor changes for a comment from Bian - Add feature control for InsertNetworkDevice. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:14:10 +08:00
Xuewei Niu	e4f83e27c4	dragonball: vhost-net set_offload with acked features set_offload() for tap devices depends on acked features. Signed-off-by: Helin Guo <helinguo@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	6cd572dbbb	dragonball: Minor changes for Chao's comments - Remove two panic statements from InsertNetworkDevice test. - Rename `NUM_QUEUES` to `DEFAULT_NUM_QUEUES`, `QUEUE_SIZE` to `DEFAULT_QUEUE_SIZE` for vhost-net and virtio-net. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	dcdf3c6556	runtime-rs: Supply missing fields of NetworkConfig `test_networkconfig_to_netconfig` from clh depends on `NetworkConfig` which has some new fields in this PR. Therefore, this commit gives the test missing fields. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:39 +08:00
Xuewei Niu	58e9709c1f	dragonball: Changes for ZizhengBian's comments - Dragonball's vhost-net feature does not depend on the virtio-net feature. - Remove `TapError` from dbs-virtio-devices's Error, and add two fields, `VirtioNet` and `VhostNet`. - Downgrade the visibility of two fields of `VhostNetDeviceMgr` from `pub(crate)`. - File an issue to record a todo for the network rate limiter. - Print internal errors with `{0:?}`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-12 14:10:33 +08:00
Fabiano Fidêncio	849253e55c	tests: Add a simple test to check the VMM vcpu allocation As we've made some changes to the VMM vcpu allocation, let's introduce basic tests to make sure that we're getting the expected behaviour. The test consists of checking 3 scenarios: * default_vcpus = 0 | no limits set * this should allocate 1 vcpu * default_vcpus = 0.75 | limits set to 0.25 * this should allocate 1 vcpu * default_vcpus = 0.75 | limits set to 1.2 * this should allocate 2 vcpus The tests are very basic, but they do ensure we're rounding things up the way the new logic is supposed to. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 18:26:01 +01:00
Fabiano Fidêncio	5e9cf75937	vc: utils: Rename CalculateMilliCPUs() to CalculateCPUsF() With the change done in the last commit, instead of calculating milli cpus, we're actually converting the CPUs to a fraction number, a float. Let's update the function name (and associated vars) to represent that change. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 18:26:01 +01:00
Fabiano Fidêncio	e477ed0e86	runtime: Improve vCPU allocation for the VMMs First of all, this is a controversial piece, and I know that. In this commit we're trying to take a less greedy approach regarding the amount of vCPUs we allocate for the VMM, which will be advantageous mainly when using the `static_sandbox_resource_mgmt` feature, which is used by the confidential guests. The current approach we have basically does: * Gets the amount of vCPUs set in the config (an integer) * Gets the amount of vCPUs set as limit (an integer) * Sums those up * Starts / Updates the VMM to use that total amount of vCPUs The fact we're dealing with integers is logical, as we cannot request 500m vCPUs from the VMMs. However, it leads us, in several cases, to waste one vCPU. Let's take the example that we know the VMM requires 500m vCPUs to be running, and the workload sets 250m vCPUs as a resource limit. In that case, we'd do: * Gets the amount of vCPUs set in the config: 1 * Gets the amount of vCPUs set as limit: ceil(0.25) * 1 + ceil(0.25) = 1 + 1 = 2 vCPUs * Starts / Updates the VMM to use 2 vCPUs With the logic changed here, what we're doing is considering everything as float till just before we start / update the VMM. So, the flow described above would be: * Gets the amount of vCPUs set in the config: 0.5 * Gets the amount of vCPUs set as limit: 0.25 * ceil(0.5 + 0.25) = 1 vCPU * Starts / Updates the VMM to use 1 vCPU In the way I've written this patch we introduce zero regressions, as the default values set are still the same, and those will only be changed for the TEE use cases (although I can see firecracker, or any other user of `static_sandbox_resource_mgmt=true` taking advantage of this). There's, though, an implicit assumption in this patch that we'd need to make explicit, and that's that the default_vcpus / default_memory is the amount of vcpus / memory required by the VMM, and absolutely nothing else. 
Also, the amount set there should be reflected in the podOverhead for the specific runtime class. One other possible approach, which I am not that much in favour of taking as I think it's less clear, is that we could actually get the podOverhead amount, subtract it from the default_vcpus (treating the result as a float), then sum up what the user set as limit (as a float), and finally ceil the result. It could work, but IMHO this is less clear, and less explicit on what we're actually doing, and how the default_vcpus / default_memory should be used. Fixes: #6909 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>	2023-11-10 18:25:57 +01:00
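The arithmetic in the commit message for `e477ed0e86` can be checked with a short sketch. It is written in Python purely for illustration and is not the runtime's Go code; the function names `old_vcpus` and `new_vcpus` are hypothetical.

```python
import math


def old_vcpus(config_vcpus: int, limit_vcpus: float) -> int:
    """Previous flow as described: the limit is rounded up on its own,
    then added to the integer value from the config."""
    return config_vcpus + math.ceil(limit_vcpus)


def new_vcpus(config_vcpus: float, limit_vcpus: float) -> int:
    """New flow as described: keep both values as floats and round up
    only once, just before the VMM is started or updated."""
    return math.ceil(config_vcpus + limit_vcpus)


# VMM needs 500m vCPUs, workload limit is 250m vCPUs.
print(old_vcpus(1, 0.25))    # 1 + ceil(0.25) = 2
print(new_vcpus(0.5, 0.25))  # ceil(0.5 + 0.25) = 1
```

Rounding once at the end is what saves the extra vCPU in this example; the sketch does not model any minimum or maximum vCPU bounds the runtime may apply.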
Fabiano Fidêncio	8d958b8c47	Merge pull request #8406 from microsoft/danmihai1/policy-doc docs: add agent policy documentation	2023-11-10 17:19:04 +01:00
James O. D. Hunt	f588d31324	Merge pull request #8374 from jodh-intel/kata-manager-check-dl-url-count utils: kata-manager: Ensure only one download URL	2023-11-10 13:19:07 +00:00
Fabiano Fidêncio	b0157ad73a	runtime: confidential: Do not set the max_vcpu to cpu We don't have to do this since we're relying on the `static_sandbox_resource_mgmt` feature, which gives us the correct amount of memory and CPUs to be allocated. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-10 12:58:20 +01:00
Steve Horsman	b23952c852	Merge pull request #8309 from gkurz/update-release-process-doc Update release process documentation	2023-11-10 09:44:18 +00:00
James O. D. Hunt	0ead018d0a	utils: kata-manager: Add Docker details to list output Add Docker version details to the output of the list versions CLI option. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	be3044fd01	utils: kata-manager: Add option to list versions Add a command-line option to list the installed and available versions of Kata and containerd. Fixes: #8355. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	9969f5a94a	utils: kata-manager: Make test container name more unique Rather than creating a container called `test-kata`, prefix with the script name to make it a bit "more unique" and less likely for users to have an existing container with the test container name. The new test container name is `kata-manager-sh-test-kata`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 09:19:56 +00:00
James O. D. Hunt	436d7d1275	utils: kata-manager: Improve usage message Update the usage to show that the latest Kata version can also be queried using `kata-ctl`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:29:14 +00:00
James O. D. Hunt	1625a5ce48	utils: kata-manager: Improve version check Update `github_get_latest_release()` to use `sort -V` rather than sub-sorting on the major, minor and patch level version number elements. The new approach is safer and more accurate. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:29:14 +00:00
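Why a version-aware sort matters can be shown with a small sketch. This is Python for illustration, not the `kata-manager` shell code; `version_key` is a hypothetical helper that only handles plain `x.y.z` strings, whereas `sort -V` copes with more formats.

```python
def version_key(version: str) -> tuple:
    """Turn 'x.y.z' (optionally prefixed with 'v') into a tuple of ints
    so that components compare numerically rather than as text."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


versions = ["3.2.0", "3.10.0", "3.9.1"]

# Plain string sorting compares character by character, so "10" sorts before "2".
lexical_latest = sorted(versions)[-1]

# Sorting on the numeric key gives the order a version sort would produce.
version_latest = sorted(versions, key=version_key)[-1]

print(lexical_latest)  # 3.9.1
print(version_latest)  # 3.10.0
```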
James O. D. Hunt	c72a27e219	utils: kata-manager: Ensure only one download URL Add an extra sanity check to ensure that only a single download URL is found for the specified release version. Fixes: #8364. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:27:23 +00:00
James O. D. Hunt	839f6c3d44	utils: kata-manager: Improve info messages Improve some of the information messages a little by adding more detail and quoting file names. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-10 08:27:20 +00:00
Archana Shinde	21e45bebc8	Merge pull request #8376 from fidencio/topic/kata-manager-add-support-for-docker-installation kata-manager: Add support for Docker CLI installation	2023-11-09 22:11:50 -08:00
Chao Wu	a62fb83c91	Merge pull request #8169 from openanolis/chao/fix_typo_shm runtime-rs: fix a typo in shm	2023-11-10 14:00:11 +08:00
Chao Wu	820b578aa3	Merge pull request #8370 from gaohuatao-1/bugfix agent: update AGENT_THREADS metrics value	2023-11-10 13:16:29 +08:00
gaohuatao	78df1bb851	agent: update AGENT_THREADS metrics value Fixes: #8369 Signed-off-by: gaohuatao <gaohuatao@bytedance.com>	2023-11-10 10:39:57 +08:00
Chao Wu	afb002c25c	runtime-rs: fix a typo in shm is_shim_volume should be is_shm_volume in shm_volume mod. fixes: #8168 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-11-10 10:36:58 +08:00
Fabiano Fidêncio	2b937400fe	Merge pull request #8404 from fidencio/topic/kata-deploy-allow-users-to-enable-hypervisor-annotations kata-deploy: Allow users to set hypervisor annotations	2023-11-09 17:44:52 +01:00
Dan Mihai	bc49c553ef	docs: add agent policy documentation Add initial agent policy documentation. Fixes: #7671 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-11-09 16:43:00 +00:00
Fabiano Fidêncio	5d10aed9ba	kata-manager: Make containerd_config a global var As "/etc/containerd/config.toml" is used from more than one place, let's just make it a global var. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:47:52 +01:00
Fabiano Fidêncio	66d1b2c173	kata-manager: Add support for docker installation Add support for also installing the Docker CLI, giving users the chance to try Kata Containers with docker in the same way we provide users the chance to try Kata Containers with `ctr`. Fixes: #8357 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:47:52 +01:00
Fabiano Fidêncio	1a81989d20	tests: k8s: Use the "ALLOWED_HYPERVISOR_ANNOTATIONS" The current kata-deploy code has been doing a `sed` to add allowed hypervisor annotations, so CBL mariner can be tested with their own kernel and initrd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:42:31 +01:00
Fabiano Fidêncio	023c4a17cf	kata-deploy: Allow users to set hypervisor annotations Currently the only way one can specify allowed hypervisor annotations is during build time, which is a big issue for users grabbing kata-deploy as we provide. Fixes: #8403 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 13:42:31 +01:00
Fabiano Fidêncio	0352f1e029	kata-manager: Allow passing a specific tool to test_installation Right now we're only testing with `ctr` and there's no change in behaviour with this commit. However, allowing to pass a tool to run the tests with gives us an easier time when expanding kata-manager to support, for instance, docker and nerdctl. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 11:24:37 +01:00
Fabiano Fidêncio	50df1129ea	Merge pull request #8411 from fidencio/topic/fix-k3s-deployment gha: Fix regex used to get kubectl version from the k3s version	2023-11-09 10:44:34 +01:00
Fabiano Fidêncio	455b7bf776	gha: k3s: Avoid unnecessary escape There's no reason to escape the first + on the +k3s[0-9]\+ regex, as shown here: ```sh ubuntu@k3s:~$ /usr/local/bin/k3s kubectl version --short 2>/dev/null | \ grep "Client Version" | \ sed \ -e 's/Client Version: //' \ -e 's/+k3s[0-9]\+//' v1.27.7 ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 08:42:25 +01:00
Fabiano Fidêncio	e7890ee8f6	gha: Fix regex used to get kubectl version from the k3s version It seems that with the new k3s release, they've bumped their kubectl version from x.y.z+k3s1 to x.y.z+k3s2. Let's ensure our regexp is more generic and future proof for such changes. Fixes: #8410 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-09 07:08:02 +01:00
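The suffix stripping discussed in these two k3s commits can be sketched as follows. This is an illustrative Python equivalent, not the workflow's `sed` call, and `strip_k3s_suffix` is a hypothetical name. Note that the escaping rules are the reverse of `sed` basic regular expressions: in Python a literal plus must be written `\+`, while a bare `+` means "one or more".

```python
import re

# Literal "+k3s" followed by one or more digits, e.g. "+k3s1" or "+k3s2".
K3S_SUFFIX = re.compile(r"\+k3s[0-9]+")


def strip_k3s_suffix(version: str) -> str:
    """Drop the '+k3sN' build suffix from a k3s-style version string."""
    return K3S_SUFFIX.sub("", version)


print(strip_k3s_suffix("v1.27.7+k3s2"))  # v1.27.7
```

Matching any digit run rather than a fixed `+k3s1` is what keeps the pattern working when the suffix number is bumped.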
Archana Shinde	1611723465	Merge pull request #8379 from likebreath/1103/clh_v36.0 Upgrade to Cloud Hypervisor v36.0	2023-11-08 21:10:41 -08:00
Archana Shinde	268d4d622f	Merge pull request #8389 from justxuewei/vm-capable-test runtime: Fix TestCheckHostIsVMContainerCapable instability issue	2023-11-08 12:14:04 -08:00
Archana Shinde	92a517156c	Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests	2023-11-08 11:45:13 -08:00
Chelsea Mafrica	83e731328f	Merge pull request #8023 from cmaf/runtime-rs-ch-pause-resume runtime-rs: Update status for pause and resume	2023-11-08 11:34:47 -08:00
Hyounggyu Choi	84b5618733	tests|gha: add internal nightly tests for s390x This is to add a workflow for internal nightly tests for s390x in Jenkins. Fixes: #7986 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-11-08 16:07:41 +01:00
Xuewei Niu	acd9057c7b	runtime: Fix TestCheckHostIsVMContainerCapable instability issue TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a case where the kernel modules are not loaded. However, checkKernelModules() executes modprobe <module> if a module is not found in that directory. Loading those modules must be temporarily denied. Fixes: #8390 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 22:40:08 +08:00
Fupan Li	100a73d2fd	Merge pull request #7531 from justxuewei/device-cgroup agent: Restrict device access at upper node of container's cgroup	2023-11-08 22:01:48 +08:00
Chao Wu	4435c1efd7	Merge pull request #8386 from jodh-intel/runtime-rs-ch-tidy-up runtime-rs: ch: Simplify VSOCK error handling	2023-11-08 17:31:40 +08:00
Xuewei Niu	023d8dc01e	agent: Changes according to Pan's comments - Disable device cgroup restriction while pod cgroup is not available. - Remove blacklist-related names and change whitelist-related names to allowed_all. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:08 +08:00
Xuewei Niu	136fb76222	tests: Add an integration test for device cgroup `TestDeviceCgroup` is added to cri-containerd's integration tests. The test launches two containers. Each container has a block device. It checks the validity of the device cgroup. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	b5f3a8cb39	agent: Fix container launching failure with systemd cgroup FSManager of systemd cgroup manager is responsible for setting up the cgroup path. Container launching will fail if the FSManager is in read-only mode. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	6477825195	agent: Minor changes according to Zhou's comments The changes include: - Change to debug logging level for resources after they are processed. - Remove a todo for pod cgroup cleanup. - Add an anyhow context to `get_paths_and_mounts()`. - Remove code which denies access to VMROOTFS since it won't take effect. If blacklist mode is in use, the VMROOTFS will be denied by default. Otherwise, device cgroups won't be updated in whitelist mode. - Add a unit test for `default_allowed_devices()`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	cec8044744	agent: Make devcg_info optional for LinuxContainer::new() runk is a standard OCI runtime that isn't aware of the concept of a sandbox. Therefore, the `devcg_info` argument of `LinuxContainer::new()` does not need to be provided. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
Xuewei Niu	ef4c3844a3	agent: Restrict device access at upper node of container's cgroup The goal is to guarantee that containers cannot escape to access extra devices, such as the VM rootfs. Assume that there is a cgroup, such as `/A/B`. `B` is the container cgroup, and `A` is what we call the pod cgroup. No matter what permissions are set for the container (`B`), the permission of `A` is always `a : rwm`. As a result, containers could acquire permission to access other devices in the VM that do not belong to them. In order to set the devices cgroup properly, the pod cgroup must be set first and the container cgroup after it. The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To avoid setting the container cgroup too early, initialization should be done carefully. `inited`, one of the states, is a boolean that indicates whether the pod cgroup is initialized. If not, the pod cgroup should be created first and given default permissions. After that, the pause container cgroup is created and inherits the permissions from the pod cgroup. If whitelist mode, which allows containers to access all devices in the VM, is enabled, then device resources from the OCI spec are ignored. This feature does not support systemd cgroup and cgroup v2, since: - Systemd cgroup as implemented in the Agent does not support the devices subsystem so far, see: https://github.com/kata-containers/kata-containers/issues/7506. - Cgroup v2's device controller depends on eBPF programs, which is out of scope of cgroup. Fixes: #7507 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-08 09:39:07 +08:00
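The `/A/B` reasoning in the commit message for `ef4c3844a3` rests on the cgroup v1 rule that a child cgroup can never be granted device access its parent denies. The following is a deliberately simplified Python model of that rule, not the agent's Rust code; the device numbers, the `ALL` marker and `effective_devices` are all hypothetical.

```python
ALL = None  # stands in for the catch-all `a : rwm` rule


def effective_devices(pod_allowed, container_allowed, universe):
    """Devices a container can really reach: whatever its own cgroup
    allows, capped by what the parent (pod) cgroup allows."""
    pod = set(universe) if pod_allowed is ALL else set(pod_allowed)
    ctr = set(universe) if container_allowed is ALL else set(container_allowed)
    return pod & ctr


# Hypothetical major:minor numbers for devices inside the guest.
VM_ROOTFS = "254:0"
CONTAINER_DISK = "254:16"
DEV_NULL = "1:3"
universe = {VM_ROOTFS, CONTAINER_DISK, DEV_NULL}

# Pod cgroup left at `a : rwm`: nothing caps a permissive container cgroup.
open_pod = effective_devices(ALL, ALL, universe)

# Pod cgroup restricted first: the VM rootfs stays out of reach
# regardless of what the container cgroup asks for.
restricted_pod = effective_devices({CONTAINER_DISK, DEV_NULL}, ALL, universe)

print(VM_ROOTFS in open_pod)        # True
print(VM_ROOTFS in restricted_pod)  # False
```

The model ignores access modes (read, write, mknod) and the ordering concerns the commit describes; it only illustrates why the restriction has to live at the pod level.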
Archana Shinde	c075fa6817	tests: Add test with nerdctl to verify macvlan support Add test to verify kata supports macvlan networks. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
Archana Shinde	07db673eb9	tests: Add test with nerdctl to verify ipvlan support Add test to verify kata supports ipvlan networks. This test can be a bit tricky as it requires knowledge about host interfaces to be used as a master for the ipvlan network. However, with github actions, we can assume an interface called eth0 is present on the host and functioning. Fixes: #8366 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
Archana Shinde	a6272733e7	network: Fix network hotplug for ipvlan and macvlan endpoints. Since moving from network coldplug to hotplug, the only case verified was veth endpoints. Support for network hotplug for ipvlan and macvlan was broken/not added. Fix it. Fixes: #8391 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-07 10:13:51 -08:00
James O. D. Hunt	59d0d4caff	runtime-rs: ch: Simplify VSOCK error handling Remove the redundant `VmConfigError::EmptyVsockSocketPath` error from the Cloud Hypervisor config crate since this scenario is already handled by the `VsockConfigError::NoVsockSocketPath` error. Fixes: #8385. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-07 17:45:38 +00:00
James O. D. Hunt	bdb83f8282	runtime-rs: ch: Remove unused function Remove the redundant `parse_mac()` function: this was never used and we already have an implementation in `crates/resource/src/network/utils/mod.rs`. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-11-07 17:45:38 +00:00
Wainer Moschetta	949ac4d810	Merge pull request #8217 from beraldoleal/issues/8216 tests: fixes permission denied when running test	2023-11-07 12:25:23 -03:00
Wainer Moschetta	7f5d70f48b	Merge pull request #8061 from beraldoleal/gogo-removal-v3 Updating containerd to a GogoProtobuf free version	2023-11-07 12:18:50 -03:00
Xuewei Niu	8ea87405ed	runtime-rs: Remove virtio config from Backend Virtio-net and vhost-net share a common virtio config, and vhost-user-net uses another config, named `VhostUserConfig`. Thus, the virtio config could be added into `NetworkConfig` instead of `Backend`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	ad66378bf5	runtime-rs: Move Dragonball stuff out of device drivers Move Dragonball struct conversions out of device drivers to keep the drivers neutral. The conversions include `NetworkBackend` to `DragonballNetworkBackend` and `NetworkConfig` to `DragonballNetworkConfig`. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	3e0614cdf0	dragonball: Minor changes to comments Changes include: - Merge `VhostNetDeviceError` import item. - Replace if with match in `add_vhost_net_device()` Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	a047331a34	runtime-rs: Network config distinguishes backends Network backends determine the virtio dataplane implementations. Common protocols include virtio-net, vhost-net and vhost-user-net, etc. Network config has a new field named `backend` to specify which protocol to use. Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Xuewei Niu	9203371833	dragonball: Introduce vhost-net device PLEASE NOTE THAT this pull request just implements vhost-net support for Dragonball, and the adaptation for Runtime-rs. And this pull request DOESN'T provide an item to configure which backend to use. To sum up, virtio-net as the default backend is the only choice for the user so far. This pull request introduces a vhost-net device for Dragonball. In addition, this pull request includes changes to Runtime-rs to improve network configuration abilities. The Dragonball part implements a vhost-net device and a vhost-net device manager, named `VhostNetDeviceMgr`, to manage vhost-net devices. `NetworkInterfaceConfig` is introduced as a high-level abstraction for network config. Then, Dragonball is able to distinguish network backends, e.g. virtio-net, vhost-net, vhost-user-net(WIP), etc. The Runtime-rs part adds support for multiple network backends as well. `NetworkConfig` has a couple of new fields, like `backend`, `use_shared_irq`, etc. And Dragonball's network config structs implement the `From` trait, which allows them to be converted conveniently from Runtime-rs's network config. Fixes: #7674 Signed-off-by: Eric Ren <renzhen@linux.alibaba.com> Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-11-07 19:35:02 +08:00
Greg Kurz	b27b4ce104	doc: No longer release the test repository Now that most of the test repository got migrated to the main Kata repository, it is no longer needed to tag the test repository when doing a release. Update the documentation accordingly by dropping all references to the test repository and only mention the Kata repository. Fixes #8302 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:28:43 +01:00
Greg Kurz	af2d897fb1	doc: Release now uses the official GitHub CLI The hub tool is deprecated. Releases are now based on the official gh CLI. A notable improvement: when properly set up (see [1]), gh allows using HTTPS directly with one's GitHub credentials, instead of having to set up proper SSH access for pushes to the repo. Adjust the documentation accordingly. Fixes #8302 [1] https://docs.github.com/en/github-cli/github-cli/quickstart#prerequisites Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:22:54 +01:00
Greg Kurz	2af9419fa4	doc: No longer run kata-deploy test when releasing This is already tested by CI for every PR. Drop this step from the release process documentation. Fixes #8302 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-11-07 10:19:32 +01:00
Beraldo Leal	dd530ba8ee	tests: fixes AMD errors TestCheckHostIsVMContainerCapable is failing on AMD machines. kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is missing. Fixes #8384. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:59 +00:00
Beraldo Leal	7641c19f74	runtime: bump containerd for gogo deprecation This update includes necessary changes due to the version bump of containerd and its dependencies. It's part of a broader initiative to phase out gogo protobuf, which has been deprecated, and to align with the current supported libraries. Fixes #7420. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:59 +00:00
Beraldo Leal	16fa2c39e6	protocols: replace gogo/types.Empty and Any by Google versions. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c61f4a8592	protocols: remove unused fieldpath option The +fieldpath option, specific to gogoprotobuf, enabled dynamic field access in protobuf messages, allowing nested fields to be accessed via string paths. This change is part of a larger effort to transition to the official Go protobuf library for better maintainability and community support. Upon review, no instances of dynamic field access were found in the codebase, confirming that the feature is not in use. By removing this unused feature, we simplify the build process and make it easier to complete the transition away from gogoprotobuf. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c87bc60ea0	protocols: removing unused mappings Those mappings are not used by our .proto files and there is no difference between .pb.go files generated. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	c5d845b30a	agent: updating Cargo.lock files Probably previous changes missed updating Cargo.lock. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
Beraldo Leal	5d88c78a6e	protocols: generating agent.pb.go `a3b003c345` modified agent but agent.pb.go was not updated. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-11-06 16:49:58 +00:00
David Esparza	28e7b3467b	metrics: improving stop and remove running containers This PR switches from the SIGTERM signal to SIGKILL to force stop each kata component before running any metric test. Fixes: #8336 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-11-06 09:54:32 -06:00
Archana Shinde	3b2fb6a604	Merge pull request #8284 from amshinde/runtime-rs-update-device-pci-info runtime-rs: update device pci info for vfio and virtio-blk devices	2023-11-06 01:09:20 -08:00
Archana Shinde	036b7787dd	runtime-rs: Use PCI path from hypervisor for vfio devices Remove earlier functionality that tries to assign PCI path to vfio devices from the host assuming pci slots to start from 1. Get this from the hypervisor instead. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Archana Shinde	c3ce6a1d15	runtime-rs: Provide PCI path to the agent for virtio-block If the PCI path for a block device is not empty, use that as the identifier for the agent instead of the virt path, which is valid only for mmio devices. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Archana Shinde	a2bbbad711	runtime-rs: change hypervisor add_device trait to return device copy Block(virtio-blk) and vfio devices are currently not handled correctly by the agent as the agent is not provided with correct PCI paths for these devices. The PCI paths for these devices can be inferred from the PCI information provided by the hypervisor when the device is added. Hence changing the add_device trait function to return a device copy with PCI info potentially provided by the hypervisor. This can then be provided to the agent to correctly detect devices within the VM. This commit includes implementation for PCI info update for cloud-hypervisor for virtio-blk devices with stubs provided for other hypervisors. Removing Vsock from the DeviceType enum as Vsock currently does not implement the Device Trait, it has no attach and detach trait functions among others. Part of the reason is because these functions require Vsock to implement Clone trait as these functions need cloned copies to be passed down to the hypervisor. The change introduced for returning a device copy from the add_device hypervisor trait explicitly requires a device to implement Copy trait. Hence removing Vsock from the DeviceType enum for now, as its implementation is incomplete and not currently used. Note, one of the blockers for adding the Clone trait to Vsock is that it currently includes a file handle which cannot be cloned. For Clone and Device Traits to be implemented for Vsock, it requires an implementation change in the future for it to be cloneable. Fixes: #8283 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-05 21:59:44 -08:00
Bo Chen	071667f1ca	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v35.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8378 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-11-03 10:47:06 -07:00
Bo Chen	d1163141b9	versions: Upgrade to Cloud Hypervisor v36.0 Details of this release can be found in our roadmap project as iteration v36.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #8378 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-11-03 10:46:56 -07:00
Fabiano Fidêncio	0aac3c76ee	Merge pull request #8365 from fidencio/topic/kata-manager-restrict-containerd-versions-to-be-used kata-manager: Accept only "lts" or "active" as containerd versions	2023-11-03 11:54:05 +01:00
Fabiano Fidêncio	8b4fc847d7	kata-manager: Accept only "lts" or "active" as containerd versions kata-manager is a very nice tool, but we shouldn't be trying to take care of "everything" in "all possible scenarios", and we should focus on installing Kata Containers dependencies that are supported. With this in mind, let's limit a little bit the scope of which versions of containerd can be installed, limiting it to "active" and "lts", which will then install the latest version of those "flavours". The default value will always be "lts" as that's supposed to be the stable one. NOTE: This is a breaking change, as it changes the behaviour of what the script takes in its `-c` parameter. I'm assuming here we're safe to do so as the majority of the users should / would only be using the full installation by default. Fixes: #8356 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 10:30:37 +01:00
Fabiano Fidêncio	d395ae8198	Merge pull request #8368 from fidencio/topic/gha-stale-fixes gha: stale: Fix typo and allow manually triggering it	2023-11-03 10:07:56 +01:00
Fabiano Fidêncio	994615ca28	gha: stale: Allow manually triggering it This will help us to avoid waiting till the next time cron would trigger the action to test Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 08:17:48 +01:00
Fabiano Fidêncio	6abcf03611	gha: stale: Fix typo action -> actions This is causing the following error: ``` Unable to resolve action action/stale, repository not found ``` Fixes: #8347 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-03 08:15:18 +01:00
Steve Horsman	a7a14e33d8	Merge pull request #8285 from sazzy4o/patch-1 Docs: Fix Dragonball link	2023-11-02 17:54:47 +00:00
Fabiano Fidêncio	37233622da	kata-manager: Ensure we run apt-get update before apt-get install As that's an operation that can easily fail, and it's quite simple / cheap for us to run it, let's just do it and avoid the failure. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-02 14:14:32 +01:00
Fabiano Fidêncio	d547798284	Merge pull request #7057 from brianwang12/kata-manager-fix kata-manager: Fix deployment of containerd on architectures other than amd64.	2023-11-02 14:14:18 +01:00
Fabiano Fidêncio	8905286767	Merge pull request #8348 from fidencio/topic/gha-add-stale-action-for-PRs gha: Add workflow to close stale PRs	2023-11-02 11:34:35 +01:00
Fabiano Fidêncio	abec287058	gha: Add workflow to close stale PRs Our goal, as discussed in the Architecture Committee meeting held on October 31st, 2023, is to take a more aggressive action on issues and PRs that have been opened for a long time. This commit is the very first step, and it's only targeting PRs. What this action will do is: * Mark all the PRs that have no activity for more than 180 days, starting from May 1st, 2023, as stale. * A message will be added, letting the contributor know that they can simply comment on the PR in order to make it "not stale". * If there's no activity on the PR for 7 days, the PR will be automatically closed. Fixes: #8347 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-11-02 09:19:44 +01:00
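The stale-PR policy described for `abec287058` can be sketched as a small decision function. This is Python for illustration only, not the GitHub Actions workflow; `pr_state` is a hypothetical name, the sketch assumes the check runs daily so a PR is marked stale as soon as it crosses 180 idle days, and it does not model the May 1st, 2023 starting point.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=180)  # no activity for more than this: stale
CLOSE_AFTER = timedelta(days=7)    # further idle days after being marked stale


def pr_state(last_activity: date, today: date) -> str:
    """Classify a PR by how long it has been idle (simplified model)."""
    idle = today - last_activity
    if idle > STALE_AFTER + CLOSE_AFTER:
        return "closed"
    if idle > STALE_AFTER:
        return "stale"
    return "active"


today = date(2023, 11, 2)
print(pr_state(today - timedelta(days=30), today))   # active
print(pr_state(today - timedelta(days=185), today))  # stale
print(pr_state(today - timedelta(days=300), today))  # closed
```

Any comment on the PR resets `last_activity`, which is how a contributor makes a PR "not stale" again.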
briwan.wang	437db15916	kata-manager: Fix Mulit-Arch deployment for containerd Fix: Kata-Manager fails to retrieve the correct Containerd string name for architectures other than amd64. Update the 'github_get_release_file_url()' function to make it compatible with different architecture expressions. eg. aarch64/arm64, or x86_64/amd64, allowing it to acquire the correct URL addresses Fixes: #7071 Signed-off-by: briwan.wang <briwan.wang@arm.com>	2023-11-02 06:12:04 +00:00
Archana Shinde	004646162e	Merge pull request #8308 from gkurz/fully-drop-hub release: Fully migrate from hub to gh	2023-11-01 22:46:44 -07:00
Peng Tao	b3dbd4f1c7	Merge pull request #8351 from amshinde/update-agent-cargo-lock cargo: Agent cargo.lock updated	2023-11-02 11:31:24 +08:00
Archana Shinde	58b4d1a264	cargo: Agent cargo.lock updated The Cargo.lock for agent needs to be updated to include "safe-path" dependency. Fixes: #8350 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-11-01 11:54:33 -07:00
Fabiano Fidêncio	40cc397218	Merge pull request #8255 from cmaf/migrate-checks-fixes-links docs: Fix broken links	2023-11-01 14:46:30 +01:00
Beraldo Leal	afec54799e	libs: fixes dereferenced reference make check is giving us the following error: error: this expression creates a reference which is immediately dereferenced by the compiler. Fixes #8344 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-31 15:55:32 -04:00
Beraldo Leal	c57df607ad	libs: fixes comparison to empty slice Make check gives us an "error: comparison to empty slice". Fixes #8343 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-31 15:51:03 -04:00
Greg Kurz	d20b7381f0	release: Drop obsolete comment in workflow file This comment belongs to the hub tool that got sunset by `710eb8ab9d`. Just drop it. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 16:03:12 +01:00
Greg Kurz	6236fa4617	release: Drop build_hub helper Not used anymore. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:28:57 +01:00
Greg Kurz	bc4c66caaf	release: Migrate tag_repos.sh to GitHub CLI The hub tool is deprecated. Convert this script to use the official GitHub CLI gh instead of hub. A typical gh setup is able to access repos using HTTPS along with GitHub credentials. It is only needed to patch the remote url when using SSH. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:11:28 +01:00
Greg Kurz	e331102ba3	release: Migrate update-repository-version.sh to GitHub CLI The hub tool is deprecated. Convert this script to use the official GitHub CLI gh instead of hub. A couple of adjustments had to be made: - the notes.md temporary file is moved to ${tmp_dir} in order to silence gh, which otherwise complains about an untracked file, - the title of a PR no longer goes to the notes.md file since gh requires the title to be passed with a dedicated --title option. Fixes #8303 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:10:50 +01:00
Greg Kurz	b83a7149ee	release: Introduce helper to get GitHub CLI If gh isn't installed already, download it from GitHub. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 15:09:24 +01:00
Fabiano Fidêncio	53cda12a71	Merge pull request #8311 from TimePrinciple/log-system-enhancement runtime-rs: Log system enhancement	2023-10-31 10:14:41 +01:00
Greg Kurz	ceeabe3714	release: Allow to test release scripts with an alternate repo We don't want to mess with the official repo when testing a change in the release scripts. Adapt `update-repository-version.sh` to be able to use an alternate repo just like `tag_repos.sh` already does. This means that the following command: $ OWNER="$SOME_ORG" ./update-repository-version.sh -p "$NEW_VERSION" "$BRANCH" will only create a PR in this repo: http://github.com/$SOME_ORG/kata-containers.git Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-31 09:49:27 +01:00
Archana Shinde	148c565b2f	Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x agent: Skip flaky create_tmpfs on s390x	2023-10-30 22:26:28 -07:00
Ruoqing He	4ad2cfe0c2	runtime-rs: Log system enhancement Modify the RuntimeLevelFilter drain to improve logging control: isolate the effect of logger changes between components, and log clh messages according to the log levels given by cloud-hypervisor. Fixes: #8310 Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>	2023-10-31 04:57:46 +00:00
David Esparza	2a17d3889e	Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix network: Fix network attach for ipvlan and macvlan	2023-10-30 16:00:32 -06:00
David Esparza	5573705800	Merge pull request #8202 from dborquez/enable_fio_checkmetrics Enable fio checkmetrics	2023-10-30 15:55:37 -06:00
David Esparza	c232869af9	metrics: removes double-quotes in checkmetrics when parsing results This PR removes double quotes in jq output to return raw strings as input of checkmetrics tool. Fixes: #8331 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:43:03 -06:00
David Esparza	c42a2f2eda	metrics: increase the number of attempts to stop kata This PR increases the number of attempts to stop kata components when required, usually before starting a metrics test. Fixes: #8307 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:43:03 -06:00
David Esparza	1626253d9e	metrics: FIO ci test enablement This PR enables the new FIO test based on the containerd client which is used to track the I/O metrics in the kata-ci environment. Additionally this PR fixes the parsing of results. Fixes: #8199 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 09:42:54 -06:00
David Esparza	873386a349	metrics: update iodepth and job size fio parameters to improve workload This PR updates the values of the fio parameters for iodepth requests and for the number of jobs, in order to increase the number of sequential operations. Additionally, it adds the list of packages needed to parse the results. Fixes: #8198 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-30 08:43:06 -06:00
James O. D. Hunt	d93275224b	Merge pull request #8323 from jodh-intel/utils-kata-manager-fix-version-checks utils: kata manager: Fix version checks	2023-10-30 12:25:51 +00:00
Chao Wu	7d26604061	Merge pull request #7831 from lisongqian/feat/dragonball_trace dragonball: add tracing feature for dragonball	2023-10-30 17:27:30 +08:00
James O. D. Hunt	d7e410ad2b	Merge pull request #8314 from jodh-intel/kata-ctl-show-confidential-guest kata-runtime/kata-ctl: Add security details to output	2023-10-30 07:41:22 +00:00
Songqian Li	2f533c3003	dragonball: add tracing feature for dragonball This PR adds the tracing capability for dragonball and it depends on the tracing::Subscriber of the upper layer. Fixes: #7249 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-28 19:52:24 +08:00
Chao Wu	f1f4410537	Merge pull request #7695 from lisongqian/feat/legacy_metrics dragonball: add metrics support for legacy device	2023-10-28 16:48:57 +08:00
Archana Shinde	f53f86884f	network: Fix network attach for ipvlan and macvlan We used the approach of cold-plugging network interface for pre-shimv2 support for docker. Since the hotplug approach was not required, we never really got to implementing hotplug support for certain network endpoints, ipvlan and macvlan being among them. Since moving to shimv2 interface as the default for runtime, we switched to hotplugging the network interface for supporting docker and nerdctl. This was done for veth endpoints only. Implement the hot-attach APIs for ipvlan and macvlan as well to support ipvlan and macvlan networks with docker and nerdctl. Fixes: #8333 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-27 21:42:37 -07:00
Peng Tao	52a014d9cd	Merge pull request #8033 from h56983577/6715/shared-mount agent: use open_tree()/move_mount() to set up bind mounts between containers directly.	2023-10-28 10:57:34 +08:00
Songqian Li	da77b19449	dragonball: output legacy device metrics to runtime Legacy device manager adds device metrics to METRICS when a device is created and removes metrics when a device is dropped. Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-27 14:09:42 +08:00
Songqian Li	65213e9fbe	dragonball: unify the metric interface of legacy device Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-27 14:09:42 +08:00
Chao Wu	b508091305	Merge pull request #8322 from wainersm/git_helper-fix tests/git-helper: cancel any previous rebase left halfway	2023-10-27 14:07:16 +08:00
Spencer von der Ohe	fee97e219c	docs: Fix Dragonball link Update dragonball link to be the current repo (from archived repo) Fixes #8324 Signed-off-by: Spencer von der Ohe <s.vonderohe40@gmail.com>	2023-10-26 21:12:31 -06:00
Archana Shinde	f5c17f89a3	Merge pull request #8250 from amshinde/runtime-rs-clh-config runtime-rs: Add default configuration file for cloud-hypervisor	2023-10-26 14:54:47 -07:00
Chelsea Mafrica	0608e20a01	docs: Fix broken links Update broken links so that static checks pass. Fixes #8254 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-10-26 10:17:01 -07:00
Chelsea Mafrica	4ede63fa4d	Merge pull request #8317 from cmaf/gha-spellcheck-reqs gha: add dependencies for spell checker	2023-10-26 10:11:26 -07:00
James O. D. Hunt	ae3ea1421d	utils: kata-manager: Fix containerd version check Containerd release files include the version number without a "v" prefix. However, the tag for the equivalent release does include it, so handle this distinction and also tighten up the Kata check by specifying an explicit version number in the regex. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 16:34:56 +01:00
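The "v" prefix distinction described in `ae3ea1421d` can be illustrated with a minimal sketch. The helper names `strip_v` and `same_version` are hypothetical and are not taken from `kata-manager.sh`; the version numbers are made-up examples.

```shell
#!/usr/bin/env bash

# Hypothetical helper: drop a single leading "v" so that a release tag
# (e.g. "v1.7.8") can be compared with the version embedded in a release
# file name (e.g. "1.7.8").
strip_v() {
    echo "${1#v}"
}

# Hypothetical helper: succeed when two version strings are equal once
# any leading "v" has been removed from both.
same_version() {
    [ "$(strip_v "$1")" = "$(strip_v "$2")" ]
}
```

For example, `same_version v1.7.8 1.7.8` succeeds while `same_version v1.7.8 1.7.9` fails.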
James O. D. Hunt	346f195532	utils: kata-manager: Fix whitespace Use tabs consistently. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 16:06:51 +01:00
Wainer dos Santos Moschetta	0ce0abffa6	tests/git-helper: cancel any previous rebase left halfway In bare-metal machines the git tree might get into an unstable state with the previous rebase left halfway. So let's attempt to abort any rebase first. Fixes #8318 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-26 11:50:12 -03:00
James O. D. Hunt	2ac7ac1dd2	utils: kata-manager: Fix "Cannot determine download URL" issue The archive names for x86_64 [Kata releases](https://github.com/kata-containers/kata-containers/releases) used to include the tag `x86_64`, but that has now been changed to `amd64`, which unfortunately broke `kata-manager.sh`: ``` kata-static-3.1.3-x86_64.tar.xz ~~~~~~ expected kata-static-3.2.0-alpha3-x86_64.tar.xz ~~~~~~ expected kata-static-3.2.0-alpha4-amd64.tar.xz ~~~~~ changed ``` Fixes: #8321. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 15:27:37 +01:00
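The architecture naming problem behind `2ac7ac1dd2` (and `437db15916` for containerd) can be sketched as follows: accept either spelling of an architecture when matching a release archive name. The functions `arch_regex` and `matches_arch` are hypothetical illustrations, not the code in `kata-manager.sh`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map an architecture name to a regex alternation
# covering both common spellings. Unknown names are returned unchanged.
arch_regex() {
    local arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64|amd64)  echo '(x86_64|amd64)' ;;
        aarch64|arm64) echo '(aarch64|arm64)' ;;
        *)             echo "$arch" ;;
    esac
}

# Hypothetical helper: succeed when the archive name ends in
# "-<arch>.tar.xz" for either spelling of the given architecture.
matches_arch() {
    local file="$1" arch="$2"
    echo "$file" | grep -Eq -- "-$(arch_regex "$arch")\.tar\.xz$"
}
```

With this sketch, both `kata-static-3.1.3-x86_64.tar.xz` and `kata-static-3.2.0-alpha4-amd64.tar.xz` match when the requested architecture is `x86_64` or `amd64`.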
James O. D. Hunt	59bd534827	utils: kata-manager: Lint fixes Improve the code by fixing some lint issues: - defining variables before using them. - Using `grep -E` rather than `egrep`. - Quoting variables. - Adding a check for invalid CLI arguments. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-26 15:24:46 +01:00
HanZiyao	a3b003c345	agent: support bind mounts between containers This feature supports creating bind mounts directly between containers through annotations. Fixes: #6715 Signed-off-by: HanZiyao <h56983577@126.com>	2023-10-26 16:34:50 +08:00
Archana Shinde	1b8ec08278	Merge pull request #8281 from amshinde/add-clh-config-kata-manager kata-manager: Add clh config to containerd config file	2023-10-25 13:44:53 -07:00
Chelsea Mafrica	c20aadd7a8	gha: add dependencies for spell checker In the migration from the tests repo to the kata containers repo we missed two hunspell dictionaries for static checks; add them. Fixes #8315 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-10-25 12:49:09 -07:00
James O. D. Hunt	d707fa2c0d	kata-runtime/kata-ctl: Add security details to output Add the hypervisor security details to the output of the `kata-runtime env` and `kata-ctl env` commands so the user can see, amongst other things, the value of `confidential_guest`. Fixes: #8313. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-25 16:34:42 +01:00
Chao Wu	29d863350f	Merge pull request #7697 from lisongqian/feat/balloon_metrics dragonball: add metrics support for balloon device	2023-10-25 02:42:14 -05:00
Fabiano Fidêncio	328ba0da99	Merge pull request #7647 from jongwu/use_pcie_virt AArch64: runtime: use pcie root port to do pci/pcie device hotplug	2023-10-25 09:17:13 +02:00
Archana Shinde	f99de4d5a1	runtime-rs: Make default kernel params as empty The default kernel params passed to any hypervisor except dragonball are empty. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-24 15:50:12 -07:00
Archana Shinde	a813012785	runtime-rs: Add default configuration file for cloud-hypervisor The config template file for clh is in the new format for runtime-rs. It is a result of merging the new format file and options supported by cloud-hypervisor. Some config options from the golang runtime are missing as they may not be currently supported by the rust runtime. Examples are the selinux and rate limiting options, which are not currently supported or verified with the rust runtime. Fixes: #8249 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-24 15:17:24 -07:00
Chao Wu	43675bd485	Merge pull request #8294 from ZizhengBian/jason/for-master runtime-rs: fix a typo in device manager	2023-10-24 04:52:04 -05:00
Songqian Li	dce365d5b4	dragonball: add conditional compilation for BalloonDeviceMetrics Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-24 13:33:39 +08:00
GabyCT	4c3a664358	Merge pull request #8278 from GabyCT/topic/udpparallel metrics: Add parallel udp iperf3 benchmark	2023-10-23 10:30:53 -06:00
Fabiano Fidêncio	a001021721	Merge pull request #8292 from fidencio/topic/release-ensure-gh-is-used-from-a-git-repo release: Always use actions/checkout to ensure we're in a git repo	2023-10-23 15:16:12 +02:00
Songqian Li	3819f0ee6f	dragonball: output balloon device metrics to runtime Balloon device manager adds balloon device metrics to METRICS when a device is created and removes metrics when a device is dropped. Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-23 21:15:22 +08:00
Zizheng Bian	7d7c25c1d6	runtime-rs: fix a typo in device manager Fixes: #8293 Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>	2023-10-23 20:33:47 +08:00
Fabiano Fidêncio	c5cfad7023	actions: Move all the checkout actions to v4 It's been released for a while now, and we need to keep consistency between what we use. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 14:01:53 +02:00
Fabiano Fidêncio	b32c6bf805	release: Always use actions/checkout to ensure we're in a git repo Otherwise we'll face issues like: ``` Run tag=$(echo $GITHUB_REF | cut -d/ -f3-) tag=$(echo $GITHUB_REF | cut -d/ -f3-) tarball="kata-static-$tag-amd64.tar.xz" mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}" pushd $GITHUB_WORKSPACE echo "uploading asset '${tarball}' for tag: ${tag}" GITHUB_TOKEN=*** gh release upload "${tag}" "${tarball}" popd shell: /usr/bin/bash -e {0} ~/work/kata-containers/kata-containers ~/work/kata-containers/kata-containers uploading asset 'kata-static-3.3.0-alpha0-amd64.tar.xz' for tag: 3.3.0-alpha0 failed to run git: fatal: not a git repository (or any of the parent directories): .git ``` Fixes: #8286 (or better, just a follow up of that) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 14:00:39 +02:00
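The workflow step quoted in `b32c6bf805` derives the release tag from `GITHUB_REF` and builds the asset name from it. A minimal sketch of those two steps, plus a guard for the "not a git repository" failure, is shown here. The function names are hypothetical and the guard is an illustrative assumption; the commit itself fixes the problem by always running `actions/checkout`.

```shell
#!/usr/bin/env bash

# Derive the tag from a ref such as "refs/tags/3.3.0-alpha0" by keeping
# everything from the third "/"-separated field onwards, as the quoted
# workflow step does with `cut -d/ -f3-`.
tag_from_ref() {
    echo "$1" | cut -d/ -f3-
}

# Build the amd64 tarball name for a tag, following the naming used in
# the quoted log output.
tarball_for_tag() {
    echo "kata-static-$1-amd64.tar.xz"
}

# Hypothetical guard: succeed only when the given directory (default:
# the current one) is inside a git work tree. The quoted log shows
# `gh release upload` failing when run outside one.
in_git_repo() {
    git -C "${1:-.}" rev-parse --is-inside-work-tree >/dev/null 2>&1
}
```

A caller could check `in_git_repo "$GITHUB_WORKSPACE"` before invoking `gh`, and report a clearer error if the checkout step was skipped.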
Fabiano Fidêncio	8fe88696c0	Merge pull request #8287 from fidencio/topic/release-use-gh-cli-instead-of-hub actions: release: Use GH cli instead of hub	2023-10-23 12:40:22 +02:00
Hyounggyu Choi	a0746c8d7b	agent: Skip flaky create_tmpfs on s390x This is to skip a flaky test `create_tmpfs()` on s390x until a root cause is identified and fixed. Fixes: #4248 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-10-23 11:22:14 +02:00
Fabiano Fidêncio	710eb8ab9d	actions: release: Use GH cli instead of hub hub is now deprecated, which has been causing issues with our release process. Let's move to the GH cli (https://cli.github.com/manual), and unblock this release. NOTE: This commit is purposefully not touching anywhere else hub is used, as that would require more time and investigation to do the switch, and right now we just want to unblock the release. Fixes: #8286 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-23 08:49:55 +02:00
Fabiano Fidêncio	74d4865189	Merge pull request #8275 from fidencio/topic/ci-adapt-kata-deploy-regex-on-repo-version-update release: Adapt the CIs using the kata-deploy image	2023-10-23 00:37:19 +02:00
Archana Shinde	d3250dff34	kata-manager: Add clh config to containerd config file kata-manager currently adds default config which currently is qemu. Add config for clh as well to containerd configuration. This should allow new users to get started with clh using kata-manager. Also add config related to enabling privileged_without_host_devices. Always good to have this config enabled when users try to run privileged containers so that devices from host are not inadvertently passed to the guest. Fixes: #8280 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-20 18:16:16 -07:00
Gabriela Cervantes	2d0518cbe6	metrics: Add parallel udp iperf3 benchmark This PR adds the parallel udp iperf3 benchmark for network metrics. Fixes #8277 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-20 19:54:06 +00:00
Dan Mihai	732fe163f3	Merge pull request #8229 from microsoft/danmihai1/no-config-toml-endpoints agent: no endpoint blocking from agent-config.toml	2023-10-20 11:30:43 -07:00
Fabiano Fidêncio	026f6a1a4c	release: Adapt the CIs using the kata-deploy image This is needed in order to properly run the CIs in branches that are not the main one, as the kata-deploy.yaml file on those branches does not have the `latest` tag, but rather the latest stable release. Fixes: #8274 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 18:59:14 +02:00
Fabiano Fidêncio	124f498830	Merge pull request #8266 from fidencio/3.3.0-alpha0-branch-bump # Kata Containers 3.3.0-alpha0	2023-10-20 17:40:44 +02:00
GabyCT	8486283012	Merge pull request #8247 from GabyCT/topic/iperfudp metrics: Add iperf udp benchmark	2023-10-20 09:21:37 -06:00
Fabiano Fidêncio	0fb69ddf6a	release: Kata Containers 3.3.0-alpha0 - kata-deploy-stable: Switch to using the ubuntu based payload - libs: protection: Fix typo in TDX output - ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat - tests: Enable agent stability test - docs: Fix paths to build kernel in SNP VMs documentation - runtime-rs: ch: Add TDX CH features check - runtime: Validate hypervisor section name in config file - tests: query data from the OPA service - release: tag_repos: Stop tagging the `tests` repo - metrics: fixes common.sh function to always return true - Memory footprint test removing trailing commas to make json results file valid - policy: allow access to ReseedRandomDev - runtime/kata-ctl: update dependencies - runtime-rs : fix Nydus support for runtime-rs + Dragonball - metrics: removal of reference in the documentation to the fio dax subtest. - runtime-rs: ch: Detect Intel TDX version - runitme-rs: use the same base64 as kata-runtime/direct-volume does - tests: Enable scability test for stability CI - runtime-rs: Add support for adding vfio device for cloud-hypervisor - tests: Enable soak parallel stability test - dragonball: vcpu metrics change to be recorded per vcpu - ci: k8s: adapt gha-run.sh to run locally - metrics: removes kata components and k8s deployment when test finishes - GHA: fix up referenced yaml exceeding 20 limit problem - gha: ci: Revert tracing test PR to unbreak CI - runtime-rs: ch: Enable feature - gha: ci: Port runk tests over - ci: gha: Port tracing tests over - Enable fio test using containerd client - gha: Add stability tests workflow for gha - gha: arm64: Ensure the builder is arm64-builder - kata-deploy: Build kata-agent as we build all the other components - versions: migrate out of k8s.gcr.io - doc: Update crictl pod-config - gha: Fix k0s deployment - tests: Add stability test for kata CI - docs: Update url in kata vra document - gpu: Adding CDI support for cold and hot-plug of VFIO devices - 
kata-deploy: build & ship the rust components from src/tools/ - metrics: Add latency value limits for kata CI - runtime: fix reading cgroup stats of sandboxes - Upgrade to Cloud Hypervisor v35.0 - ci: Port kata-monitor tests from Jenkins to GHA - metrics: Fix latency yamls path - metrics: Fix metrics README - metrics: Fix C-Ray documentation - runtime-rs: ch: Enable Intel TDX - ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI - metrics: Enable latency test in gha run script - local-build: Fix .docker ownership before build-payload - runtime-rs: Add network support for cloud-hypervisor - osbuild: Reduce guest components binary size with strip - gha: Add pandoc as a dependency for static checks - ci: rootfs-image build-asset is failing - feat(runtime-rs): introduce huge page mode to select VM RAM's backend - clh: Direct IO support for block devices - gha: Install hunspell for static checks - ci: Trigger payload-after-push on workflow_dispatch - ci: Actually enable the CRI-O tests - protocol: remove gogoprotobuff tests - ci: k8s: Also run tests with CRI-O - runtime: support kernel params including spaces - ci: kata-deploy: Fix runner name - metrics: Enable parallel bandwidth iperf limit - ci: kata-deploy: Enable all k8s flavours that we support - ci: Create clusters in individual resource groups - versions: Bump virtiofsd to v1.8.0 - clh: arm: Use static_sandbox_resource_mgmt=true - Bump nydus versions and update nydus tests - runtime/qemu: Rework QMP/HMP support - clh:arm64: use arm AMBA UART for hypervisor debug - ci: Use variable size of VMs depending on the tests running - ci: Rework static checks - runtime: incorrect handling of non-empty []Endpoint parameter in Remo… - ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage - ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component} - ci: Run some of the GARM tests in smaller instances - ci: Reduce the size of the AKS VMs - ci: 
cache: Allow pushing our artefacts to an OCI registry - metrics: Add iperf value for cpu utilization - ci: cache: Export env vars needed to use ORAS - gha: vfio: Import test script - tests: fix kernel and initrd annotations - metrics: Add iperf bandwidth value for kata metrics - metrics: Add Cassandra Metrics documentation - metrics: Remove warning from metrics documentation - ci: docker: nerdctl: Switch to tcp port 80 ping - runtime: Naming conflict of network devices - Remove gogoproto.nullable extension - metrics: Ensure docker is running in init_env - metrics: this PR skips the FIO test temprarily to fix issues - ci: Add a very basic nerdctl sanity test - runtime-rs: hypervisor: Remove debug kernel options - versions: Bump rust version - ci: Add a very basic docker sanity test - dragonball: fix for non-deterministic builds - runtime-rs: bring hybrid vsock devices in manager. - ci: use github.ref_name instead of $GITHUB_REF_NAME - ci: Add more target-branch related fixes - ci: Fix target-branch usage - agent: optimize the code of systemd cgroup manager - gha: Manually rebase PR atop of the target branch before testing - Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work - kata-deploy: Fix aarch64 image build - runtime: Fix more virtiofs args - kata-deploy: Switch to an alpine image - metrics: Use TensorFlow optimized image - metrics: fix FIO test initialization - ci: k8s: Add clean-up-garm argument for gha-run.sh - ci: k8s: Second round of fix-ups with the devmapper CI - metrics: re-enable memory-usage initialization step - Dragonball: optimize the placement of dbs-upcall features - ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml - ci: k8s: Add k8s devmapper tests (part 0) - kata-deploy: Create kata-static.tar with correct ownership - runtime: run prestart hooks before starting VM for FC - metrics: Add write 95 percentile FIO value - runtime: Allow virtio_fs_extra_args annotation - packaging: do not install 
docker-compose-plugin for s390x\|ppc64le - runtime-rs: Fix volumes and rootfs cleanup issues - metrics: Enable iperf benchmark on gha for kata metrics - CI: switch static-checks-dragonball CI machines to Azure - metrics: Add README for kata metrics report - osbuilder: Remove chcon operation for guest SELinux - kata-sys-util: protection: Update TDX checks - Improve the way to clean up storage devices for sandbox - agent: avoid possible leakage of storage device - tests: add policy to existing tests - gha: Rebase PR atop of the target branch before testing - versions: Update alpine to its 3.18 version - runtime: Fix data race in ioCopy - metrics: Add grabdata script for metrics report - Fixes tests on AMD machines - metrics: Enable FIO limits for kata metrics - metrics: Add metrics report script - metrics: Fix memory inside limits for kata metrics - metrics: fix parsing issue on memory-usage test - dragonball: vsock add fifo/pipe stream support for passed fd hybridSt… - tests: Add confidential test - tdx: Update the components needed for using the 6.2 kernel stack - tests: delete k8s deployment at the test's end - tests: use unique test name - runtime-rs: check peer close in log_forwarder - gha: Avoid "fail-fast" in tests that are known to be flaky - Refine storage device management for kata-agent - metrics: Remove unused variable in tensorflow nhwc script - kata-deploy: Don't try to remove /opt/kata - metrics: Add TensorFlow ResNet50 FP32 benchmark - gha: vfio: Run on Ubuntu 23.04 runner - kata-agent: use default filemode for block device when it is set to 0 - kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull - libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml - local-build: Remove GID before creating group - kata-deploy: Avoid failing on content removal - runtime: fix image and initrd assets handling - metrics: Add disk link to README - metrics: Fix FIO path - gha: capture additional kata-deploy output - 
metrics: Use function from metrics common in pytorch script - metrics: Enable kata runtime in K8s for FIO test. - metrics: Fix README for pytorch - metrics: Remove unused variable in tensorflow mobilenet script - rootfs: agent: Policy support with AGENT_INIT=yes - gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy - metrics: Fix check results for tensorflow benchmark - metrics: Add Tensorflow ResNet50 int8 benchmark - kata-deploy: Properly create default runtime class - agent: simplify error handling - metrics: Fix MobileNet help me description - gha: ci: Start running kata-deploy tests - runk: Modify kill command's error message for containerd tests - runtime-rs: add driver option - gha: cri-containerd: Enable tests - metrics: Rename tensorflow scripts - gha: tests: Add kata-deploy functional tests -- Part 1 - agent: runtime: add Agent Policy feature - runk: Support without pid ns - metrics: Add Cassandra Kubernetes benchmark for kata metrics - metrics: Add common functions to the common script - metrics: fix the loop used to stop kata components - docs: Remove installation step in virtcontainers doc - Propogate secrets, config maps etc into guest if sharedFS not available - kata-deploy: Preliminary k0s support - gha: static-checks: Move to the Azure instances - versions: Update firecracker version to 1.4.0 - agent: Allow clippy::redundant_clone in the unit tests - agent: avoid creating new `Vec` instances when easily avoidable - metrics: compute tensorflow statistics - metrics: Add network nginx benchmark - metrics: install kata once and run multiple checks - ci: unencrypted-image: Fix build context - ci: create-confidential-image: Add dependent actions - Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596 - tests: Create image that will be used in the unencrypted confidential tests - kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests - tests: upgrade bats 
version - Fix mimor bugs and improve coding stype of agent rpc/sandbox/mount - deps: Bump dependent crate versions - fix number of queues handling in dragonball share fs device - runtime-rs: Introduce directly attachable network - metrics: General improvements to mobilenet tensorflow test - gha: Add iperf network metrics - docs: Use control-plane term instead of master - agent: avoid unnecessary calls to `Arc::clone` - metrics: Add network latency test - Image pulling on the host - Use version 0.10.4 of `fuse-backend-rs` - kata-deploy: Use host's systemctl - release: Revert kata-deploy changes after 3.2.0-rc0 release - metrics: stop kata components before start a metric test. - runtime-rs: Add block device handling for cloud hypervisor `a93fdb014` kata-deploy-stable: Adapt to what we're using in the stable branch `36109da93` ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat `d01daf749` tests: Adjust timeout for agent stability test `9b14dda14` libs: protection: Fix typo in TDX output `0e0867f15` runtime-rs: ch: Add TDX CH features check `409eadddb` runtime-rs: ch: Improve readability of guest protection checks `82a0814fc` tests: Enable agent stability test `32be8e3a8` tests: query data from the OPA service `b81c0a669` tests: encode policy file during test `4f9681b41` metrics: fixes common.sh function to always return true `2ef2b2a6d` docs: Fix paths to build kernel in SNP VMs documentation `408b59c02` runtime-rs: fix bugs to support Nydus v5 `157caea9f` Revert "nydus: Temporarily skip tests on dragonball" `678fe3cd3` Dragonball: fix Nydus config serde problem `b6ec62138` policy: allow access to ReseedRandomDev `908519db9` metrics: skips docker restart when it is not installed or is masked. `c2763120a` metrics: removing trailing comma characters from json file. 
`3e8cf6959` runtime: Validate hypervisor section name in config file `ef6388e81` tests: Remove unused function from scability test `fbc8f8f46` scripts: Use install_yq from the `kata-containers` repo `65b1a2d27` release: tag_repos: Stop tagging / updating the `tests` repo `87b760f56` runtime-rs: ch: Detect Intel TDX version `73e81f5e3` runitme-rs: unify base64 encoding for direct-volume `c6463cb5a` tests: Fix path for versions yaml for soak parallel test `89c9454fc` metrics: removal of reference in the documentation to the dax test. `30ff58904` tests: Enable scability test for stability CI `8d6f7b909` runtime-rs: Add support for handling vfio device for cloud-hypervisor `e786b2b01` gha: Add install dependencies for stability tests `dbfe6512f` dragonball: vcpu metrics change to be recorded per vcpu `fa60fbe02` dragonball: METRICS is refactored to RwLock<DragonballMetrics> `500d1c5ce` kata-ctl: update rustls-webpki/webpki dependency `d7660d82a` runtime: unify gopkg.in/yaml.v3 to v3.0.1 `fc9a107e8` runtime: unify swag and testify dependency `79ebb959c` runtime: update runc dependency to v1.1.9 `7f3e8bd65` runtime: unify golang.org/x/text to v0.7.0 `df325ae37` runtime: update golang.org/x/net to v0.7.0 `bba34910d` metrics: stops kata components and k8s deployment when test finishes `84e3d884e` gha: Add general dependencies to stability tests `dec3951ca` tests: Add soak parallel stability test `0f04d527d` tests: Enable soak parallel test `e669282c2` ci: k8s: set KUBERNETES default value `c30c3ff18` tests: run k8s-volume on a given node `666993da8` tests: run k8s-file-volume on a given node `3a00fc910` tests: exec_host() now gets the node name `61c9c17bf` tests: add get_one_kata_node() to tests_common.sh `68f083c4d` ci: k8s: set KATA_HYPERVISOR default value `6677a61fe` ci: k8s: configurable deploy kata timeout `200e54292` ci: k8s: shellcheck fixes to gha-run.sh `4af78be13` kata-deploy: re-format kata-[deploy\|cleanup].yaml `d54e6d9cd` ci: k8s: run_tests() for kcli 
`c2ef1f0fb` ci: k8s: add deploy-kata-kcli() to gh-run.sh `d2be8eef1` ci: k8s: add cleanup-kcli() to gha-run.sh `cbb9aa15b` ci: k8s: set default image for deploy_kata() `89bef7d03` ci: k8s: create k8s clusters with kcli `954d40cce` gha: combine coco jobs into a single yaml `b60e0a9b5` gha: combine basic amd64 jobs into a single yaml `e9bd85211` gha: ci: Revert tracing test PR to unbreak CI `b8a46a4b8` runtime-rs: ch: Enable feature `0f2dc8c67` gha: Add containerd stability tests to ci yaml `da91c9df8` ci: Port runk tests to this repo `7f2377276` ci: Add placeholder for runk tests `9205acc3d` ci: Move tracing tests here `85d290a04` gha: Add stability gha run script `54f0c8f88` gha: Add stability tests workflow for gha `3bb2923e5` ci: Add placeholder for tracing tests `2c3bf406d` ci: Create a function to install docker `119f03de2` gha: arm64: Ensure the builder is arm64-builder `8c498ef5e` metrics: Use jq tool to pretty-print json metrics output `a2159a636` metrics: Enables FIO test for kata containers `70e7ec3e2` gha: Fix k0s deployment `560bbffb5` packaging: tools: Remove `set -x` leftover `18fa483d9` packaging: release: Mention newly added images `ca3b88837` packaging: tools: Fix container image env var name `5ca66795c` packaging: Allow passing the TOOLS_CONTAINER_BUILDER `02acef957` gha: Build the kata-agent as part of our workflows `5208386ab` packaging: Build the kata-agent `1727487ee` agent: Allow specifying DESTDIR and AGENT_POLICY via env vars `45c118883` packaging: Add get_agent_image_name() `0db8fb8f9` versions: migrate out of k8s.gcr.io `a1a054367` doc: Fix spelling `6339605a1` tests: Add general stability fixes `59ae24444` doc: Update crictl pod-config `fd19f4082` tests: Add agent stability test `215577032` tests: Add cassandra stress in stability tests `f2d3ea988` tests: Add stressng dockerfile for stability tests `6493aa309` tests: Add stressor CPU test for stability tests `ef68a3a36` metrics: Add stability test for kata CI `7c934dc7d` gpu: Fix 
cold-plug of VFIO devices `8d66ef518` metrics: Increase qemu jitter value `5600e28b5` metrics: Increase jitter value for clh `a6b1f5e21` ci: Build src/tools components as part of our tests / releases `501a168a8` kata-deploy: Build components from src/tools `6ef42db5e` static-build: Add scripts to build content from src/tools `4d08ec29b` packaging: Add get_tools_image_name() `98097c96d` packaging: Use git abbreviated hash `489caf1ad` ci: kata-monitor: Move tests over `a3fb067f1` ci: Add placeholder for kata-monitor tests `57cb4ce20` ci: Make install_kata aware of container engines `de1eeee33` ci: Create a generic install_crio function `64a200085` ci: Add install_cni_plugins helper `8132fe15c` ci: Modify containerd default config `8cb7df1be` metrics: Add checkmetrics for latency test `e90440ae2` metrics: Add qemu latency value limit `a74a8f8a9` metrics: Add latency value limits for kata CI `d7def8317` metrics: Fix general check static warnings `928553d1b` docs: Update url in kata vra document `b0a3293d5` runtime-rs: ch: Enable Intel TDX `523399c32` runtime-rs: ch: Add more consts `dea806581` runtime-rs: ch: Remove unused function `995f2c015` runtime-rs: ch: Only handle particular pending device types `b1b96a5c4` runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check `9ac29b8d3` metrics: Add init_env function to latency test `dfd0c9fa9` runtime: clh: Re-generate the client code `8f9f087e3` versions: Upgrade to Cloud Hypervisor v35.0 `81c8babca` metrics: Fix latency yamls path `481573682` metrics: Fix C-Ray documentation `ef63d67c4` ci: crio: Trail '\r' from exec_host() output `74c12b292` ci: crio: Enable default capabilities `358dc2f56` kata-deploy: Fix CRI-O detection `ebaa4fa4c` ci: crio: Pass `-y` to apt `97e73b223` metrics: Fix spelling warnings `36c8cd6f1` metrics: Fix metrics README `15425a2b8` local-build: Fix .docker ownership before build-payload `13ca7d9f9` gha: Add pandoc as a dependency for static checks `08bc8e4db` metrics: Add latency benchmark for gha 
`6776b55d7` metrics: Enable latency test in gha run script `94e2ccc2d` runtime: fix reading cgroup stats of sandboxes `d507d189b` fc: Add support for noflush cache option `2ca781518` clh: Direct IO support for block devices `0c95697cc` ci: Trigger payload-after-push on workflow_dispatch `28cbc3b51` ci: rootfs-image build-asset is failing Fixes: #8027 `87a861648` gha: Install hunspell for static checks `8c3c50ca8` ci: Actually enable the CRI-O tests `3a6510ad6` osbuild: Reduce guest components binary size with strip `07a6e63a6` ci: k8s: rke2: Use sudo to call systemd `03b82e848` ci: k8s: Add a CRI-O test `d7105cf7a` ci: k8s: Add a method to install CRI-O `54c0a471b` ci: k8s: k0s: Allow passing parameters to the k0s installer `730ef5169` deps: updating dependencies `3a2c83d69` ci: kata-deploy: Fix runner name `82ff2db46` runtime: support kernel params including spaces `604a9dd67` protocol: remove gogoprotobuff tests `f7fa7f602` ci: Enable kata-deploy tests for all the supported k8s flavours `2c908b598` ci: kata-deploy: Add the ability to deploy rke2 `eaf616491` ci: kata-deploy: Add the ability to deploy k0s `001525763` ci: kata-deploy: Add deploy-k8s argument to gha-run.sh `bf2cb0228` ci: kata-deploy: Expland tests to run on k0s / rke2 `b12b9e188` ci: kata-deploy: Add placeholder for tests on GARM `9e1fb8a96` ci: kata-deploy: Export KUBERNETES env var `09cc0ed43` ci: Move deploy_k8s() to gha-run-k8s-common.sh `486fe14c9` ci: Properly set K8S_TEST_UNION `d9ef1352a` ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name `68267a399` ci: Create clusters in individual resource groups `9aa8d1c91` metrics: Add parallel bandwidth limit for qemu `44c7c082d` versions: Bump virtiofsd to v1.8.0 `af59d4bf4` metrics: Enable parallel bandwidth iperf limit `aba36ab18` nydus: Temporarily skip tests on dragonball `b8a8dfcd1` nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata` `f6df3d6ef` static-build: Fix arch error on nydus build `2f9c9e2e6` tests: nydus: Update 
nydus tests `c9a4e7e46` versions: Bump nydus and nydus-snapshotter to its latest release `b73bde320` gha: nydus: Populate run() `b3904a1a3` gha: nydus: Populate install_dependencies() `d2b3b67f5` gha: nydus: Actually install kata when `install-kata` is called `0ec00ad42` gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh `568439c77` tests: nydus: Add timeout to the crictl calls `5ac3b76eb` tests: nydus: Add uid / namespace to the nydus container / sandbox `376574a16` tests: nydus: Decorate some calls with `sudo` `4290fd4b6` tests: nydus: Adapt "source ..." to GHA `a84efa3e8` tests: nydus: Adapt check to "clh" instead "cloud-hypervisor" `56a14b395` tests: common: Add install_nydus_snapshotter() `b6563783e` tests: common: Add install_nydus() `72599f191` clh: arm: Use static_sandbox_resource_mgmt=true `1f16b6627` runtime/qemu: Rework QMP/HMP support `8b1e9b0c7` ci: static-checks: Clean up static-checks job `2c5ca2eaf` ci: static-checks: Run tests depending on KVM `509c309ab` ci: static-checks: Move "sudo make test" to the new test matrix `4e963cedf` ci: static-checks: Move "make test" to the new test matrix `08f2e5ae0` runtime-rs: Ensure static-checks-build is a dep of `make test` `2bc3a616a` kata-ctl: Use `loop` instead of `kvm` module in tests `46daddc50` kata-ctl: Ensure GENERATED_CODE is a dep of `make test` `ec826f328` agent: Ensure GENERATED_CODE is a dep of `make test` `1d32410a8` ci: install_libseccomp: Do not depend on the tests repo `bf888b9a5` ci: static-checks: Move "make check" to the new test matrix `473ec8780` kata-ctl: Add `kata-types` to the Cargo.lock file `ea19549a9` kata-ctl: Ensure GENERATED_CODE is a dep of `make check` `e12577586` tests: install_rust: Also install clippy `e2c61a152` ci: static-checks: Move vendor check to its own job `6794d4c84` tests: Move install_rust.sh from the tests repo `e64508c30` tests: install_go: Remove tests repo dependency `11dff731b` tests: Move functions from kata_arch script here `75c974c80` 
ci: static-checks: Move kernel config check to its own job `9c233bb9e` test: Add test to verify try_from for clh Netconfig `c69a1e33b` ci: Use variable size of VMs depending on the tests running `9049d311d` runtime-rs: Add network support for cloud-hypervisor `eecd5bf2a` ci: cache: Fix ovmf-sev cache `86c41074b` ci: cache: Check the sha256sum of the component `460988c5f` ci: cache: Remove the script used to cache artefacts on Jenkins `4533a7a41` ci: cache: Also store the ${component} sha256sum `eccc76df6` ci: cache: Use the cached artefacts from ORAS `7f5e77bcb` kernel: enable Arm pl011 support `241c355e0` clh:arm64: use arm AMBA uart for hypervisor debug `094b6b2cf` ci: k8s: Temporarily disable tests that require a bigger VM instance `d0c257b3a` ci: cache: Push cached artefacts to ghcr.io `108f1b60d` kata-deploy: Generate latest_{artefact,image_builder} files `be2eb7b37` ci: cache: Install ORAS in the kata-deploy binaries builder container `fb24fb0dc` ci: k8s: devmapper: Use a smaller / cheaper VM instance `1daf02f5d` ci: nydus: Use a smaller / cheaper VM instance `e60d81f55` ci: nerdctl: Use a smaller / cheaper VM instance `4db416997` ci: docker: Use a smaller / cheaper VM instance `32841827b` ci: cri-containerd: Use a smaller / cheaper VM instance `92fff129f` ci: k8s: Don't set cpu limit request for k8s-inotofy test `faf98c062` ci: Reduce the size of the AKS VMs `adc18ecdb` ci: cache: For consistency, read all used env vars `c7a851efd` ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker `6bd15a85d` ci: cache: Export env vars needed to use ORAS `cd4fd1292` metrics: Add iperf cpu utilization limit for qemu `df5cd10ea` metrics: Add iperf value for cpu utilization `a96050a7a` tests: Apply timeout to 'ctr t kill' `9d9303678` tests/vfio: Bump VM image to Fedora 38 `faee59b52` tests/vfio: Accept single device in vfio group for CLH `df3dc1105` tests/vfio: Get rid of sync's `7211c3dcc` gha: vfio: Set test timeout to 15m `1b02f89e4` packaging: 
kernel: Enable VIRTIO_IOMMU on x86_64 `3a1db7a86` runtime: clh: Support enabling iommu `9f1a42c6c` tests/vfio: Give commands 30s to execute `b46b0ecf8` tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms `bfc93927f` runtime: Remove redundant check in checkPCIeConfig `7c4e73b60` runtime: Add test cases for checkPCIeConfig `fc51e4b9e` runtime: Check config for supported CLH (cold\|hot)_plug_vfio values `509771e6f` runtime: clh: Add hot_plug_vfio entry to config `5f6475a28` tests/vfio: Gather debug info and disable tdp_mmu `8fffdc81c` tests/vfio: Capture journal from vm `df815087e` tests/vfio: Change to get the test working in GHA `a92ddeea1` tests/vfio: Move dependency installation to gha-run.sh `5a551a85b` gha: vfio: Import jobs scripts from tests repo `49e2fa189` metrics: Increase jitter value for qemu `49234433a` metrics: Increase value limit for jitter in clh `813bfdec0` ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io `46bc0b1c0` ci: nerdctl: Create the containerd config `13968aa7f` ci: nerdctl: Switch to tcp port 80 ping `e0c811678` ci: docker: Switch to tcp port 80 ping `1636abbe1` runtime: issue with non-empty []Endpoint in RemoveEndpoints `0aa073967` metrics: Add iperf bandwidth value for qemu `c0ad91476` tests: fix kernel and initrd annotations `615c1cbf1` metrics: Add iperf bandwidth value for kata metrics `d53eb73ee` metrics: Ensure docker is running in init_env `ad08321b8` metrics: Add Cassandra Metrics documentation `a58ea6659` metrics: this PR skips the FIO test temprarily to fix issues `f536ef5ce` ci: docker: Also run the smoke test with runc `c83f167c5` ci: docker: Run the tests after the kata-static is created `12d833d07` ci: Add a very basic nerdctl sanity test `348b8644d` ci: Add a very basic docker sanity test `a75fd5eb8` runk: Fix rust unecessary mut error `a31c14517` kata-ctl: useless-vec warning `c8419fc3b` kata-ctl: Resolve non-minimal-cfg warning `3eaf68d95` agent-ctl: Allow clippy lint `1d8b78959` runtime-rs: Fix 
useless-vec warning `99f3d69e9` runtime-rs: Remove mut `16fbc27b0` dragonball: Allow ambiguous-glob-reexports `bbf191951` dragonball: Resolve non-minimal-cfg warning `75cfdd5d5` agent: config: Allow clippy lint `f3a0fd590` agent: config: Fix useles-vec warning `9e423bd3d` libs: Fix clippy unnecesary hashes error `444395050` versions: Bump rust version `a16b0962b` chore(cargo): update cargo lock `ca4b6b051` runtime: Naming conflict of network devices `202049f35` feat(runtime-rs): introduce huge page type to select VM RAM's backend `f811b064c` ci: use github.ref_name instead of $GITHUB_REF_NAME `6d795c089` ci: Add more target-branch related fixes `8509c3187` ci: Fix target-branch usage `060499dca` metrics: Remove warning from metrics documentation `c0f697fcc` runtime: Allow kernel_params annotation `b03e49794` dragonball: fix for non-deterministic builds `976d10150` runtime-rs: hypervisor: Remove debug kernel options `fde34610c` kernel: Add erofs patches needed for CC related work `dc6a4588a` versions: Bump kernel to the latest LTS release (6.1.52) `52f6449b7` kata-manager: Remove initcall_debug kernel option `8b4a0b368` kata-deploy: Remove curl after it's used `139c7f03a` kata-deploy: Fix aarch64 image build `470d06541` agent: optimize the code of systemd cgroup manager `bd24afcf7` gha: Manually rebase PR atop of the target branch before testing `72c510d05` runtime/virtiofsd: Drop all references to "--cache=none" `ead724bec` protocol: removing gogo.nullable feature `d8e4bb985` protocol: remove unused PROTO_FILE env `5e1106a77` protocol: remove unused import_path `87accaaec` protocol: use workdir during build `711a7ed96` protocol: remove mapping definitions `8db84c1bd` protocol: force GOPATH to be set `68156d77a` protocol: breaking lines to improve readability `670a8e9c7` kata-deploy: Switch to an alpine image `9d74b7ccc` k8s: ci: Skip "Pod quota" test with firecracker `f6cd3930c` ci: k8s: Remove useless skip statement from tests `3cc20b47a` ci: k8s: Also check for 
"fc" (for firecracker) `b5bad3cb0` ci: k8s: Add clean-up-garm argument for gha-run.sh `aaec5a09f` ci: k8s: devmapper tests should be using ubuntu 20.04 `27fa7d828` ci: k8s: Add a kata-deploy-garm target `fa62a4c01` ci: k8s: Export KUBERNETES env var `8c9380a79` ci: k8s: Install bats on GARM runners `3de23034f` ci: k8s: Wait some time after restarting k3s `adfea55b8` metrics: fix FIO test initialization `2df183fd9` ci: k8s: Append, instead of overwrite, the devmapper config `369a8af8f` ci: k8s: Decrease k3s sleep from 4 to 2 minutes `ada65b988` ci: k8s: Use vanilla kubectl with k3s `ad45ab5d3` ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644 `028a97e0d` ci: k8s: Use the proper command for sleep `3a427795e` metrics: Use TensorFlow optimized image `8d99972a8` ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml `deed1b927` Dragonball: optimize the placement of dbs-upcall features `0e8bd50cb` ci: k8s: Add k8s devmapper tests (part 0) `b28b54df0` ci: k8s: Add a function to configure devmapper for containerd `54f711721` ci: k8s: Add a function to deploy k3s `81536f21a` runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr" `b1dd09a4d` runtime: Allow virtio_fs_extra_args annotation `2efda20c7` packaging: do not install docker-compose-plugin for s390x\|ppc64le `438fbf966` metrics: Add write 95 percentile for FIO for qemu `024b4d2ff` metrics: Add write 95 percentile FIO value `e98e5cdea` metrics: Add checkmetrics to gha run script `c1edfe551` metrics: Add checkmetrics value for qemu for iperf `6a79ecedf` metrics: Add jitter value for clh `f609a9a75` metrics: Add test selector to iperf metrics `5b8db3042` metrics: Enable iperf benchmark on gha for kata metrics `60f733d30` CI: switch static-checks-dragonball CI machines to Azure `7870b33a2` runtime-rs: bring hybridVsock devices in manager. 
`18c94ebbe` kata-deploy: Create kata-static.tar with correct ownership `57e7bf14a` agent: refine StorageDeviceGeneric::cleanup() `53edb1937` agent: implement StorageDeviceGeneric::cleanup() `0c63453e2` types: make StorageDevice::cleanup() return possible error code `3a3d77b3b` agent: move StorageDeviceGeneric from kata-types into agent `b151cfd14` metrics: re-enable memory-usage initialization step `f3e1a6a94` osbuilder: alpine: Change mirror `ac612aef5` osbuilder: alpine: Match the version on versions.yaml `9cd706d1c` agent: avoid possible leakage of storage device `bf21411e9` tests: add policy to k8s tests `d0e061067` runtime: config: use the SEV initrd for SNP `67fed26f1` runtime: Use TDX image with in the qemu-tdx config `ac939c458` gha: Rebase atop of the target branch `82cd14ba3` versions: Update alpine to its 3.18 version `666882575` metrics: Add grabdata script for metrics report `c290eaed8` kata-sys-util: protection: Update TDX checks `d7a996c68` gha: Update to checkout@v3 action `c2ba29c15` runtime: Fix data race in ioCopy `211de08d9` osbuilder: Remove chcon operation for guest SELinux `9f21fa9b3` metrics: Add report generator link to general documentation `c0ed5ea0a` metrics: Add README for kata metrics report `a7b59a5bf` metrics: Add limit for 90 percentile for qemu value `99db6568e` metrics: Add limit for write 90 percentile value for clh `6e06392c5` metrics: Enable FIO limits for kata metrics `2e4c87472` runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure `21204caf2` runtime: fail early when starting docker container with FC `32fd01371` runtime: run prestart hooks before starting VM for FC `00e7ffd98` tests: check vmx only on Intel machines `c8dd3c073` metrics: Fix memory footprint qemu limit `8877ec62f` metrics: Fix memory inside limits for kata metrics `80146f207` tests: Fixes cpuType check on AMD machines `7e364716d` metrics: Add test setup details to metrics report `17dc1b976` metrics: Add boot lifecycle times to metrics report 
`3b0d6538f` metrics: Add memory inside container to metrics report `79fbb9d24` metrics: Add scaling system footprint in metrics report `8e6d4e6f3` metrics: Add metrics reportgen `139ffd4f7` metrics: Add report file titles `878d1a2e7` metrics: Generate PNGs alongside the PDF report `fce248797` metrics: Add metrics report R files `08812074d` metrics: Add report dockerfile `69781fc02` metrics: Add metrics report script `e286e842c` tests: Expand confidential test to support TDX `e31f099be` tests: Expand confidential test to support SNP `c3b9d4945` tests: Add confidential test for SEV `538c965c2` metrics: fix parsing issue on memory-usage test `3818bf331` local-build: Remove $HOME/.docker/buildx/activity/default `d1b54ede2` qemu: tdx: Workaround SMP issue with TDX 1.5 `1e34220c4` qemu: tdx: Adapt to the TDX 1.5 stack `8115a0522` versions: tdx: Update Kernel to 6.2 + TDX `ec18180f3` versions: tdx: Update TDVF to the "edk2-stable202302" `9803b2428` versions: tdx: Update QEMU to v7.2 + TDX v1.10 `dffc16e5b` runtime-rs: check peer close in log_forwarder `aaa5ab126` agent: simplify storage device by removing StorageDeviceObject `fb49d5d7c` gha: Avoid "fail-fast" in tests that are known to be flaky `183f51d6f` tests: use unique test name `6a974679f` tests: delete k8s deployment at the test's end `32a778b6d` metrics: Remove unused variable in tensorflow nhwc script `d8f3ce649` kata-deploy: Don't try to remove /opt/kata `936e8091a` gha: vfio: Run on Ubuntu 23.04 runner `0e7248264` agent: move storage device related code into dedicated files `268e84655` runtime-rs: Fix volumes and rootfs cleanup issues `8f49ee33b` agent: refine storage related code a bit `60ca12ccb` agent: switch to new storage subsystem `fcbda0b41` kata-types: introduce StorageDevice and StorageHandlerManager `b03b1f613` agent: simplify the way to manage storage object `8392c71bf` sys-util: support more mount flags in parse_mount_options() `c00d8f3d4` agent: use create_mount_destination() from kata-sys-util 
`5e867f053` types: add more mount related constants `880e6c9a7` agent: use function from kata-sys-utils to reduce code `3b881fbc0` local-build: Remove GID before creating group `959ca4944` metrics: Add TensorFlow ResNet50 fp32 Dockerfile `4b7d72c4a` metrics: Add TensorFlow ResNet50 FP32 benchmark `5cba38c17` kata-deploy: Avoid failing on content removal `18d42da21` runtime/fc: fix image/initrd annotation handling `9fda7059a` runtime/clh: fix image/initrd annotation handling `1a0092d63` runtime/qemu: fix image/initrd annotation handling `22d8f335d` libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml `8afd158ce` metrics: Add disk link to README `40914b25d` kata-agent: use default filemode for block device when it is set to 0 `eee2ee6ee` metrics: Fix FIO path `39bc3488f` metrics: Use function from metrics common in pytorch script `400eb8874` gha: capture additional kata-deploy output `4aee3eade` kata-types: implement serde methods for KataVirtualVolume `b875e3932` kata-types: validate KataVirtualVolume object `fa2fdc105` kata-types: implement two conversion helpers for KataVirtualVolume `6326af20e` kata-types: introduce KataVirtualVolume `c8b43f8b3` metrics: Fix README for pytorch `fb571f8be` metrics: Enable kata runtime in K8s for FIO test. 
`cb056f8cb` rootfs: agent: Policy support with AGENT_INIT=yes `85c02828e` metrics: Update tensorflow name in gha run script `e8a511934` metrics: Fix check results for tensorflow benchmark `2d896ad12` gha: kata-deploy: Do the runtime class cleanup as part of the cleanup `4ffc2c86f` gha: kata-deploy: Add the first kata-deploy test `8616c050a` metrics: Remove unused variable in tensorflow mobilenet script `285e616b5` tests: common: Ensure test_type is used as part of the cluster's name `790bd3548` tests: commob: Don't fail if yq is not part of the cache `ce6adecd0` gha: kata-deploy: Add run-kata-deploy-tests.sh `cfc29c11a` gha: k8s: Stop running kata-deploy tests as part of the k8s suite `f4dd15286` tests: k8s: Call ensure_yq() in setup.sh `339569b69` kata-deploy: Properly create default runtime class `2a491e9b1` metrics: Fix MobileNet help me description `d19a75e80` gha: ci: Start running kata-deploy tests `d90f7ac68` runtime-rs: add unit test for block driver `e44919f0d` runtime-rs: add load_test_config for unit test `7f48a6937` runtime-rs: add driver option `bade6a5c3` docs: Fix TensorFlow word across the document `1a1b20776` docs: Add Tensorflow Resnet50 documentation `24baededc` metrics: Add Dockerfile for ResNet50 int8 `6d971ba8d` metrics: Add Tensorflow ResNet50 int8 benchmark `25d151bd1` runk: Modify kill command's error message for containerd tests `b3592ab25` gha: cri-containerd: Enable tests `84dd02e0f` gha: cri-containerd: Add timeout to the crictl calls on testContainerStop `b29782984` gha: cri-containerd: Show pod before deleting it `ae0930824` gha: cri-containerd: Print kata logs in case of error `6c8b2ffa6` gha: cri-containerd: Group containerd logs `9e898701f` gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account `76dac8f22` agent: simplify error handling `18a7fd8e4` metrics: Rename tensorflow scripts `e55fa93db` tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx `d9ee17aae` tests: kata-deploy: Add placeholder for 
kata-deploy-tests-on-aks `ab829d103` agent: runtime: add the Agent Policy feature `831e73ff9` tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder `af1b46bbf` tests: Add gha-run-k8s-common.sh `416445e7e` docs: Remove installation step in virtcontainers doc `72cbcf040` kata-deploy: Add k0s support `767434d50` metrics: fix the loop used to stop kata components #7629 `5d0f0d43c` metrics: Add cassandra statefulset yaml `c1dcc1396` metrics: Add cassandra service yaml `2297a0d1c` metrics: Add block loop pvc yaml for cassandra `e3d511946` metrics: Add block loop pv yaml for cassandra test `989027159` metrics: Add block loop pvc for cassandra test `349b89969` metrics: Add Cassandra Kubernetes benchmark for kata metrics `c52d09052` gha: static-checks: Move to the Azure instances `8815ed066` runtime: Remove config warnings `afe1a6ac5` agent: support copying of directories and symlinks `ab13ef87e` runtime: propagate configmap/secrets etc changes for remote-hyp `c074ec4df` runtime: Copy shared files recursively `fdcd52ff7` metrics: Add check containers are running in tensorflow mobilenet `36337ee14` metrics: Add check containers are up in tensorflow script `f700f9b0b` metrics: Remove unused variable in tensorflow script `833cf7a68` metrics: Add check containers are running function `918c78308` metrics: Add check containers are up in tensorflow mobilenet script `9d57a1fab` metrics: Use check containers are up in tensorflow script `1c84680d8` metrics: Add check containers are up in common script `d3e57cf45` metrics: Use collect_results function in tensorflow mobilenet test `286de046a` metrics: Remove collect results function definition `9879709aa` metrics: Add common functions to the common script `4746fa3da` docs: Specify supported Firecracker version using `versions.yaml` `cc922be5e` versions: Update firecracker version to 1.4.0 `39e67b06e` dragonball: vsock add fifo/pipe stream support for passed fd hybridStream `473b0d3a3` metrics: compute tensorflow 
statistics `03d1fa67b` ci: unencrypted-image: Fix build context `eb463b38e` ci: unencrypted-image: Don't fail to build on s390x `a2d731ad2` ci: create-confidential-image: Add dependent actions `d1a629622` metrics: Add nginx documentation to network README `498f7c054` metrics: Add nginx kubernetes yaml `f8a5255cf` metrics: Add network nginx benchmark `43fe5d1b9` ci: k8s: tees: Ensure PR_NUMBER is exported `54f6a7850` ci: {{ pr-number }} should be {{ inputs.pr-number }} `034d7aab8` tests: k8s: Ensure the runtime classes are properly created `fac8ccf5c` ci: Add build-and-publish-tee-confidential-unencrypted-image `ab5f603ff` ci: k8s: Add the image used for unencrypted confidential tests `1e8fe131b` k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars `729b2dd61` agent: avoid creating new `Vec` instances when easily avoidable `aeaec9dae` tests: upgrade bats version `e66496986` metrics: install kata once and run multiple checks `baabfa9f1` agent: refine implementation of mount related code `98ba211a3` agent: fix a bug in update_ephemeral_mounts() `5333618d7` agent: make add_storage() take &[Storage] instead of Vec<Storage> `37f34781d` agent: simplify function online_cpu_memory() `d3c542237` agent: refine style of code related to sandbox `71a9f6778` agent: avoid unwrap() in function do_remove_container() `84badd89d` agent: avoid clone objects when possible `b23c5ed15` deps: Bump dependent crate versions `863283716` metrics: General improvements to mobilenet tensorflow test `3c319d8d4` metrics: Add iperf to gha run script `5b5caf890` gha: Add iperf network metrics `66db5b535` metrics: Add latency test to network README `c36572418` agent: avoid unnecessary calls to `Arc::clone` `4fbe0a3a5` runtime: bind-mount mounted block device into container `7e1b1949d` runtime: add support for kata overlays `6c867d9e8` agent: add io.katacontainers.fs-opt.overlay-rw option `6163c3565` agent: skip mount options that start with "io.katacontainers." 
`b2ff97aa0` dragonball: use version 0.10.4 of `fuse-backend-rs` `845eeb4d7` agent: Allow clippy::redundant_clone in the unit tests `1163fc9de` release: Revert kata-deploy changes after 3.2.0-rc0 release `3958a39d0` runtime-rs: Introduce directly attachable network `1e15369e5` metrics: Improve naming testing containers in launch times test `5dbe88330` metrics: Clean kata components before start a metric test. `3b45060b6` metrics: Add latency server yaml `9bb8451df` metrics: Add latency client yaml `64fdb9870` metrics: Add network latency test `a81ad3b58` runtime-rs: Add block device handling in cloud hypervisor `3230dec95` kata-deploy: Use host's systemctl `1b21a4624` docs: Use control-plane term instead of master `28e5e9c86` runtime-rs: fix number of queues handling in dragonball share fs device `f1d8de9be` runk: Allow runk to launch a container without pid namespace Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 14:44:50 +02:00
Fabiano Fidêncio	f6e20ac230	Merge pull request #7195 from fidencio/topic/adapt-kata-deploy-stable-to-using-ubuntu kata-deploy-stable: Switch to using the ubuntu based payload	2023-10-20 14:42:04 +02:00
Fabiano Fidêncio	a93fdb014b	kata-deploy-stable: Adapt to what we're using in the stable branch This is basically to make sure that folks trying to use the kata-deploy script from the main branch, to deploy stable kata-deploy images, do not have a hard time. Fixes: #7194 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-20 12:58:42 +02:00
James O. D. Hunt	79ed501a20	Merge pull request #8258 from jodh-intel/protection-fix-tdx-typo libs: protection: Fix typo in TDX output	2023-10-20 08:36:22 +01:00
Dan Mihai	52aaf10759	agent: no endpoint blocking from agent-config.toml Remove the ability to block access to kata agent endpoints by using agent-config.toml. That functionality is now implemented using the Agent Policy feature (#7573). The CCv0 branch relied on blocking endpoints using agent-config.toml but will set up an equivalent default policy file instead (#8219). Fixes: #8228 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-20 02:26:54 +00:00
Fabiano Fidêncio	468a3e4b53	Merge pull request #8260 from gkurz/fix-8259 ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat	2023-10-19 23:58:22 +02:00
GabyCT	5d6bdbd0a1	Merge pull request #8241 from GabyCT/topic/enableagenttest tests: Enable agent stability test	2023-10-19 14:12:49 -06:00
Greg Kurz	36109da93f	ci: k8s: Fix bogus firecracker check in k8s-credentials-secrets.bat Fixes #8259 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-10-19 21:53:23 +02:00
GabyCT	dc295600b8	Merge pull request #8157 from GabyCT/topic/fixsevdoc docs: Fix paths to build kernel in SNP VMs documentation	2023-10-19 11:42:03 -06:00
Gabriela Cervantes	d01daf749b	tests: Adjust timeout for agent stability test This PR adjusts the timeout for the agent stability test to run on the gha. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-19 16:55:23 +00:00
James O. D. Hunt	9b14dda147	libs: protection: Fix typo in TDX output Add the missing closing bracket to the output of the TDX details, so rather than: ```bash $ sudo kata-ctl env 2>/dev/null | grep available_guest_protection available_guest_protection = "tdx (major_version: 1, minor_version: 0" : ^ : Missing ')' ! ``` ... we now have: ```bash $ sudo kata-ctl env 2>/dev/null | grep available_guest_protection available_guest_protection = "tdx (major_version: 1, minor_version: 0)" : ^ : Aha! ``` Added a unit test for this scenario. Fixes: #8257. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-19 16:06:08 +01:00
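The bracket fix described in `9b14dda147` can be illustrated with a minimal sketch. The `TdxDetails` type and its field names below are assumptions made for illustration and are not taken from the kata-containers source; only the output string matches the commit message.

```rust
use std::fmt;

/// Hypothetical stand-in for the TDX details reported by the protection
/// check. The type and field names are assumptions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TdxDetails {
    pub major_version: u32,
    pub minor_version: u32,
}

impl fmt::Display for TdxDetails {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The closing ')' is the character that was missing before the fix.
        write!(
            f,
            "tdx (major_version: {}, minor_version: {})",
            self.major_version, self.minor_version
        )
    }
}

fn main() {
    let details = TdxDetails {
        major_version: 1,
        minor_version: 0,
    };
    println!("available_guest_protection = \"{}\"", details);
}
```

A unit test for this scenario would presumably assert on the full string, including the trailing bracket, which is what the commit message says was added.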
James O. D. Hunt	9336e2e492	Merge pull request #8155 from jodh-intel/runtime-rs-check-ch-tdx-build-feature runtime-rs: ch: Add TDX CH features check	2023-10-19 14:13:08 +01:00
James O. D. Hunt	048cc70654	Merge pull request #8213 from jodh-intel/validate-hypervisor-cfg-name runtime: Validate hypervisor section name in config file	2023-10-19 07:40:58 +01:00
Dan Mihai	99db6dff24	Merge pull request #8230 from microsoft/danmihai1/opa-data tests: query data from the OPA service	2023-10-18 15:32:23 -07:00
James O. D. Hunt	0e0867f15d	runtime-rs: ch: Add TDX CH features check If you attempt to create a container (a TD) on a TDX system using a custom build of Cloud Hypervisor (CH) that was not built with the `tdx` CH feature, Kata will report the following, somewhat cryptic, CH error: ``` ApiError(VmBoot(InvalidPayload)) ``` Newer versions of CH now report their build-time features in the ping API response message so we now use that, if available, to detect this scenario and generate a user-friendly error message instead. This change improves the readability of `handle_guest_protection()` and adds a couple of additional tests for that method. Fixes: #8152. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-18 18:07:39 +01:00
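The feature check described in `0e0867f15d` could look roughly like the sketch below. `PingResponse`, `check_tdx_feature()` and the error wording are hypothetical and do not reproduce the runtime-rs or Cloud Hypervisor client types; the sketch only assumes, as the commit message states, that newer CH builds list their build-time features in the ping response and older ones do not.

```rust
/// Hypothetical, simplified view of the Cloud Hypervisor ping response.
/// `features` is `None` for older CH versions that do not report their
/// build-time features (an assumption about how absence is modeled).
#[derive(Debug, Default)]
pub struct PingResponse {
    pub features: Option<Vec<String>>,
}

/// Returns a readable error when CH reports its features and `tdx` is not
/// among them. If no feature list is available, the check is skipped.
pub fn check_tdx_feature(ping: &PingResponse) -> Result<(), String> {
    match &ping.features {
        // Older CH: nothing to inspect, so do not block the launch here.
        None => Ok(()),
        Some(features) if features.iter().any(|f| f == "tdx") => Ok(()),
        Some(features) => Err(format!(
            "Cloud Hypervisor was not built with the `tdx` feature (reported features: {:?})",
            features
        )),
    }
}

fn main() {
    let ping = PingResponse {
        features: Some(vec!["kvm".to_string()]),
    };
    match check_tdx_feature(&ping) {
        Ok(()) => println!("TDX feature check passed"),
        Err(e) => eprintln!("error: {}", e),
    }
}
```

Checking before the VM boots is the point of the design: the user sees a message naming the missing feature rather than the `InvalidPayload` boot error.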
James O. D. Hunt	409eadddb2	runtime-rs: ch: Improve readability of guest protection checks Improve the way `handle_guest_protection()` is structured by inverting the logic and checking the value of the `confidential_guest` setting before checking the guest protection. This makes the code easier to understand. Notes: - This change also unconditionally saves the available guest protection (where previously it was only saved when `confidential_guest=true`). This explains the minor unit test fix. - This change also errors if the CH driver finds an unexpected protection (since only Intel TDX is currently tested). Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-18 18:06:02 +01:00
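The restructuring described in `409eadddb2` can be sketched as follows. This is not the runtime-rs implementation: the `GuestProtection` variants, the `Hypervisor` struct and the error strings are assumptions, and the non-confidential branch is simplified to return `Ok(())`. The sketch shows only the two points from the notes: the available protection is saved unconditionally, and the `confidential_guest` setting is checked before the protection type.

```rust
/// Hypothetical protection kinds; the real enum carries more detail.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GuestProtection {
    NoProtection,
    Tdx,
    Sev,
}

/// Hypothetical, reduced hypervisor state for this sketch.
#[derive(Debug, Default)]
pub struct Hypervisor {
    pub confidential_guest: bool,
    pub saved_protection: Option<GuestProtection>,
}

impl Hypervisor {
    pub fn handle_guest_protection(&mut self, available: GuestProtection) -> Result<(), String> {
        // Saved unconditionally, not only when confidential_guest=true.
        self.saved_protection = Some(available.clone());

        // Inverted logic: look at the setting first, then at the protection.
        if !self.confidential_guest {
            return Ok(()); // simplification: nothing further to validate
        }

        match available {
            GuestProtection::Tdx => Ok(()),
            GuestProtection::NoProtection => {
                Err("confidential_guest is set but no guest protection is available".to_string())
            }
            // Only Intel TDX is currently tested, so anything else is an error.
            other => Err(format!("unexpected guest protection: {:?}", other)),
        }
    }
}

fn main() {
    let mut hypervisor = Hypervisor {
        confidential_guest: true,
        saved_protection: None,
    };
    println!("{:?}", hypervisor.handle_guest_protection(GuestProtection::Tdx));
    println!("{:?}", hypervisor.saved_protection);
}
```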
Greg Kurz	9863805752	Merge pull request #8201 from fidencio/topic/release-tag-repo-stop-tagging-the-tests-repo release: tag_repos: Stop tagging the `tests` repo	2023-10-18 18:10:39 +02:00
Gabriela Cervantes	a58afe70b8	metrics: Add iperf udp benchmark This PR adds the iperf udp benchmark for bandwidth measurement for network metrics. Fixes #8246 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-18 15:52:03 +00:00
Jianyong Wu	f9c9d8f645	runtime: QemuVirt: hotadd virtio-mem dev to pcie root port Hotplug virtio-mem device to pcie root port for Qemu Virt. Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	ef18c9550c	runtime:qemuvirt: hotadd net dev to pcie root port Hotplug network device to pcie root port as this is the only way on QemuVirt. Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	f1aec98f9d	qemu/virt: use pcie_root_port to do device hotplug for virt ACPI PCI device hotplug is not supported on qemu virt. The only way to hotplug a PCI device is the native PCIe way, so we need to create PCIe root ports by default. The number of PCIe root ports depends on the following: 1. one reserved for a network device by default; 2. the virtio-mem dev; 3. enough ports for vhost user blk devs; Fixes: #7646 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Jianyong Wu	28a41e1d16	runtime: add a new API for Network interface Add GetEndpointsNum API for Network Interface to get the number of network endpoints. This is used to calculate the number of PCIe root ports for QemuVirt. Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-10-18 06:35:57 +00:00
Songqian Li	09d46450f1	dragonball: add metrics support for balloon device Fixes: #7248 Signed-off-by: Songqian Li <mail@lisongqian.cn>	2023-10-18 14:02:56 +08:00
Gabriela Cervantes	82a0814fc2	tests: Enable agent stability test This PR enables the agent stability test for stability gha CI. Fixes #8240 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-17 15:16:06 +00:00
Dan Mihai	32be8e3a87	tests: query data from the OPA service Add example for querying json data from the OPA service. Fixes: #8231 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-17 13:31:43 +00:00
David Esparza	d90d1c5c10	Merge pull request #8243 from dborquez/fix_systemctl_masked_query metrics: fixes common.sh function to always return true	2023-10-16 20:17:24 -06:00
Dan Mihai	b81c0a6693	tests: encode policy file during test Encode policy file during test - easier to understand than hard-coding the encoded file contents. Fixes: #8214 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-16 15:58:12 -07:00
David Esparza	4f9681b411	metrics: fixes common.sh function to always return true This PR corrects the init env() helper function so that systemctl always returns true when enumerating masked services, preventing the test from failing. Fixes: #8242 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-16 15:57:57 -06:00
David Esparza	59e8b1d5a7	Merge pull request #8206 from dborquez/memory_footprint_test_removing_trailing_commas_to_make_json_results_file_valid Memory footprint test removing trailing commas to make json results file valid	2023-10-16 14:31:28 -06:00
Gabriela Cervantes	2ef2b2a6dc	docs: Fix paths to build kernel in SNP VMs documentation This PR corrects the paths used to set up, build, and install the kernel for SNP. Fixes #8156 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-16 20:09:02 +00:00
Fabiano Fidêncio	db37692f36	Merge pull request #8226 from microsoft/danmihai1/policy-typo policy: allow access to ReseedRandomDev	2023-10-16 19:17:31 +02:00
Peng Tao	45e82b6581	Merge pull request #8192 from bergwolf/github/deps runtime/kata-ctl: update dependencies	2023-10-16 16:39:17 +08:00
Chao Wu	44e602d69a	Merge pull request #8014 from openanolis/chao/fix_nydus_break runtime-rs : fix Nydus support for runtime-rs + Dragonball	2023-10-16 01:30:22 -05:00
Chao Wu	408b59c02c	runtime-rs: fix bugs to support Nydus v5 1. enable virtio-fs-pro in Dragonball to have the ability to process nydus backend registry 2. change passthrough for rw layer's readonly config to false to have the accurate read write ability. Fixes:#8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Chao Wu	157caea9fe	Revert "nydus: Temporarily skip tests on dragonball" This reverts commit `aba36ab188`. Fixes: #8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Chao Wu	678fe3cd31	Dragonball: fix Nydus config serde problem Since Nydus snapshotter has been updated in previous commits, there is a problem that the config passthrough to Dragonball during mount_rafs is RafsConfig instead of ConfigV2, but Dragonball could only serde ConfigV2 so it will panic. We need to add the support for RafsConfig Fixes:#8013 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-10-16 10:22:21 +08:00
Dan Mihai	b6ec621389	policy: allow access to ReseedRandomDev Allow access to the ReseedRandomDev endpoint by default. Using false for ReseedRandomDevRequest was unintended. Fixes: #8225 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-10-13 21:18:27 +00:00
David Esparza	908519db9d	metrics: skips docker restart when it is not installed or is masked. To avoid errors when initializing the test environment, the kill_processes_before_start() helper function needs to verify that docker is installed before attempting to stop it. Fixes: #8218 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-13 18:02:00 +00:00
David Esparza	c2763120aa	metrics: removing trailing comma characters from json file. This PR removes trailing commas so that the json results file is valid. This PR also changes the way data results are collected by iterating through the array of memory values to calculate their average. Fixes: #8204 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-13 18:00:57 +00:00
Beraldo Leal	5ef691528d	tests: fixes permission denied when running test After running cri-containerd/integration-tests twice we receive permission denied during containerd clean. Fixes: #8216 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-10-12 19:23:40 +00:00
GabyCT	1974d13122	Merge pull request #8188 from dborquez/metrics_add_fio_readme.md metrics: removal of reference in the documentation to the fio dax subtest.	2023-10-12 10:53:55 -06:00
James O. D. Hunt	3e8cf6959c	runtime: Validate hypervisor section name in config file Previously, if you accidentally modified the name of the hypervisor section in the config file, the default golang runtime gives a cryptic error message ("`VM memory cannot be zero`"). This can be demonstrated using the `kata-runtime` utility program which uses the same golang config package as the actual runtime (`containerd-shim-kata-v2`): ```bash $ kata-runtime env >/dev/null; echo $? 0 $ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml $ kata-runtime env >/dev/null; echo $? VM memory cannot be zero 1 ``` The hypervisor name is now validated so that the behaviour becomes: ```bash $ kata-runtime env >/dev/null; echo $? 0 $ sudo sed -i 's!^\[hypervisor\.qemu\]!\[hypervisor\.foo\]!g' /etc/kata-containers/configuration.toml $ ./kata-runtime env >/dev/null; echo $? /etc/kata-containers/configuration.toml: configuration file contains invalid hypervisor section: "foo" 1 ``` Fixes: #8212. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-12 13:53:37 +01:00
James O. D. Hunt	45d28998d9	Merge pull request #8149 from jodh-intel/runtime-rs-ch-detect-tdx-version runtime-rs: ch: Detect Intel TDX version	2023-10-12 10:09:42 +01:00
QuanweiZhou	f904e64155	Merge pull request #8179 from Apokleos/directvol-urlEncode runtime-rs: use the same base64 as kata-runtime/direct-volume does	2023-10-12 09:04:11 +08:00
GabyCT	bc6eadf4f6	Merge pull request #8197 from GabyCT/topic/enablescability tests: Enable scability test for stability CI	2023-10-11 16:41:46 -06:00
Archana Shinde	f814b1a0a2	Merge pull request #8073 from amshinde/runtime-rs-vfio-clh runtime-rs: Add support for adding vfio device for cloud-hypervisor	2023-10-11 15:01:55 -07:00
Gabriela Cervantes	ef6388e815	tests: Remove unused function from scability test This PR removes an unused function from scability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-11 19:44:21 +00:00
Fabiano Fidêncio	fbc8f8f466	scripts: Use install_yq from the `kata-containers` repo As the file is already part of the kata-containers repo, and the tests repo is about to become read-only, we're good to drop the tests references from here and use everything coming from the `kata-containers` repo instead. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-11 12:52:55 +02:00
Fabiano Fidêncio	65b1a2d277	release: tag_repos: Stop tagging / updating the `tests` repo As we've moved all the tests to the `kata-containers` repo, the `tests` repo will become a read-only repo. Fixes: #8200 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-11 11:45:27 +02:00
James O. D. Hunt	87b760f569	runtime-rs: ch: Detect Intel TDX version Improve the `GuestProtection` handling to detect the version of Intel TDX available. The TDX version is now logged by the Cloud Hypervisor driver. Fixes: #8147. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-11 09:38:00 +01:00
alex.lyn	73e81f5e39	runtime-rs: unify base64 encoding for direct-volume Direct-volume needs to use the same base64 character set as kata-runtime/direct-volume does. Fixes: #8175 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-10-11 14:00:13 +08:00
Gabriela Cervantes	c6463cb5ae	tests: Fix path for versions yaml for soak parallel test This PR fixes the path for versions yaml for soak parallel test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 22:29:20 +00:00
David Esparza	89c9454fca	metrics: removal of reference in the documentation to the dax test. This PR removes the reference in the documentation to the DAX subtest of the FIO benchmark, because this metric is currently WIP. Fixes: #8159 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-10 15:55:59 -06:00
Gabriela Cervantes	30ff58904e	tests: Enable scability test for stability CI This PR enables the scability test for stability CI gha. Fixes #8196 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 19:59:57 +00:00
GabyCT	538131ab44	Merge pull request #8154 from GabyCT/topic/addstability tests: Enable soak parallel stability test	2023-10-10 13:53:14 -06:00
Archana Shinde	8d6f7b9096	runtime-rs: Add support for handling vfio device for cloud-hypervisor This change adds support for adding and removing vfio devices for cloud-hypervisor. Fixes: #6691 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-10-10 12:25:44 -07:00
Gabriela Cervantes	e786b2b019	gha: Add install dependencies for stability tests This PR adds the install dependencies for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-10 16:05:48 +00:00
Chao Wu	936553ae79	Merge pull request #7505 from lisongqian/feat/dragonball_metrics dragonball: vcpu metrics change to be recorded per vcpu	2023-10-10 10:52:40 -05:00
Wainer Moschetta	d311c3dd04	Merge pull request #7621 from wainersm/gha-run-local ci: k8s: adapt gha-run.sh to run locally	2023-10-10 11:19:19 -03:00
David Esparza	93fef543e0	Merge pull request #8127 from dborquez/fix_iperf_check_kata_processes_issue metrics: removes kata components and k8s deployment when test finishes	2023-10-10 07:05:24 -06:00
lisongqian	dbfe6512fc	dragonball: vcpu metrics change to be recorded per vcpu In this commit, the vcpu metrics in Dragonball will be changed to record per-vcpu. Fixes: #7248 Signed-off-by: lisongqian <mail@lisongqian.cn>	2023-10-10 16:22:40 +08:00
lisongqian	fa60fbe023	dragonball: METRICS is refactored to RwLock<DragonballMetrics> In this commit, the METRICS is refactored to RwLock<DragonballMetrics>. Fixes: #7248 Signed-off-by: lisongqian <mail@lisongqian.cn>	2023-10-10 16:22:40 +08:00
Peng Tao	500d1c5cee	kata-ctl: update rustls-webpki/webpki dependency The old ones have security issues. ref: https://github.com/briansmith/webpki/issues/69 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	d7660d82a0	runtime: unify gopkg.in/yaml.v3 to v3.0.1 The older versions have Denial of Service issues. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	fc9a107e8e	runtime: unify swag and testify dependency So that we don't need to depend on that many versions of them. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	79ebb959c5	runtime: update runc dependency to v1.1.9 To pick up security fixes. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	7f3e8bd65e	runtime: unify golang.org/x/text to v0.7.0 The older versions contain security issues. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:45 +00:00
Peng Tao	df325ae371	runtime: update golang.org/x/net to v0.7.0 To pick up fix for the following issue: A maliciously crafted HTTP/2 stream could cause excessive CPU consumption in the HPACK decoder, sufficient to cause a denial of service from a small number of small requests. Fixes: #8190 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-10 03:56:39 +00:00
David Esparza	bba34910df	metrics: stops kata components and k8s deployment when test finishes This PR adds a trap that runs whenever the script exits; it deletes the iperf k8s deployment and k8s services, and deletes the kata components. This way, when the script finishes, it verifies that there are indeed no kata components still running. Fixes: #8126 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-09 13:41:43 -06:00
Gabriela Cervantes	84e3d884e4	gha: Add general dependencies to stability tests This PR adds the general dependencies to stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Gabriela Cervantes	dec3951ca5	tests: Add soak parallel stability test This PR adds the soak parallel stability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Gabriela Cervantes	0f04d527d9	tests: Enable soak parallel test This PR enables the soak parallel test for stability test. Fixes #8153 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-09 17:02:49 +00:00
Wainer dos Santos Moschetta	e669282c25	ci: k8s: set KUBERNETES default value The KUBERNETES variable is mostly used by kata-deploy to decide whether to apply k3s-specific deployments. It is used to select the type of Kubernetes to be installed (k3s, k0s, rancher, etc.) and it is always set on CI. When running the script locally we want a default value to avoid `KUBERNETES: unbound variable` errors. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	c30c3ff185	tests: run k8s-volume on a given node This test can give a false positive on a multi-node cluster. Changed it to use the new get_one_kata_node() and the modified exec_host() to run the setup commands on a given node (that has kata installed) and ensure the test pod is scheduled on that same node. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	666993da8d	tests: run k8s-file-volume on a given node This test can give a false positive on a multi-node cluster. Changed it to use the new get_one_kata_node() and the modified exec_host() to run the setup commands on a given node (that has kata installed) and ensure the test pod is scheduled on that same node. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:08:48 -03:00
Wainer dos Santos Moschetta	3a00fc9101	tests: exec_host() now gets the node name The exec_host() simply fails on multi-node clusters because `kubectl get node -o name` will return a list of names. Moreover, it will return the names of control nodes, which usually don't have kata installed. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	61c9c17bff	tests: add get_one_kata_node() to tests_common.sh The introduced get_one_kata_node() returns the first node that has the kata-runtime=true label, i.e., supposedly a node with kata installed. This is useful for tests that should run on a specific worker node in a multi-node cluster. Fixes #7619 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	68f083c4d0	ci: k8s: set KATA_HYPERVISOR default value Let KATA_HYPERVISOR be qemu by default in gha-run.sh as this variable is required to tweak some configurations of kata-deploy. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	6677a61fe4	ci: k8s: configurable deploy kata timeout The deploy-kata() of gha-run.sh will wait for 10 minutes for the kata deploy installation to finish. This allows users of the script to overwrite that value by exporting the KATA_DEPLOY_WAIT_TIMEOUT environment variable. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	200e542921	ci: k8s: shellcheck fixes to gha-run.sh Fixed a couple of warnings shellcheck emitted and disabled others: * SC2154 (var is referenced but not assigned) * SC2086 (Double quote to prevent globbing and word splitting) Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	4af78be13a	kata-deploy: re-format kata-[deploy\|cleanup].yaml The tests/integration/kubernetes/gha-run.sh script runs `yq write` a couple of times to edit kata-[deploy\|cleanup].yaml, resulting in the files being formatted again. This is annoying because it leaves the git tree dirty. Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	d54e6d9cda	ci: k8s: run_tests() for kcli The only difference to the other platforms is that it needs to export KUBECONFIG. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	c2ef1f0fb0	ci: k8s: add deploy-kata-kcli() to gha-run.sh The deploy-kata-kcli() behaves like the other deploy kata functions for bare-metal (e.g. sev, tdx, etc.) except that KUBECONFIG should be exported. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	d2be8eef1a	ci: k8s: add cleanup-kcli() to gha-run.sh The cleanup-kcli() behaves like other clean up for bare-metal (e.g. sev, tdx...etc) except that KUBECONFIG should be exported. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	cbb9aa15b6	ci: k8s: set default image for deploy_kata() On CI workflows the variables DOCKER_REGISTRY, DOCKER_REPO and DOCKER_TAG are exported to match the built image. However, when running the script outside of CI context, a developer might just use the latest image which in this case will be `quay.io/kata-containers/kata-deploy-ci:kata-containers-latest`. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Wainer dos Santos Moschetta	89bef7d036	ci: k8s: create k8s clusters with kcli Adapted the gha-run.sh script to create a Kubernetes cluster locally using the kcli tool. Use `./gha-run.sh create-cluster-kcli` to create it, and `./gha-run.sh delete-cluster-kcli` to delete. Fixes #7620 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>	2023-10-09 11:05:40 -03:00
Fabiano Fidêncio	1280f85343	Merge pull request #8171 from bergwolf/github/fix-up-gha GHA: fix up referenced yaml exceeding 20 limit problem	2023-10-09 09:37:03 +02:00
Peng Tao	954d40cce5	gha: combine coco jobs into a single yaml So that we don't risk exceeding the GHA limit of 20 referenced yaml files that easily. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-08 14:22:01 +00:00
Peng Tao	b60e0a9b57	gha: combine basic amd64 jobs into a single yaml GHA has an undocumented limitation that there can be at most 20 referenced yamls in a single yaml file. We work around it by combining multiple jobs into a single yaml file. Fixes: #8161 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-10-08 13:55:01 +00:00
Fabiano Fidêncio	108db0a721	Merge pull request #8162 from sprt/sprt/unbreak-ci gha: ci: Revert tracing test PR to unbreak CI	2023-10-08 10:13:46 +02:00
Aurélien Bombo	e9bd852113	gha: ci: Revert tracing test PR to unbreak CI Revert "Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests" This unbreaks CI as seen in https://github.com/kata-containers/kata-containers/actions/runs/6434757133 Fixes: #8161 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-10-06 14:13:17 -07:00
James O. D. Hunt	16fe81f27c	Merge pull request #8124 from jodh-intel/ch-enable-feature runtime-rs: ch: Enable feature	2023-10-06 13:02:08 +01:00
Fabiano Fidêncio	fa6786d1d7	Merge pull request #8117 from fidencio/topic/ci-add-runk-tests gha: ci: Port runk tests over	2023-10-06 11:19:55 +02:00
Fabiano Fidêncio	8fec654716	Merge pull request #8115 from fidencio/topic/ci-add-tracing-tests ci: gha: Port tracing tests over	2023-10-06 10:06:57 +02:00
GabyCT	265f53e594	Merge pull request #8082 from dborquez/enable_fio_on_ctr Enable fio test using containerd client	2023-10-05 17:26:22 -06:00
GabyCT	c8b9ec1cb5	Merge pull request #8108 from GabyCT/topic/ghastability gha: Add stability tests workflow for gha	2023-10-05 17:10:10 -06:00
James O. D. Hunt	b8a46a4b85	runtime-rs: ch: Enable feature Enable the Cloud Hypervisor driver (the `cloud-hypervisor` build feature) for the rust runtime. Fixes: #6264. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-10-05 17:58:39 +01:00
Gabriela Cervantes	0f2dc8c675	gha: Add containerd stability tests to ci yaml This PR adds containerd stability tests to ci yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-05 15:21:24 +00:00
Fabiano Fidêncio	89f73e658d	Merge pull request #8110 from fidencio/topic/gha-be-more-specific-about-the-arm-runners gha: arm64: Ensure the builder is arm64-builder	2023-10-04 21:20:08 +02:00
Fabiano Fidêncio	da91c9df88	ci: Port runk tests to this repo I'm basically moving the runk tests from the tests repo to this one, and I'm adding the "Signed-off-by:" of every single contributor to the tests. Fixes: #8116 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Chen Yiyang <cyyzero@qq.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 20:41:29 +02:00
Fabiano Fidêncio	7f23772763	ci: Add placeholder for runk tests The runk test has been executed as part of the former "ubuntu" jenkins CI. We're porting it to GHA and running it against LTS containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 20:40:32 +02:00
Fabiano Fidêncio	9205acc3d2	ci: Move tracing tests here I'm basically moving the tracing tests from the tests repo to this one, and I'm adding the "Signed-off-by:" of every single contributor to the tests. Fixes: #8114 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com> Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com> Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>	2023-10-04 20:02:27 +02:00
Gabriela Cervantes	85d290a048	gha: Add stability gha run script This PR adds the stability gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 17:45:45 +00:00
Gabriela Cervantes	54f0c8f88e	gha: Add stability tests workflow for gha This PR adds the stability test workflow for gha for the kata CI. Fixes #8107 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-04 16:32:13 +00:00
Fabiano Fidêncio	3bb2923e5d	ci: Add placeholder for tracing tests The tracing tests are currently running as part of the Jenkins CI with the following setups: * Container Engines: containerd * VMMs: QEMU \| Cloud Hypervisor * Snapshotters: overlayfs \| devmapper We'll be restricting those tests to run on the LTS version of containerd, without devmapper. As is known, due to our GHA limitation, this is just a placeholder and the tests will actually be added in the next iterations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 18:02:02 +02:00
Fabiano Fidêncio	2c3bf406dc	ci: Create a function to install docker This will be re-used in other tests as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 15:01:51 +02:00
Fabiano Fidêncio	c2cce12de5	Merge pull request #8100 from fidencio/topic/kata-deploy-build-agent kata-deploy: Build kata-agent as we build all the other components	2023-10-04 11:56:03 +02:00
Steve Horsman	c430cc3707	Merge pull request #8098 from stevenhorsman/k8s-registry-suite versions: migrate out of k8s.gcr.io	2023-10-04 10:51:39 +01:00
Fabiano Fidêncio	119f03de26	gha: arm64: Ensure the builder is arm64-builder Otherwise we'll use any arm64 machine that's added as a runner, and whenever new machines are added those may end up being only used for running some specific set of the tests. Fixes: #8109 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-04 11:08:11 +02:00
Fabiano Fidêncio	59b9380d1c	Merge pull request #8093 from stevenhorsman/crictl-pod-config-update doc: Update crictl pod-config	2023-10-04 10:49:04 +02:00
David Esparza	8c498ef5ee	metrics: Use jq tool to pretty-print json metrics output This PR enables the use of jq pretty-print feature to improve the formatting of metric results json files. Fixes: #8081 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-03 23:33:19 -06:00
David Esparza	a2159a6361	metrics: Enables FIO test for kata containers FIO benchmark is enabled to measure IO in Kata at different latencies using containerd client, in order to complement the CI metrics testing set. This PR also deprecates the previous Fio bench based on k8s. Fixes: #8080 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-10-03 23:32:38 -06:00
Fabiano Fidêncio	f337315952	Merge pull request #8106 from fidencio/topic/gha-fix-k0s-related-cis gha: Fix k0s deployment	2023-10-03 21:47:40 +02:00
GabyCT	d1d9af5de2	Merge pull request #8085 from GabyCT/topic/stabilitytests tests: Add stability test for kata CI	2023-10-03 11:28:49 -06:00
Fabiano Fidêncio	70e7ec3e23	gha: Fix k0s deployment The tests are failing when setting up k0s, and that happens because we download a kubectl binary matching the kubernetes version k0s is using, and we do that by: ``` sudo k0s kubectl version --short 2>/dev/null \| ... ``` With kubectl 1.28, which is now the default on k0s, `kubectl version --short` has been removed, leaving us with an empty string, which then causes the error in the CI. Fixes: #8105 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 17:21:40 +02:00
Fabiano Fidêncio	560bbffb57	packaging: tools: Remove `set -x` leftover This was used for debugging, and ended up being merged with that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	18fa483d90	packaging: release: Mention newly added images We've added two new containerd builder images recently, one for the components under `src/tools` and another one for the Kata Containers agent. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	ca3b888371	packaging: tools: Fix container image env var name This should be TOOLS_CONTAINER_BUILDER instead of VIRTIOFSD_CONTAINER_BUILDER. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	5ca66795c7	packaging: Allow passing the TOOLS_CONTAINER_BUILDER This follows what we've been doing for all the components we're building, but was missed as part of #8077. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	02acef9575	gha: Build the kata-agent as part of our workflows The kata-agent binary won't be released, just built so it can be used, later on, as part of our tests and as part of the rootfs build. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	5208386ab1	packaging: Build the kata-agent Let's add the needed functions to start building the kata-agent, with or without the OPA support. For now this build is not used as part of the rootfs build, but later on this will (not as part of this series, though). Fixes: #8099 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 15:33:55 +02:00
Fabiano Fidêncio	1727487eef	agent: Allow specifying DESTDIR and AGENT_POLICY via env vars This will help to build the agent binary as part of the kata-deploy localbuild, as we need to pass the DESTDIR to where the agent will be installed, and also whether we're building the agent with policy support enabled or not. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 14:18:45 +02:00
Fabiano Fidêncio	45c1188839	packaging: Add get_agent_image_name() This will be used for building the kata-agent. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-10-03 14:17:38 +02:00
Wainer dos Santos Moschetta	0db8fb8f98	versions: migrate out of k8s.gcr.io The k8s.gcr.io is deprecated for a while now and has been redirected to registry.k8s.io. However on some bare-metal machines in our testing pools that redirection is not working, so let's just replace the registries. Fixes #8098 Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> (cherry picked from commit b2c3bca558c38deff2117d5909d9071c23c05590)	2023-10-03 11:52:59 +01:00
stevenhorsman	a1a0543671	doc: Fix spelling Spell check failed with: ``` [kata-spell-check.sh:275] WARNING: Word 'overcommitment': did you mean one of the following?: over commitment, over-commitment, commitment ``` So update this to pass the static checks Fixes: # Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-10-03 10:17:38 +01:00
Gabriela Cervantes	6339605a14	tests: Add general stability fixes This PR adds general stability fixes. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-10-02 19:42:46 +00:00
stevenhorsman	59ae244442	doc: Update crictl pod-config - Ensure that our documented crictl pod config file contents have uid and namespace fields for compatibility with crictl 1.24+ This avoids a user potentially hitting the error: ``` getting sandbox status of pod "d3af2db414ce8": metadata.Name, metadata.Namespace or metadata.Uid is not in metadata "&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}" getting sandbox status of pod "-A": rpc error: code = NotFound desc = an error occurred when try to find sandbox: not found ``` Fixes: #8092 Signed-off-by: stevenhorsman <steven@uk.ibm.com> (cherry picked from commit `8f8c2215`)	2023-10-02 14:53:46 +01:00
Gabriela Cervantes	fd19f4082f	tests: Add agent stability test This PR adds the agent stability test to stability test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 22:37:02 +00:00
Gabriela Cervantes	215577032f	tests: Add cassandra stress in stability tests This PR adds the cassandra stress at the stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 22:34:45 +00:00
GabyCT	a890ad3a16	Merge pull request #8066 from GabyCT/topic/urlvra docs: Update url in kata vra document	2023-09-28 14:59:34 -06:00
Zvonko Kaiser	79e33c211c	Merge pull request #7325 from zvonkok/vfio-sandbox-id-debug gpu: Adding CDI support for cold and hot-plug of VFIO devices	2023-09-28 21:31:12 +02:00
Gabriela Cervantes	f2d3ea988d	tests: Add stressng dockerfile for stability tests This PR adds the stressng dockerfile for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:35:22 +00:00
Gabriela Cervantes	6493aa309e	tests: Add stressor CPU test for stability tests This PR adds the stressor CPU test for stability tests. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:33:08 +00:00
Gabriela Cervantes	ef68a3a36b	metrics: Add stability test for kata CI This PR adds the stability test for kata containers repository. Fixes #8084 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-28 16:23:36 +00:00
David Esparza	f7ef45b167	Merge pull request #8077 from fidencio/topic/kata-deploy-ship-the-tools kata-deploy: build & ship the rust components from src/tools/	2023-09-28 09:59:19 -06:00
Zvonko Kaiser	7c934dc7da	gpu: Fix cold-plug of VFIO devices We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI, the de-facto standard for enabling devices in containers. containerd will pass through annotations for accumulated CPU, memory, and now CDI devices. With that information sandbox sizing can be derived correctly. Fixes: #7331 Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>	2023-09-28 09:49:13 +00:00
GabyCT	fcc755fc3b	Merge pull request #8068 from GabyCT/topic/limitlatency metrics: Add latency value limits for kata CI	2023-09-27 13:28:41 -06:00
Greg Kurz	defbb64ac8	Merge pull request #8036 from rye-stripe/bugfix/overhead-metrics runtime: fix reading cgroup stats of sandboxes	2023-09-27 19:39:55 +02:00
Archana Shinde	95455e6fe8	Merge pull request #8058 from likebreath/0925/clh_v35.0 Upgrade to Cloud Hypervisor v35.0	2023-09-27 10:39:32 -07:00
Gabriela Cervantes	8d66ef5185	metrics: Increase qemu jitter value This PR increases qemu jitter value. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-27 17:31:07 +00:00
Gabriela Cervantes	5600e28b54	metrics: Increase jitter value for clh This PR increases jitter value for clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-27 17:30:19 +00:00
Fabiano Fidêncio	a6b1f5e21b	ci: Build src/tools components as part of our tests / releases Build those as part of our CI and release workflows. Fixes #5520 #5348 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:50:25 +02:00
Fabiano Fidêncio	501a168a81	kata-deploy: Build components from src/tools Let's add targets and actually enable users and ourselves to build those components in the same way we build the rest of the project. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:49:02 +02:00
Fabiano Fidêncio	6ef42db5ec	static-build: Add scripts to build content from src/tools As we'd like to ship the content from src/tools, we need to build them in the very same way we build the other components, and the first step is providing scripts that can build those inside a container. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:56 +02:00
Fabiano Fidêncio	4d08ec29bc	packaging: Add get_tools_image_name() This will be used for building all the (rust) components from src/tools. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:35 +02:00
Fabiano Fidêncio	98097c96de	packaging: Use git abbreviated hash This will make it easier to build images that rely on several directories hashes. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 18:48:30 +02:00
Fabiano Fidêncio	8b25e90027	Merge pull request #8075 from fidencio/topic/ci-add-kata-monitor-tests ci: Port kata-monitor tests from Jenkins to GHA	2023-09-27 15:48:46 +02:00
Fabiano Fidêncio	489caf1ad0	ci: kata-monitor: Move tests over Let's move, adapt, and use the kata-monitor tests from the tests repo. In this PR I'm keeping the SoB from every single contributor who touched those tests in the past. Fixes: #8074 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com> Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-27 11:40:31 +02:00
Fabiano Fidêncio	a3fb067f1b	ci: Add placeholder for kata-monitor tests The kata-monitor tests are currently running as part of the Jenkins CI with the following setups: * Container Engines: CRI-O | containerd * VMMs: QEMU When using containerd, we're testing it with: * Snapshotter: overlayfs | devmapper We will stop running those tests on devmapper / overlayfs as that would hardly get us a functionality issue. Also, we're restricting this to run with the LTS version of containerd, when containerd is used. As it's known due to our GHA limitation, this is just a placeholder and the tests will actually be added in the next iterations. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:31:17 +02:00
Fabiano Fidêncio	57cb4ce204	ci: Make install_kata aware of container engines This will help us when running tests using CRI-O. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:31:17 +02:00
Fabiano Fidêncio	de1eeee334	ci: Create a generic install_crio function This will serve us quite well in the upcoming test additions, which will also have to be executed using CRI-O. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:26:13 +02:00
Fabiano Fidêncio	64a2000859	ci: Add install_cni_plugins helper This will become handy when doing tests with CRI-O, as CRI-O doesn't install the CNI plugins for us. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:26:13 +02:00
Fabiano Fidêncio	8132fe15c9	ci: Modify containerd default config Let's ensure we have runc running with `SystemdCgroups = false`, otherwise we'll face failures when running tests depending on runc on Ubuntu 22.04, with LTS containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-27 11:16:12 +02:00
Chelsea Mafrica	a49bc68374	runtime-rs: Update status for pause and resume Pause and resume task do not currently update the status of the container to paused or running, so fix this. This is specifically for pausing the task and not the VM. Fixes #6434 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-09-26 17:22:47 -07:00
Gabriela Cervantes	8cb7df1bed	metrics: Add checkmetrics for latency test This PR adds the checkmetrics for latency test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 19:11:08 +00:00
Gabriela Cervantes	e90440ae24	metrics: Add qemu latency value limit This PR adds the qemu latency value limit for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 17:30:09 +00:00
Gabriela Cervantes	a74a8f8a9d	metrics: Add latency value limits for kata CI This PR adds latency value limits for kata CI. Fixes #8067 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 17:29:07 +00:00
Gabriela Cervantes	d7def8317a	metrics: Fix general check static warnings This PR fixes general check static warnings. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 16:30:59 +00:00
GabyCT	309103169d	Merge pull request #8056 from GabyCT/topic/fixlatencypath metrics: Fix latency yamls path	2023-09-26 10:16:55 -06:00
Gabriela Cervantes	928553d1ba	docs: Update url in kata vra document This PR updates the url in kata vra document. Fixes #8065 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-26 16:13:12 +00:00
GabyCT	5c0afaacf4	Merge pull request #8018 from GabyCT/topic/fixreadme metrics: Fix metrics README	2023-09-26 09:51:47 -06:00
David Esparza	83326f89b3	Merge pull request #8054 from GabyCT/topic/fixcrdoc metrics: Fix C-Ray documentation	2023-09-26 09:50:19 -06:00
James O. D. Hunt	31478b9c33	Merge pull request #7944 from jodh-intel/runtime-rs-ch-enable-tdx runtime-rs: ch: Enable Intel TDX	2023-09-26 14:11:12 +01:00
James O. D. Hunt	b0a3293d53	runtime-rs: ch: Enable Intel TDX Allow Cloud Hypervisor to create a confidential guest (a TD or "Trust Domain") rather than a VM (Virtual Machine) on Intel systems that provide TDX functionality. > Notes: > > - At least currently, when built with the `tdx` feature, Cloud Hypervisor > cannot create a standard VM on a TDX capable system: it can only create > a TD. This implies that on TDX capable systems, the Kata Configuration > option `confidential_guest=` must be set to `true`. If it is not, Kata > will detect this and display the following error: > > ``` > TDX guest protection available and must be used with Cloud Hypervisor (set 'confidential_guest=true') > ``` > > - This change expands the scope of the protection code, changing > Intel TDX specific booleans to more generic "available guest protection" > code that could be "none" or "TDX", or some other form of guest > protection. Fixes: #6448. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 10:55:25 +01:00
James O. D. Hunt	523399c329	runtime-rs: ch: Add more consts Introduce a few new constants (for PCI segment count and FS queues) and move the disk queue constants to `convert.rs` to allow them to be used there too. > Note: > > This change gives the `ShareFs` code its own set of values rather > than relying on the disk queue constants. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	dea8065811	runtime-rs: ch: Remove unused function Delete the `handle_pending_devices_after_boot()` function which is no longer required. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	995f2c015f	runtime-rs: ch: Only handle particular pending device types Modify the Cloud Hypervisor `add_device()` method to add `ShareFs` and `Network` devices to the list of pending devices since only these two device types need to be cached before VM startup. Full details in the comments. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
James O. D. Hunt	b1b96a5c49	runtime-rs: ch: Remove erroneous "virtio-blk-mmio" check Remove the `VIRTIO_BLK_MMIO` check which appears to have been added erroneously in the first place. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-26 08:41:32 +01:00
Gabriela Cervantes	9ac29b8d38	metrics: Add init_env function to latency test This PR adds the init_env function to latency test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 22:06:00 +00:00
Bo Chen	dfd0c9fa9a	runtime: clh: Re-generate the client code This patch re-generates the client code for Cloud Hypervisor v35.0. Note: The client code of cloud-hypervisor's OpenAPI is automatically generated by openapi-generator. Fixes: #8057 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-09-25 12:22:37 -07:00
Bo Chen	8f9f087e35	versions: Upgrade to Cloud Hypervisor v35.0 Details of this release can be found in our roadmap project as iteration v35.0: https://github.com/orgs/cloud-hypervisor/projects/6. Fixes: #8057 Signed-off-by: Bo Chen <chen.bo@intel.com>	2023-09-25 12:22:01 -07:00
Fabiano Fidêncio	a4daa86535	Merge pull request #8028 from fidencio/topic/ci-test-with-crio-part-2 ci: k8s: crio: Follow up patches to have CRI-O also working as part of our CI	2023-09-25 18:40:42 +02:00
Gabriela Cervantes	81c8babca9	metrics: Fix latency yamls path This PR fixes the latency yamls path for the latency test for kata metrics. Fixes #8055 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 15:52:24 +00:00
Gabriela Cervantes	4815736820	metrics: Fix C-Ray documentation This PR fixes the C-Ray documentation for kata metrics. Fixes #8052 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-25 15:27:58 +00:00
Fabiano Fidêncio	ef63d67c41	ci: crio: Trail '\r' from exec_host() output We've faced this as part of the CI, only happening with the CRI-O tests: ``` not ok 1 Test readonly volume for pods # (from function `exec_host' in file tests_common.sh, line 51, # in test file k8s-file-volume.bats, line 25) # `exec_host "echo "$file_body" > $tmp_file"' failed with status 127 # [bats-exec-test:38] INFO: k8s configured to use runtimeclass # bash: line 1: $'\r': command not found # # Error from server (NotFound): pods "test-file-volume" not found ``` I must say I didn't dig into figuring out why this is happening, but we may be safe enough to just trail the '\r', as long as all the tests keep passing on containerd. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 16:42:18 +02:00
Fabiano Fidêncio	74c12b2927	ci: crio: Enable default capabilities We need the default capabilities to be enabled, especially `SYS_CHROOT`, in order to have tests accessing the host to pass. A huge thanks to Greg Kurz for spotting this and suggesting the fix. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-25 14:56:15 +02:00
Fabiano Fidêncio	358dc2f569	kata-deploy: Fix CRI-O detection Some of the "k8s distros" allow using CRI-O in a non-official way, and if that's done we cannot simply assume they're on containerd, otherwise kata-deploy will simply not work. In order to avoid such an issue, let's check for `cri-o` as the container engine first and only proceed with the checks for the "k8s distros" after we rule out that CRI-O is being used. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 14:56:15 +02:00
Fabiano Fidêncio	ebaa4fa4c1	ci: crio: Pass `-y` to apt That was something overlooked during my tests. :-/ Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-25 14:56:15 +02:00
GabyCT	11cf0e2d28	Merge pull request #8038 from GabyCT/topic/latency metrics: Enable latency test in gha run script	2023-09-22 16:57:53 -06:00
GabyCT	3ef57b335e	Merge pull request #8045 from jepio/fix-docker-ownership local-build: Fix .docker ownership before build-payload	2023-09-22 14:43:38 -06:00
Archana Shinde	9bb9a3e7a4	Merge pull request #7966 from amshinde/runtime-rs-network-clh runtime-rs: Add network support for cloud-hypervisor	2023-09-22 13:08:09 -07:00
Gabriela Cervantes	97e73b2234	metrics: Fix spelling warnings This PR fixes general spelling warnings detected by the spelling check. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-22 15:50:51 +00:00
Gabriela Cervantes	36c8cd6f1f	metrics: Fix metrics README This PR fixes the network metrics section at the README by leaving the current tests that we have in our kata metrics. Fixes #8017 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-22 15:28:58 +00:00
Fabiano Fidêncio	c5a5a0c95e	Merge pull request #8012 from arronwy/strip osbuild: Reduce guest components binary size with strip	2023-09-22 15:45:38 +02:00
Fabiano Fidêncio	9d190f2390	Merge pull request #8042 from GabyCT/topic/pandoc gha: Add pandoc as a dependency for static checks	2023-09-22 15:31:18 +02:00
Jeremi Piotrowski	15425a2b80	local-build: Fix .docker ownership before build-payload The permissions on .docker/buildx/activity/default are regularly broken by us passing docker.sock + $HOME/.docker to a container running as root and then using buildx inside. Fixup ownership before executing docker commands. Fixes: #8027 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-22 13:44:53 +02:00
Jeremi Piotrowski	a5338e885e	Merge pull request #8030 from portersrc/8027-ci-rootfs-image-build-asset-is-failing-oras ci: rootfs-image build-asset is failing	2023-09-22 11:07:50 +02:00
Chao Wu	6f98fbafde	Merge pull request #6706 from guixiongwei/feat/thp feat(runtime-rs): introduce huge page mode to select VM RAM's backend	2023-09-22 15:27:06 +08:00
Gabriela Cervantes	13ca7d9f97	gha: Add pandoc as a dependency for static checks To avoid the failure of not finding pandoc command this PR adds that package as a dependency for static checks. Fixes #8041 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 20:14:41 +00:00
Jeremi Piotrowski	28dd5ae91e	Merge pull request #7799 from UiPath/clh-directio-support clh: Direct IO support for block devices	2023-09-21 19:16:08 +02:00
David Esparza	6de9f39895	Merge pull request #8020 from GabyCT/topic/fixhunspell gha: Install hunspell for static checks	2023-09-21 10:58:40 -06:00
Gabriela Cervantes	08bc8e4db4	metrics: Add latency benchmark for gha This PR adds the latency benchmark for gha for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 16:14:39 +00:00
Gabriela Cervantes	6776b55d7e	metrics: Enable latency test in gha run script This PR enables the latency test for gha run script for kata metrics. Fixes #8037 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-21 16:11:58 +00:00
Peteris Rudzusiks	94e2ccc2d5	runtime: fix reading cgroup stats of sandboxes The cgroup stats come from resourcecontrol package in the form of pointers to structs. The sandbox Stat() method incorrectly was expecting structs. This caused the cpu and memory stats to always be 0, which in turn caused incorrect pod overhead metrics. Fixes #8035 Signed-off-by: Peteris Rudzusiks <rye@stripe.com>	2023-09-21 17:00:53 +02:00
Alexandru Matei	d507d189bb	fc: Add support for noflush cache option Firecracker supports noflush semantic via Unsafe cache type. There is no support for direct i/o, remove it from config file Fixes: #7823 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-21 14:48:24 +03:00
Alexandru Matei	2ca781518a	clh: Direct IO support for block devices Clh supports direct i/o for disks. It doesn't offer any support for noflush, so remove passing of that option to cloud-hypervisor internal config. Fixes: #7798 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-21 14:48:24 +03:00
Fabiano Fidêncio	dd27912f31	Merge pull request #8032 from fidencio/topic/ci-make-push-after-build-be-trigger-by-workflow-dispatch ci: Trigger payload-after-push on workflow_dispatch	2023-09-21 10:25:24 +02:00
Fabiano Fidêncio	0c95697cc4	ci: Trigger payload-after-push on workflow_dispatch This will allow us to easily test failures and fixes on that workflows. Fixes: #8031 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-21 09:24:13 +02:00
Chris Porter	28cbc3b51c	ci: rootfs-image build-asset is failing Fixes: #8027 Signed-off-by: Chris Porter <porter@ibm.com>	2023-09-21 00:58:42 -05:00
Fabiano Fidêncio	21f6f9a173	Merge pull request #8016 from fidencio/topic/ci-test-with-crio-part-1 ci: Actually enable the CRI-O tests	2023-09-21 07:42:27 +02:00
Wainer Moschetta	87e64a07ed	Merge pull request #7979 from beraldoleal/gogo-removal protocol: remove gogoprotobuff tests	2023-09-20 22:38:10 -03:00
Gabriela Cervantes	87a8616488	gha: Install hunspell for static checks It seems the static checks are failing because the hunspell package is missing; this PR fixes that. Fixes #8019 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-20 16:58:10 +00:00
Fabiano Fidêncio	8c3c50ca8a	ci: Actually enable the CRI-O tests The test has been added to the repo, but we have to also add it to the list of jobs to be executed. Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 18:01:25 +02:00
David Esparza	03554c799a	Merge pull request #8006 from fidencio/topic/ci-test-with-crio-part-0 ci: k8s: Also run tests with CRI-O	2023-09-20 07:45:17 -06:00
Fabiano Fidêncio	c6a9e50c37	Merge pull request #8004 from microsoft/danmihai1/quoted-spaces runtime: support kernel params including spaces	2023-09-20 12:10:51 +02:00
Wang, Arron	3a6510ad61	osbuild: Reduce guest components binary size with strip opa_linux_amd64_static 38M => 27M kata-agent 30M => 23M ls -alh opa_linux_amd64_static -rw-rw-r-- 1 arron arron 38M Jul 28 01:59 opa_linux_amd64_static ➜ kata-containers git:(main) ✗ strip opa_linux_amd64_static ➜ kata-containers git:(main) ✗ ls -alh opa_linux_amd64_static -rw-rw-r-- 1 arron arron 27M Sep 20 16:12 opa_linux_amd64_static ls -alh ./usr/bin/kata-agent -rwxr-xr-x. 1 root root 30M Jul 30 23:41 ./usr/bin/kata-agent ls -alh ./usr/bin/kata-agent -rwxr-xr-x. 1 root root 23M Sep 20 16:13 ./usr/bin/kata-agent Fixes: #8011 Signed-off-by: Wang, Arron <arron.wang@intel.com>	2023-09-20 16:23:17 +08:00
Fabiano Fidêncio	07a6e63a6b	ci: k8s: rke2: Use sudo to call systemd Otherwise we'll face the following error: ``` Failed to enable unit: Interactive authentication required. ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 08:48:29 +02:00
Fabiano Fidêncio	03b82e8484	ci: k8s: Add a CRI-O test Let's make sure we'll also be testing k8s using CRI-O. For now, we'll only be running the CRI-O test with QEMU. Once it becomes stable we can expand this to other Hypervisors as well. Fixes: #8005 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	d7105cf7a4	ci: k8s: Add a method to install CRI-O This is based on official CRI-O documentations[0] and right now we're making this specific to Ubuntu as that's what we have as runners. We may want to expand this in the future, but we're good for now. [0]: https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	54c0a471b1	ci: k8s: k0s: Allow passing parameters to the k0s installer We'll need this in order to setup k0s with a different container engine. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-20 00:59:09 +02:00
Fabiano Fidêncio	31ef64606c	Merge pull request #8007 from fidencio/topic/ci-kata-deploy-fix-garm-runner-name ci: kata-deploy: Fix runner name	2023-09-20 00:58:33 +02:00
Beraldo Leal	730ef51693	deps: updating dependencies Updating dependencies after make check, make test. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-19 16:54:35 -04:00
GabyCT	6111ef6fb6	Merge pull request #7990 from GabyCT/topic/parallelbandwidth metrics: Enable parallel bandwidth iperf limit	2023-09-19 14:52:21 -06:00
Fabiano Fidêncio	3a2c83d69b	ci: kata-deploy: Fix runner name It should be garm-ubuntu-2004-smaller instead of garm-ubuntu-2004-small. Fixes: #7890 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 22:34:37 +02:00
Dan Mihai	82ff2db460	runtime: support kernel params including spaces Support quoted kernel command line parameters that include space characters. Example: dm-mod.create="dm-verity,,,ro,0 736328 verity 1 /dev/vda1 /dev/vda2 4096 4096 92041 0 sha256 f211b9f1921ef726d57a72bf82be23a510076639fa8549ade10f85e214e0ddb4 065c13dfb5b4e0af034685aa5442bddda47b17c182ee44ba55a373835d18a038" Fixes: #8003 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-19 20:26:38 +00:00
Beraldo Leal	604a9dd673	protocol: remove gogoprotobuff tests This is part of a bigger effort to drop gogoprotobuff from our code base. IIUC, those options are basically used by *pb_test.go, and since we are dropping gogoprotobuff and those are auto generated tests, let's just remove it. Fixes #7978. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-19 12:55:42 -04:00
Fabiano Fidêncio	5560e72024	Merge pull request #7896 from fidencio/topic/ground-work-for-testing-all-k8s-flavours-we-support ci: kata-deploy: Enable all k8s flavours that we support	2023-09-19 17:44:34 +02:00
Fabiano Fidêncio	f7fa7f602a	ci: Enable kata-deploy tests for all the supported k8s flavours Let's ensure we test kata-deploy on RKE2 and k0s as well. Fixes: #7890 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	2c908b598c	ci: kata-deploy: Add the ability to deploy rke2 This will be very useful in the near future, when we start testing kata-deploy with rke2 as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	eaf6164916	ci: kata-deploy: Add the ability to deploy k0s This will be very useful in the near future, when we start testing kata-deploy with k0s as well. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	0015257636	ci: kata-deploy: Add deploy-k8s argument to gha-run.sh We'll be using exactly the same code used for the k8s tests, which are already deploying k3s on GARM. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	bf2cb02283	ci: kata-deploy: Expand tests to run on k0s / rke2 We just need to make sure the correct overlay is applied, following what we already have been doing for k3s. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 13:38:10 +02:00
Fabiano Fidêncio	6d5d844e5c	Merge pull request #7983 from sprt/resource-group-naming ci: Create clusters in individual resource groups	2023-09-19 12:54:21 +02:00
Fabiano Fidêncio	b12b9e1886	ci: kata-deploy: Add placeholder for tests on GARM We'll be testing kata-deploy with different kubernetes flavours as part of our GARM tests, and this is a place-holder for this. Once enabled, we'll do nothing, just `return 0`, so we can then properly add the tests after this commit gets merged. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:42:02 +02:00
Fabiano Fidêncio	9e1fb8a966	ci: kata-deploy: Export KUBERNETES env var So we have better control over which flavour of kubernetes kata-deploy is expected to be targeting. This was also done as part of `fa62a4c01b`, for the k8s tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:37:56 +02:00
Fabiano Fidêncio	09cc0ed438	ci: Move deploy_k8s() to gha-run-k8s-common.sh This will allow us to re-use the function in the kata-deploy tests, which will come soon. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 12:37:56 +02:00
Fabiano Fidêncio	1829f5c049	Merge pull request #7992 from skaegi/virtiofsd-1.8.0 versions: Bump virtiofsd to v1.8.0	2023-09-19 11:52:49 +02:00
Fabiano Fidêncio	486fe14c99	ci: Properly set K8S_TEST_UNION Otherwise only the first test will be executed Signed-off-by: Aurélien Bombo <abombo@microsoft.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-19 10:23:58 +02:00
Aurélien Bombo	d9ef1352af	ci: Add first letter of the K8S_TEST_HOST_TYPE to resource group name Ideally we'd add the instance_type or the full K8S_TEST_HOST_TYPE but that exceeds the maximum number of characters allowed for the cluster name. With this in mind, let's use the first letter of K8S_TEST_HOST_TYPE instead. Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-09-19 10:23:58 +02:00
Aurélien Bombo	68267a3996	ci: Create clusters in individual resource groups This makes it so that each AKS cluster is created in its own individual resource group, rather than using the "kataCI" resource group for all test clusters. This is to accommodate a tool that we recently introduced in our Azure subscription which automatically deletes resource groups after a set amount of time, in order to keep spending under control. The tool will automatically delete any resource group, unless it has a tag SkipAutoDeleteTill = YYYY-MM-DD. When this tag is present, the resource group will be retained until the specified date. Note that I tagged all current resource groups in our subscription with SkipAutoDeleteTill = 2043-01-01 so that we don't lose any existing resources. Fixes: #7982 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-09-19 10:23:55 +02:00
Fabiano Fidêncio	84c0d59d23	Merge pull request #7985 from fidencio/topic/clh-use-static_sandbox_resource_mgmt-as-default-on-arm clh: arm: Use static_sandbox_resource_mgmt=true	2023-09-19 09:25:34 +02:00
Gabriela Cervantes	9aa8d1c917	metrics: Add parallel bandwidth limit for qemu This PR adds the parallel bandwidth limit for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-18 21:08:54 +00:00
Simon Kaegi	44c7c082d9	versions: Bump virtiofsd to v1.8.0 https://gitlab.com/virtio-fs/virtiofsd/-/releases/v1.8.0 was released two weeks ago. We have fully tested and are using this version. Also bumps toolchain version to match what virtiofsd used. Fixes: #7960 Signed-off-by: Simon Kaegi <simon.kaegi@gmail.com>	2023-09-18 15:21:15 -04:00
Fabiano Fidêncio	5f8e210d3b	Merge pull request #7961 from ChengyuZhu6/update_nydus Bump nydus versions and update nydus tests	2023-09-18 21:02:20 +02:00
Fabiano Fidêncio	c3ee913bf6	Merge pull request #7953 from gkurz/extra-monitor-socket runtime/qemu: Rework QMP/HMP support	2023-09-18 19:04:14 +02:00
Gabriela Cervantes	af59d4bf4a	metrics: Enable parallel bandwidth iperf limit This PR enables the parallel bandwidth iperf limit for kata metrics. Fixes #7989 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-18 16:32:11 +00:00
Fabiano Fidêncio	aba36ab188	nydus: Temporarily skip tests on dragonball We're hitting a specific issue after updating, which will require some work on dragonball before it can be re-added here. The issue: ``` ... 3: failed to do rafs mount\\n 4: fail to attach rafs \\\"/var/lib/containerd-nydus/snapshots/2/fs/image/image.boot\\\"\\n 5: add share fs mount\\n 6: Mount rafs at /rafs/197ef3db03c86b91bf3045ff59183ce8b5750941ad1d3484f4a8301a70f5109f/rootfs_lower error: Failed to Mount backend ... Caused by: vmm action error: FsDevice(AttachBackendFailed(\\\"attach/detach a backend filesystem failed:: missing field `version` at line 1 column 489\\\"))\"): unknown" ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b8a8dfcd15	nydus: Use `kata-${KATA_HYPERVISOR}` instead of `kata` This will ensure we're testing with the correct runtime, instead of using the `default` one. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
ChengyuZhu6	f6df3d6efb	static-build: Fix arch error on nydus build Fix the arch error when downloading the nydus tarball. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com> Signed-off-by: Steven Horsman <steven@uk.ibm.com>	2023-09-18 17:40:06 +02:00
ChengyuZhu6	2f9c9e2e63	tests: nydus: Update nydus tests To support the v0.12.0 nydus-snapshotter, we need to update the config files and the commandline to start nydus-snapshotter. Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	c9a4e7e46d	versions: Bump nydus and nydus-snapshotter to its latest release As we need https://github.com/containerd/nydus-snapshotter/pull/530 in. Fixes #7984 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: ChengyuZhu6 <chengyu.zhu@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b73bde320d	gha: nydus: Populate run() And with this we finally enable the nydus tests to run as part of our GHA CI. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b3904a1a30	gha: nydus: Populate install_dependencies() Let's have all the dependencies needed for running the nydus tests installed. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	d2b3b67f5d	gha: nydus: Actually install kata when `install-kata` is called We've been simply doing nothing whenever `install-kata` was called, and that was the intent when we added the placeholder calls. Now, let's install kata, as expected. :-) Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	0ec00ad42e	gha: nydus: Get rid of nydus{,-snapshotter} install from nydus_test.sh As we've added install_nydus() and install_nydus_snapshotter(), which do conform with the pattern we're following on GHA, let's rely on them rather than relying on the bits coming from nydus_test.sh. Later on we'll have install_nydus() and install_nydus_snapshotter() as part of the dependencies install in our `gha-run.sh`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	568439c77b	tests: nydus: Add timeout to the crictl calls Similarly to what's been done for the cri-containerd tests, as part of `84dd02e0f9`, we need to add the timeout here for the crictl calls. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	5ac3b76eb1	tests: nydus: Add uid / namespace to the nydus container / sandbox Otherwise we may face errors like: ``` getting sandbox status of pod "d3af2db414ce8": metadata.Name, metadata.Namespace or metadata.Uid is not in metadata "&PodSandboxMetadata{Name:nydus-sandbox,Uid:,Namespace:default,Attempt:1,}" getting sandbox status of pod "-A": rpc error: code = NotFound desc = an error occurred when try to find sandbox: not found ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	376574a16c	tests: nydus: Decorate some calls with `sudo` Otherwise we cannot properly start the nydus snapshotter, nor properly kill it after it's been started. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	4290fd4b67	tests: nydus: Adapt "source ..." to GHA The "source ..." we've been doing was not changed since those tests were part of the Jenkins tests, and we need to adapt them, either setting the correct path or entirely removing the ones that are not relevant to us anymore. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	a84efa3e87	tests: nydus: Adapt check to "clh" instead of "cloud-hypervisor" As that's what we've been using as part of the GHA. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	56a14b3950	tests: common: Add install_nydus_snapshotter() This function will be used to download and install the nydus-snapshotter, and it follows the same pattern we have already introduced for downloading and installing other dependencies from GitHub. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	b6563783e2	tests: common: Add install_nydus() This function will be used to download and install nydus, and it follows the same pattern we have already introduced for downloading and installing other dependencies from GitHub. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 17:40:06 +02:00
Fabiano Fidêncio	72599f1911	clh: arm: Use static_sandbox_resource_mgmt=true Users have noticed that this is needed, as CLH does not yet implement a way to hotplug resources on aarch64. With this patch, when building for x86_64, I can see that this is the resulting config: ``` $ ARCH=amd64 make ... $ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt static_sandbox_resource_mgmt=false ``` And when building for aarch64: ``` $ ARCH=arm64 make ... $ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt static_sandbox_resource_mgmt=true ``` Fixes: #7941 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-18 14:14:10 +02:00
Jeremi Piotrowski	dfa6af54df	Merge pull request #7806 from jongwu/clh_serial clh:arm64: use arm AMBA UART for hypervisor debug	2023-09-18 12:29:07 +02:00
Greg Kurz	1f16b6627b	runtime/qemu: Rework QMP/HMP support PR #6146 added the possibility to control QEMU with an extra HMP socket as an aid for debugging. This is great for development or bug chasing but this raises some concerns in production. The HMP monitor allows tampering with the VM state in a variety of ways. This could be intentionally or mistakenly used to inject subtle bugs in the VM that would be extremely hard if not even impossible to debug. We definitely don't want that to be enabled by default. The feature is currently wired to the `enable_debug` setting in the `[hypervisor.qemu]` section of the configuration file. This setting has historically been used to control "debug output" and it is used as such by some downstream users (e.g. Openshift). Forcing people to have the extra HMP backdoor at the same time is abusive and dangerous. A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give fine-grained control over whether the HMP socket is wanted or not. This setting is still gated by `enable_debug = true` to make it clear it is for debug only. The default is to not have the HMP socket though. This isn't backward compatible with #6416 but it is for the sake of "better safe than sorry". An extra monitor socket makes the QEMU instance untrusted. A warning is thus logged to the journal when one is requested. While here, also allow the user to choose between HMP and QMP for the extra monitor socket. Motivation is that QMP offers way more options to control or introspect the VM than HMP does. Users can also ask for pretty json formatting well suited for human reading. This will improve the debugging experience. This feature is only made visible in the base and GPU configurations of QEMU for now. Fixes #7952 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-18 12:13:01 +02:00
Greg Kurz	cab46c9e23	Merge pull request #7973 from fidencio/topic/ci-use-bigger-machine-sizes-for-the-needed-tests-part-0 ci: Use variable size of VMs depending on the tests running	2023-09-18 12:06:44 +02:00
Fabiano Fidêncio	0e3bfac3b3	Merge pull request #7976 from fidencio/topic/ci-static-checks-rework-part-0 ci: Rework static checks	2023-09-18 11:01:18 +02:00
Peng Tao	6eedd9b0b9	Merge pull request #7738 from Xuanqing-Shi/7732/handle-non-empty-endpoints-in-RemoveEndpoints runtime: incorrect handling of non-empty []Endpoint parameter in Remo…	2023-09-18 10:58:28 +08:00
Fabiano Fidêncio	8b1e9b0c75	ci: static-checks: Clean up static-checks job Now that the static-checks job only takes care of running the static-checks, let's clean it up, remove all the unneeded steps, make sure that we're using the actions in their latest version, and have it running in a cost free runner. At some point I'd like to see those tests done in parallel, in the same way that I've organised the build-checks, but that's something for someone else, at some other time. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 14:23:02 +02:00
Fabiano Fidêncio	2c5ca2eaf8	ci: static-checks: Run tests depending on KVM With this we're removing the dragonball static-checks CI, as the test is running here now. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 14:22:38 +02:00
Fabiano Fidêncio	509c309ab2	ci: static-checks: Move "sudo make test" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` and `make check` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:23 +02:00
Fabiano Fidêncio	4e963cedf4	ci: static-checks: Move "make test" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` and `make check` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:17 +02:00
Fabiano Fidêncio	08f2e5ae0b	runtime-rs: Ensure static-checks-build is a dep of `make test` Otherwise `make test` will simply fail with: ``` error[E0583]: file not found for module `config` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:13 +02:00
Fabiano Fidêncio	2bc3a616ae	kata-ctl: Use `loop` instead of `kvm` module in tests This makes it possible to run the tests in the cost free runners, which are not KVM capable. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:08 +02:00
Fabiano Fidêncio	46daddc500	kata-ctl: Ensure GENERATED_CODE is a dep of `make test` Otherwise `make test` will simply fail with: ``` error[E0583]: file not found for module `version` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:53:01 +02:00
Fabiano Fidêncio	ec826f328f	agent: Ensure GENERATED_CODE is a dep of `make test` Otherwise `make test` will fail with: ``` error[E0583]: file not found for module `version` ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:57 +02:00
Fabiano Fidêncio	1d32410a83	ci: install_libseccomp: Do not depend on the tests repo It makes things way simpler, waaaaay simpler. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:49 +02:00
Fabiano Fidêncio	bf888b9a5e	ci: static-checks: Move "make check" to the new test matrix We're moving it out of the previous "static-checks" confusing matrix, and adding it to the matrix that was currently being used for the `make vendor` checks. This will allow us to have one job per component, and with that we can easily run those in parallel and on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:45 +02:00
Fabiano Fidêncio	473ec87806	kata-ctl: Add `kata-types` to the Cargo.lock file Commit message covered everything. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:40 +02:00
Fabiano Fidêncio	ea19549a99	kata-ctl: Ensure GENERATED_CODE is a dep of `make check` Otherwise `make check` would fail with: ``` Error writing files: failed to resolve mod `version`: /home/runner/work/kata-containers/kata-containers/src/tools/kata-ctl/src/ops/version.rs does not exist make: *** [../../../utils.mk:176: standard_rust_check] Error 1 ``` Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:36 +02:00
Fabiano Fidêncio	e125775863	tests: install_rust: Also install clippy clippy is used as part of our tests, so it's useful to have it installed while we're already installing rust. As for developers, they had better be using it too. :-) Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:31 +02:00
Fabiano Fidêncio	e2c61a152c	ci: static-checks: Move vendor check to its own job Similarly to the static-check jobs, those jobs can be run on the zero cost runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:30 +02:00
Fabiano Fidêncio	6794d4c843	tests: Move install_rust.sh from the tests repo We'll use it as part of the refactoring we're doing in the static check tests. I can see a lot of other uses of this, but changing all of them to this one is out of the scope for this PR. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:29 +02:00
Fabiano Fidêncio	e64508c308	tests: install_go: Remove tests repo dependency We can rely on the functions that are now part of the common.bash. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:28 +02:00
Fabiano Fidêncio	11dff731b7	tests: Move functions from kata_arch script here We can use this a lot as part of our CI, but right now I'm just moving those here with the intent to use later on in this series. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:28 +02:00
Fabiano Fidêncio	75c974c802	ci: static-checks: Move kernel config check to its own job It doesn't make sense to run this for all the bits of the matrix, nor is it demanding enough to require running this in one of our Azure sponsored runners. Fixes: #7974 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 12:52:25 +02:00
Archana Shinde	9c233bb9e0	test: Add test to verify try_from for clh Netconfig Add tests to verify conversion from runtime NetworkConfig to clh specific config. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-09-16 00:24:14 -07:00
Fabiano Fidêncio	c69a1e33bd	ci: Use variable size of VMs depending on the tests running Let me start with a fair warning that this commit is hard to split into different parts that could be easily tested (or not tested, just ignored) without breaking pieces. Now, about the commit itself, as we're on the run to reduce costs related to our sponsorship on Azure, we can split the k8s tests we run in 2 simple groups: * Tests that can be run in the smaller Azure instance (D2s_v5) * Tests that require the normal Azure instance (D4s_v5) With this in mind, we're now passing to the tests which type of host we're using, which allows us to select to run either one of the two types of tests, or even both in case of running the tests on a baremetal system. Fixes: #7972 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-16 09:13:54 +02:00
Archana Shinde	9049d311df	runtime-rs: Add network support for cloud-hypervisor This PR adds support for adding a network device before starting the cloud-hypervisor VM. Support for adding and removing network devices is not really added to the resource manager, so supporting this for cloud-hypervisor is not scoped in this PR. This also changes "pending_devices" for clh implementation from an Option of vector to simply a vector. This simplifies the structure a bit as we can simply iterate over the pending devices instead of having to check for a "Some" value as this is not really required. Fixes: #6333 Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com> Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2023-09-15 23:25:20 -07:00
Greg Kurz	79c494eb4e	Merge pull request #7969 from fidencio/topic/ci-cache-using-oras-part-3 ci: cache: Check the sha256sum of the components & fix ovmf-sev cache usage	2023-09-15 16:30:22 +02:00
Fabiano Fidêncio	eecd5bf2aa	ci: cache: Fix ovmf-sev cache The cached tarball is relying on the component name, thus it's important to set it correctly, otherwise we'll end up always building it. With this patch applied: ``` ≡ ⨯ make ovmf-sev-tarball make ovmf-sev-tarball-build make[1]: Entering directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build//kata-deploy-binaries-in-docker.sh --build=ovmf-sev sha256:67cc94e393dc1d5bfc2b77a77e83c9b1c0833d0fbbebaa9e9e36f938bb841fcc Build kata version 3.2.0-rc0: ovmf-sev INFO: DESTDIR /home/ffidenci/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/destdir Downloading a76f5522493f ovmf-sev-builder-image-version Downloading 7e98c854bd94 kata-static-ovmf-sev.tar.xz Downloading 559311973ff8 ovmf-sev-version Downloaded a76f5522493f ovmf-sev-builder-image-version Downloading 353b655c2297 ovmf-sev-sha256sum Downloaded 559311973ff8 ovmf-sev-version Downloaded 353b655c2297 ovmf-sev-sha256sum Downloaded 7e98c854bd94 kata-static-ovmf-sev.tar.xz Pulled [registry] ghcr.io/kata-containers/cached-artefacts/ovmf-sev:latest-main-x86_64 Digest: sha256:933236c2c79e53be3ca7acc0b966d0ddac9c0335edcb1e8cad8b9bb3aaf508ce kata-static-ovmf-sev.tar.xz: OK INFO: Using cached tarball of ovmf-sev drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/share/ drwxr-xr-x runner/runner 0 2023-09-15 10:34 ./opt/kata/share/ovmf/ -rwxr-xr-x runner/runner 4194304 2023-09-15 10:34 ./opt/kata/share/ovmf/AMDSEV.fd ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build ~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir 
~/src/upstream/kata-containers/kata-containers/tools/packaging/kata-deploy/local-build/build/ovmf-sev/builddir make[1]: Leaving directory '/home/ffidenci/src/upstream/kata-containers/kata-containers' ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 12:39:22 +02:00
Fabiano Fidêncio	86c41074b4	ci: cache: Check the sha256sum of the component We've removed this in the part 2 of this effort, as we were not caching the sha256sum of the component. Now that this part has been merged, let's get back to checking it. Fixes: #7834 -- part 3 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 12:34:30 +02:00
Fabiano Fidêncio	f5e52d02d3	Merge pull request #7964 from fidencio/topic/ci-cache-using-oras-part-2 ci: cache: Use the artefacts stored in ghcr.io/kata-containers/cached-artefacts/${component}	2023-09-15 12:29:28 +02:00
Fabiano Fidêncio	2fe0b494da	Merge pull request #7959 from fidencio/topic/ci-run-on-smaller-garm-instances ci: Run some of the GARM tests in smaller instances	2023-09-15 11:30:13 +02:00
Fabiano Fidêncio	460988c5f7	ci: cache: Remove the script used to cache artefacts on Jenkins That's not needed anymore, as we've switched to using ORAS and an OCI registry to cache the artefacts. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:27:55 +02:00
Fabiano Fidêncio	4533a7a416	ci: cache: Also store the ${component} sha256sum This is something that was done by our Jenkins jobs, but that I ended up missing when writing `d0c257b3a7`. Now, let's also add the sha256sum to the cached artefact, and in a coming up PR (after this one is merged) we will also start checking for that. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:25:26 +02:00
Fabiano Fidêncio	eccc76df63	ci: cache: Use the cached artefacts from ORAS In the previous series related to the artefacts we build, we've switched from storing the artefacts on Jenkins to storing those in the ghcr.io/kata-containers/cached-artefacts/${artefact_name}. Now, let's take advantage of that and actually use the artefacts coming from that "package" (as GitHub calls it). NOTE: One thing that I've noticed that we're missing is storing and checking the sha256sum of the artefact. The storing part will be done in a different commit, and checking the sha256sum will be done in a different PR, as we need to ensure those were pushed to the registry before actually taking the bullet to check for them. Fixes: #7834 -- part 2 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 10:13:47 +02:00
Jeremi Piotrowski	6f30d00ae7	Merge pull request #7956 from fidencio/topic/ci-reduce-the-machine-size-used ci: Reduce the size of the AKS VMs	2023-09-15 08:49:08 +02:00
Steve Horsman	1b8f3fa9ae	Merge pull request #7957 from fidencio/topic/ci-cache-using-oras-part-1 ci: cache: Allow pushing our artefacts to an OCI registry	2023-09-15 07:45:24 +01:00
Jianyong Wu	7f5e77bcb8	kernel: enable Arm pl011 support Enable pl011 (ttyAMA0) support in kernel for aarch64. Fixes: #5080 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-09-15 01:45:16 +00:00
Jianyong Wu	241c355e07	clh:arm64: use arm AMBA uart for hypervisor debug Cloud Hypervisor on arm64 only supports the Arm AMBA UART (pl011) as tty. So, the console should be set to "ttyAMA0" instead of "ttyS0" when enabling hypervisor debug mode. Fixes: #5080 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>	2023-09-15 01:44:23 +00:00
Fabiano Fidêncio	094b6b2cf8	ci: k8s: Temporarily disable tests that require a bigger VM instance The list of tests which require a bigger VM instance is: * k8s-number-cpus.bats -- failing on all CIs * k8s-parallel.bats -- only failing on the cbl-mariner CI * k8s-scale-nginx.bats -- only failing on the cbl-mariner CI We'll keep those disabled while we re-work the logic to only run those in a bigger (and more expensive) VM instance. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 01:33:19 +02:00
GabyCT	6fe5cd3bd5	Merge pull request #7937 from GabyCT/topic/iperfbandwidth metrics: Add iperf value for cpu utilization	2023-09-14 16:47:19 -06:00
Fabiano Fidêncio	d0c257b3a7	ci: cache: Push cached artefacts to ghcr.io Let's push the artefacts to ghcr.io and stop relying on jenkins for that. Fixes: #7834 -- part 1 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	108f1b60dd	kata-deploy: Generate latest_{artefact,image_builder} files Right now this is not used, but it'll be used when we start caching the artefacts using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	be2eb7b378	ci: cache: Install ORAS in the kata-deploy binaries builder container ORAS is the tool which will help us to deal with our artefacts being pushed to and pulled from a container registry. As both the push to and the pull from will be done inside the kata-deploy binaries builder container, we need it installed there. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:39:57 +02:00
Fabiano Fidêncio	fb24fb0dc1	ci: k8s: devmapper: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:27:05 +02:00
Fabiano Fidêncio	1daf02f5d4	ci: nydus: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	e60d81f554	ci: nerdctl: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	4db416997c	ci: docker: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:41 +02:00
Fabiano Fidêncio	32841827b8	ci: cri-containerd: Use a smaller / cheaper VM instance We don't need to run on a D4s_v5, as those tests are not CPU / memory intensive. With this in mind, let's use a smaller version of the instance, the D2s_v5 one. Fixes: #7958 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-15 00:25:35 +02:00
Fabiano Fidêncio	92fff129fd	ci: k8s: Don't set cpu limit request for k8s-inotify test Without setting the cpu limit / request to 1, we can make this test run in a smaller VM instance without any issue. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	faf98c0623	ci: Reduce the size of the AKS VMs We do not need a very powerful machine for our tests, as we're not building anything there. The instance we switched to (Standard_D2s_v5) still has nested virt available, as shown here[0], but has half of the amount of vCPUs / Memory, which should be fine only for running the tests, costing us basically half of the price[1]. [0]: https://learn.microsoft.com/en-us/azure/virtual-machines/dv5-dsv5-series [1]: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/#pricing Fixes: #7955 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 22:03:16 +02:00
Fabiano Fidêncio	adc18ecdb1	ci: cache: For consistency, read all used env vars Instead of having some of them only being considered if explicitly passed to the script. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	c7a851efd7	ci: cache: Pass the exposed env vars to the kata-deploy binaries in docker As the environment variables are now being passed down from the GitHub Actions, let's make sure they're exposed to the container used to build the kata-deploy binaries, and during the build process we'll be able to use those to log in and push the artefacts to the OCI registry, using ORAS. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 20:24:48 +02:00
Fabiano Fidêncio	2e8b41f39c	Merge pull request #7954 from fidencio/topic/ci-cache-using-oras-part-0 ci: cache: Export env vars needed to use ORAS	2023-09-14 20:23:55 +02:00
Fabiano Fidêncio	6bd15a85d5	ci: cache: Export env vars needed to use ORAS We do the build of our artefacts inside a container image, and we need to expose some env vars to the container so ORAS can be used there to push the artefacts we want to cache to ghcr.io. The env vars we're exposing are: * ARTEFACT_REGISTRY: The registry where we're going to save the artefacts. * ARTEFACT_REGISTRY_USERNAME: The username to log in to the registry, as ORAS does not use the same json file used by docker. * ARTEFACT_REGISTRY_PASSWORD: The password to log in to the registry, as ORAS does not use the same json file used by docker. * TARGET_BRANCH: The target branch, which will be part of the tag of the artefact, as we may end up caching the artefacts for both main and stable branches. Fixes: #7834 -- part 0 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-14 19:36:33 +02:00
Gabriela Cervantes	cd4fd1292a	metrics: Add iperf cpu utilization limit for qemu This PR adds the iperf cpu utilization limit for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 17:17:47 +00:00
Gabriela Cervantes	df5cd10ea0	metrics: Add iperf value for cpu utilization This PR adds the iperf value for cpu utilization for kata metrics. Fixes #7936 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-14 16:06:49 +00:00
Jeremi Piotrowski	b54dd8cdf4	Merge pull request #7704 from jepio/vfio-part-1 gha: vfio: Import test script	2023-09-14 16:45:31 +02:00
Jeremi Piotrowski	a96050a7ad	tests: Apply timeout to 'ctr t kill' This task has been observed to hang at times. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9d93036783	tests/vfio: Bump VM image to Fedora 38 We need a very recent L2 guest kernel to fix all the bugs that occur in nested virtualization. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	faee59b520	tests/vfio: Accept single device in vfio group for CLH cloud hypervisor does not emulate pcie switches or pci bridges, so we need to accept a lonely device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df3dc1105c	tests/vfio: Get rid of sync's It is fine to start a VM with the disk image without syncing it as we now run the test in an ephemeral Azure instance. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7211c3dccc	gha: vfio: Set test timeout to 15m Sometimes the test gets stuck running commands in the container - need to investigate why later. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	1b02f89e4f	packaging: kernel: Enable VIRTIO_IOMMU on x86_64 Cloud Hypervisor exposes a VIRTIO_IOMMU device to the VM when IOMMU support is enabled. We need to add it to the whitelist because dragonball uses kernel v5.10 which restricted VIRTIO_IOMMU to ARM64 only. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	3a1db7a86b	runtime: clh: Support enabling iommu by enabling IOMMU on the default PCI segment. For hotplug to work we need a virtualized iommu and clh exposes one if there is some device or PCI segment that requests it. I would have preferred to add a separate PCI segment for hotplugging vfio devices but unfortunately kata assumes there is only one segment all over the place. See create_pci_root_bus_path(), split_vfio_pci_option() and grep for '0000'. Enabling the IOMMU on the default PCI segment requires enabling IOMMU on every device that is attached to it, which is why it is sprinkled all over the place. CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for that device. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	9f1a42c6cc	tests/vfio: Give commands 30s to execute This is to catch the case of the guest getting stuck. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	b46b0ecf8b	tests/vfio: Configure a value for 'hot_plug_vfio' for both vmms This shouldn't be hiding behind only a qemu check, we need this for clh as well. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	bfc93927fb	runtime: Remove redundant check in checkPCIeConfig There is no way for this branch to be hit, as port is only set when it is different than config.NoPort. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	7c4e73b609	runtime: Add test cases for checkPCIeConfig These test cases show which options are valid for CLH/Qemu, and test that we correctly catch unsupported combinations. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	fc51e4b9eb	runtime: Check config for supported CLH (cold|hot)_plug_vfio values The only supported options are hot_plug_vfio=root-port or no-port. cold_plug_vfio is not supported yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	509771e6f5	runtime: clh: Add hot_plug_vfio entry to config hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have not implemented support for cold_plug_vfio in CLH yet. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5f6475a28a	tests/vfio: Gather debug info and disable tdp_mmu tdp_mmu had some issues up until around Linux v6.3 that make it work particularly badly when running nested on Hyper-V. Reload the module at the start of the test and disable the tdp_mmu param. Gather debug info at the end of the test to make it easier to figure out what went wrong. This uses github actions group syntax so that each section can be collapsed. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	8fffdc81c5	tests/vfio: Capture journal from vm For debugging (though this doesn't get exposed yet). Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	df815087e7	tests/vfio: Change to get the test working in GHA - reduce memory and cpu usage to fit in a D4s_v5 - source correct lib - mount workspace from 9p - disable cpu mitigations for speed - drop unused commands and variables - install containerd - install kata from built artifacts Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	a92ddeea15	tests/vfio: Move dependency installation to gha-run.sh To match the flow of other github actions workflows. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Jeremi Piotrowski	5a551a85b1	gha: vfio: Import jobs scripts from tests repo This imports the vfio test scripts github.com/kata-containers/tests. The test case doesn't work yet but doing the changes in a separate commit will make it easier to track the changes. The only change in this commit is renaming vfio_jenkins_job_build.sh -> vfio_fedora_vm_wrapper.sh Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-14 14:23:28 +02:00
Fabiano Fidêncio	a1e3fa7ac4	Merge pull request #7905 from microsoft/danmihai1/mariner-annotations tests: fix kernel and initrd annotations	2023-09-14 10:37:42 +02:00
GabyCT	1d331124ad	Merge pull request #7925 from GabyCT/topic/bandwidthlimit metrics: Add iperf bandwidth value for kata metrics	2023-09-13 17:43:55 -06:00
Gabriela Cervantes	49e2fa189c	metrics: Increase jitter value for qemu This PR increases the jitter value for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 22:36:09 +00:00
Gabriela Cervantes	49234433a7	metrics: Increase value limit for jitter in clh This PR increases the value limit for jitter in clh. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-13 21:27:08 +00:00
David Esparza	0a24d3f718	Merge pull request #7923 from GabyCT/topic/addcassandradoc metrics: Add Cassandra Metrics documentation	2023-09-13 10:17:00 -06:00
GabyCT	c565053bac	Merge pull request #7895 from GabyCT/topic/removewarning metrics: Remove warning from metrics documentation	2023-09-13 10:16:38 -06:00
Fabiano Fidêncio	8b9df1d32e	Merge pull request #7929 from fidencio/topic/use-tcp-port-ping-on-docker-nerdctl-tests ci: docker: nerdctl: Switch to tcp port 80 ping	2023-09-13 15:46:31 +02:00
Peng Tao	55ca7e8aec	Merge pull request #7907 from Xuanqing-Shi/7876/network-devices-naming-conflict runtime: Naming conflict of network devices	2023-09-13 19:29:41 +08:00
Fabiano Fidêncio	813bfdec01	ci: docker: nerdctl: Use io.containerd.kata-${KATA_HYPERVISOR}.io This will ensure that we're calling the correct binary for the hypervisor. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:10:14 +02:00
Fabiano Fidêncio	46bc0b1c01	ci: nerdctl: Create the containerd config Otherwise we'll fail to configure kata-containers in the `install-kata` step. This is mostly needed because the nerdctl-full tarball doesn't provide a containerd configuration, just the binary, as containerd does not actually require a configuration file to run with the default config. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	13968aa7f6	ci: nerdctl: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provided nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
Fabiano Fidêncio	e0c811678b	ci: docker: Switch to tcp port 80 ping TIL that the Azure VMs we use are created without an explicit outbound connectivity defined. This leads us to issues using `ping ...` as part of our tests, and when consulting Jeremi Piotrowski about the issue he pointed me to two interesting links: * https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access * https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity For your own sanity, do not read the comments, after all this is internet. :-) Anyways, the suggestion is to use nping instead, which is provided by the nmap package, so we can explicitly switch to using the tcp port 80 for the ping. With this in mind, I'm switching the image we use for the test and using one that provided nping as a possible entry point, and from now on (this part of) the tests should work. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-13 13:00:57 +02:00
shixuanqing	1636abbe1c	runtime: issue with non-empty []Endpoint in RemoveEndpoints In RemoveEndpoints(), when the endpoints parameter isn't empty, using idx may result in wrong endpoint removals. To improve this, directly passing the endpoint parameter helps locate the correct elements within n.eps. Fixes: #7732 Signed-off-by: shixuanqing <1356292400@qq.com> Update src/runtime/virtcontainers/network_linux.go Co-authored-by: Xuewei Niu <justxuewei@apache.org>	2023-09-13 09:47:18 +00:00
Peng Tao	9766f9090c	Merge pull request #7719 from beraldoleal/nullable Remove gogoproto.nullable extension	2023-09-13 15:11:56 +08:00
David Esparza	c2b2a00ad9	Merge pull request #7899 from GabyCT/topic/startdocker metrics: Ensure docker is running in init_env	2023-09-12 23:01:26 -06:00
Gabriela Cervantes	0aa073967d	metrics: Add iperf bandwidth value for qemu This PR adds the iperf bandwidth value for qemu for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 20:57:14 +00:00
Dan Mihai	c0ad914766	tests: fix kernel and initrd annotations Fix kernel and initrd annotations in the k8s tests on Mariner. These annotations must be applied to the spec.template for Deployment, Job and ReplicationController resources. Fixes: #7764 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-12 20:15:25 +00:00
Gabriela Cervantes	615c1cbf19	metrics: Add iperf bandwidth value for kata metrics This PR adds the iperf bandwidth value for kata metrics. Fixes #7924 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 19:30:24 +00:00
Gabriela Cervantes	d53eb73eec	metrics: Ensure docker is running in init_env This PR ensures that docker is running as part of the init_env function in kata metrics, to avoid failures such as "docker is not running" that make the kata metrics CI fail. Fixes #7898 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 19:13:09 +00:00
GabyCT	c0d502493e	Merge pull request #7921 from dborquez/metrics_disable_fio_test metrics: this PR skips the FIO test temporarily to fix issues	2023-09-12 12:08:48 -06:00
Gabriela Cervantes	ad08321b83	metrics: Add Cassandra Metrics documentation This PR adds the Cassandra Metrics documentation for kata metrics. Fixes #7922 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-12 16:30:35 +00:00
David Esparza	a58ea66592	metrics: this PR skips the FIO test temporarily to fix issues FIO test is showing ongoing issues when running in k8s. Working on running FIO on the ctr client which has been shown to be stable. Fixes: #7920 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-12 10:23:57 -06:00
Fabiano Fidêncio	2d8447fc6b	Merge pull request #7916 from fidencio/topic/add-functional-nerdctl-tests ci: Add a very basic nerdctl sanity test	2023-09-12 17:47:08 +02:00
James O. D. Hunt	7feb8de9dc	Merge pull request #7887 from jodh-intel/hypervisor-remove-debug-kernel-options runtime-rs: hypervisor: Remove debug kernel options	2023-09-12 16:31:48 +01:00
Fabiano Fidêncio	f536ef5ce1	ci: docker: Also run the smoke test with runc This will help us to make sure that the failure is actually related to Kata Containers. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:54:02 +02:00
Fabiano Fidêncio	c83f167c59	ci: docker: Run the tests after the kata-static is created There's no reason to wait till the payload is created to run the tests, as we rely on the tarball, not on the kata-deploy payload. That was a mistake on my side, and that's already fixed for the nerdctl tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:53:47 +02:00
Fabiano Fidêncio	12d833d07d	ci: Add a very basic nerdctl sanity test Let's add a very basic sanity test to check that we can spawn a container using nerdctl + Kata Containers. This will ensure that, at least, we don't regress to the point where this feature doesn't work at all. In the future, we should also test all the VMMs with devmapper, but that's for a follow-up PR after this test is working as expected. Fixes: #7911 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 16:52:55 +02:00
Greg Kurz	be71a0ab4e	Merge pull request #7811 from stevenhorsman/bump-rust-to-1.72 versions: Bump rust version	2023-09-12 15:30:35 +02:00
Fabiano Fidêncio	b020912629	Merge pull request #7913 from fidencio/topic/add-functional-docker-tests ci: Add a very basic docker sanity test	2023-09-12 15:28:49 +02:00
Fabiano Fidêncio	348b8644d6	ci: Add a very basic docker sanity test Let's add a very basic sanity test to check that we can spawn a container using docker + Kata Containers. This will ensure that, at least, we don't regress to the point where this feature doesn't work at all. For now we're running this test against Cloud Hypervisor and QEMU only, due to an already reported issue with dragonball: https://github.com/kata-containers/kata-containers/issues/7912 In the future, we should also test all the VMMs with devmapper, but that's for a follow-up PR after this test is working as expected. Fixes: #7910 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-12 15:15:26 +02:00
stevenhorsman	a75fd5eb81	runk: Fix rust unnecessary mut error - Fix `error: variable does not need to be mutable` in rust 1.72 Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	a31c145172	kata-ctl: useless-vec warning - Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	c8419fc3bb	kata-ctl: Resolve non-minimal-cfg warning - In rust 1.72, clippy warned clippy::non-minimal-cfg as the cfg has only one condition, so doesn't need to be wrapped in the any combinator. Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	3eaf68d954	agent-ctl: Allow clippy lint - Allow `clippy::redundant-closure-call` which has issues with the guard function passed into the `run_if_auto_values` macro Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	1d8b78959d	runtime-rs: Fix useless-vec warning Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	99f3d69e94	runtime-rs: Remove mut Fix `error: variable does not need to be mutable` Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	16fbc27b09	dragonball: Allow ambiguous-glob-reexports The bindgen generated code is triggering lots of ambiguous-glob-reexports warnings in rust 1.70+ Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	bbf1919516	dragonball: Resolve non-minimal-cfg warning - In rust 1.72, clippy warned clippy::non-minimal-cfg as the cfg has only one condition, so doesn't need to be wrapped in the all combinators. Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	75cfdd5d59	agent: config: Allow clippy lint - Allow `clippy::redundant-closure-call` in `from_cmdline` which has issues with the guard function passed into the `parse_cmdline_param` macro Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	f3a0fd5907	agent: config: Fix useless-vec warning Fix clippy::useless-vec warning Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	9e423bd3d6	libs: Fix clippy unnecessary hashes error - Fix error: unnecessary hashes around raw string literal Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
stevenhorsman	444395050a	versions: Bump rust version Bump rust to 1.72.0 to test what extra warnings/issues we get Fixes: #7902 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-09-12 11:31:49 +01:00
Yipeng Yin	a16b0962b5	chore(cargo): update cargo lock Update cargo lock for runtime-rs, agent and kata-ctl. Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-09-12 15:27:38 +08:00
Chao Wu	c800d0739f	Merge pull request #7889 from UiPath/fix-dragonball-build dragonball: fix for non-deterministic builds	2023-09-12 14:06:18 +08:00
shixuanqing	ca4b6b051d	runtime: Naming conflict of network devices When creating a new endpoint, we check existing endpoint names and automatically adjust the naming of the new endpoint to ensure uniqueness. Fixes: #7876 Signed-off-by: shixuanqing <1356292400@qq.com>	2023-09-12 04:29:51 +00:00
Guixiong Wei	202049f35e	feat(runtime-rs): introduce huge page type to select VM RAM's backend This commit allows us to specify the huge page backend when enabling huge page. Currently, we support two backends: thp and hugetlbfs, the default is hugetlbfs. To ensure backward compatibility, we introduce another configuration item "hugepage_type" to select the memory backend, which is available only when "enable_hugepages" is true. Besides, we add an annotation "io.katacontainers.config.hypervisor.hugepage_type" to configure huge page type per pod. Fixes: #6703 Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>	2023-09-12 11:28:27 +08:00
Zhongtao Hu	e1f54f96d0	Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs runtime-rs: bring hybrid vsock devices in manager.	2023-09-12 09:27:34 +08:00
GabyCT	af29eeb8b1	Merge pull request #7901 from fidencio/topic/ci-target-branch-fixes-follow-up-3 ci: use github.ref_name instead of $GITHUB_REF_NAME	2023-09-11 15:31:29 -06:00
Fabiano Fidêncio	f811b064ca	ci: use github.ref_name instead of $GITHUB_REF_NAME As, regardless of what's mentioned in the documentation, it seems that $GITHUB_REF_NAME is passed down as a literal string. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 22:14:55 +02:00
Fabiano Fidêncio	dc0b350e49	Merge pull request #7900 from fidencio/topic/ci-target-branch-fixes-follow-up-2 ci: Add more target-branch related fixes	2023-09-11 21:26:26 +02:00
Fabiano Fidêncio	6d795c089e	ci: Add more target-branch related fixes The ones for payload-after-push.yaml and ci-nightly.yaml are not that important right now, but they're needed for when we start running those on stable branches as well. The other ones were missed during `bd24afcf73`. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 20:42:57 +02:00
Fabiano Fidêncio	07d0ad0ad7	Merge pull request #7897 from fidencio/topic/ci-devmapper-do-the-rebase-as-well ci: Fix target-branch usage	2023-09-11 20:30:53 +02:00
Fabiano Fidêncio	d7f991d139	Merge pull request #7151 from Yuan-Zhuo/fix-systemd-cgroup agent: optimize the code of systemd cgroup manager	2023-09-11 20:15:51 +02:00
Fabiano Fidêncio	8509c31870	ci: Fix target-branch usage We missed those as part of `bd24afcf73`. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 20:10:27 +02:00
Gabriela Cervantes	060499dcae	metrics: Remove warning from metrics documentation Now that the metrics migration from the tests to kata containers has been completed, this PR removes the warning from the main metrics documentation. Fixes #7894 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-11 16:41:48 +00:00
GabyCT	b384757ac7	Merge pull request #7874 from fidencio/topic/manually-rebase-branches-atop-of-the-target-one gha: Manually rebase PR atop of the target branch before testing	2023-09-11 10:35:01 -06:00
Fabiano Fidêncio	46e73cf7a2	Merge pull request #7884 from fidencio/topic/update-kernel-to-the-latest-lts-plus-bring-in-erofs-patches Update kernel to the latest LTS release (v6.1.52) and bring in erofs patches needed for the CC work	2023-09-11 13:58:43 +02:00
James O. D. Hunt	c0f697fcc5	runtime: Allow kernel_params annotation To support the removal of the `initcall_debug` and `earlyprintk=` options from the default guest kernel cmdline, add `kernel_params` to the list of enabled annotations to allow those kernel options (or others) to be set using `kata-deploy` for either runtime. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 12:12:12 +01:00
Alexandru Matei	b03e49794e	dragonball: fix for non-deterministic builds Fixes: #7888 Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-09-11 14:07:10 +03:00
Fabiano Fidêncio	93bad13769	Merge pull request #7875 from fidencio/topic/kata-deploy-fix-arm64-image-build kata-deploy: Fix aarch64 image build	2023-09-11 11:36:52 +02:00
James O. D. Hunt	976d10150c	runtime-rs: hypervisor: Remove debug kernel options Removed the following kernel command line options: - `earlyprintk=ttyS0` - `initcall_debug` Both these options are only useful when debugging a guest kernel failure which is not a common occurrence. Further, the `earlyprintk=` option can have a large negative performance impact (it can increase the VM boot time significantly). If the user wishes to use either of these options, they can add them to the `kernel_params=` setting in the Kata configuration file's hypervisor stanza. Fixes: #7886. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 09:43:39 +01:00
Fabiano Fidêncio	fde34610cd	kernel: Add erofs patches needed for CC related work All the patches have already been merged upstream and they've just been cherry-picked to this branch. Fixes: #7885 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 10:39:37 +02:00
Fabiano Fidêncio	dc6a4588a2	versions: Bump kernel to the latest LTS release (6.1.52) We're bumping here in order to make our lives easier backporting EROFS patches needed for the CC related work. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-11 10:32:16 +02:00
James O. D. Hunt	52f6449b70	kata-manager: Remove initcall_debug kernel option Removed the addition of the `initcall_debug` kernel option when agent debugging enabled. This option has nothing to do with the agent. If the user wishes to use this option, they can add it to the `kernel_params=` setting in the Kata configuration file's hypervisor stanza. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-09-11 09:31:44 +01:00
Fabiano Fidêncio	6cd5d83a37	Merge pull request #7865 from gkurz/fix-more-virtiofs-args runtime: Fix more virtiofs args	2023-09-09 21:30:16 +02:00
Fabiano Fidêncio	8b4a0b368f	kata-deploy: Remove curl after it's used There's no need to keep curl there after the kubectl binary has already been downloaded. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-09 10:52:05 +02:00
Fabiano Fidêncio	139c7f03ab	kata-deploy: Fix aarch64 image build Similarly to what's been done for x86_64 -> amd64, we need to do an aarch64 -> arm64 change in order to be able to download the kubectl binary. Fixes: #7861 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-09 10:51:52 +02:00
Fabiano Fidêncio	94f5a69346	Merge pull request #7862 from fidencio/topic/kata-deploy-use-alpine-as-base-image kata-deploy: Switch to an alpine image	2023-09-09 09:02:13 +02:00
Yuan-Zhuo	470d065415	agent: optimize the code of systemd cgroup manager 1. Directly support CgroupManager::freeze through systemd API. 2. Avoid always passing unit_name by storing it into DBusClient. 3. Realize CgroupManager::destroy more accurately by killing systemd unit rather than stop it. 4. Ignore no such unit error when destroying systemd unit. 5. Update zbus version and corresponding interface file. Acknowledgement: error handling for no such systemd unit error refers to Fixes: #7080, #7142, #7143, #7166 Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com> Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>	2023-09-09 13:56:43 +08:00
GabyCT	fa818bfad1	Merge pull request #7867 from GabyCT/topic/optimizedimage metrics: Use TensorFlow optimized image	2023-09-08 11:34:21 -06:00
Fabiano Fidêncio	bd24afcf73	gha: Manually rebase PR atop of the target branch before testing We're changing what's been done as part of `ac939c458c`, as we've noticed issues using `github.event.pull_request.merge_commit_sha`. Basically, whenever a force-push would happen, the reference of merge_commit_sha wouldn't be updated, leading us to test PRs with the old code. :-/ In order to get the rebase properly working, we need to ensure we pull the hash of the commit as part of checkout action, and ensure fetch-depth is set to 0. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 18:56:31 +02:00
GabyCT	dc7414f5c1	Merge pull request #7870 from dborquez/metrics_fio_fix_clean_env_order metrics: fix FIO test initialization	2023-09-08 10:28:10 -06:00
Greg Kurz	72c510d057	runtime/virtiofsd: Drop all references to "--cache=none" This syntax belongs to the legacy C virtiofsd implementation that we don't support anymore since kata-containers 3.1.3 because of other API breaking changes. People have been warned to switch from "none" to "never" since kata-containers 2.5.2. Let's officially do that. The compat code that would convert "none" to "never" isn't needed anymore. Just drop it. Fixes #7864 Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-08 17:57:30 +02:00
Beraldo Leal	ead724bec1	protocol: removing gogo.nullable feature gogo.nullable is the main gogo.protobuf feature used here. Since we are trying to remove gogo.protobuf, the first reasonable step seems to be removing this feature. This is a core update, and it will change how the structs are defined. I could spot only a few places using those structs, based on make check/build. Fixes #7723. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	d8e4bb9859	protocol: remove unused PROTO_FILE env There is no reference to PROTO_FILE and this is not working. Also we are not inside a Makefile, so it makes sense to adapt the usage to reflect the script instead of a make command. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	5e1106a770	protocol: remove unused import_path import_path is used as the default package when no input files specify go_package. However, all the files we are currently building already have a go_package definition, making this behavior both redundant and error-prone. Additionally, one of our files (types.pb.go) resides outside the grpc directory, indicating that it's indeed ignored but also inconsistent. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	87accaaecb	protocol: use workdir during build Currently, the script searches for .proto files within $GOPATH/. Consequently, modifications to a definition file in the current working directory won't influence the output .pb.go if the directory is outside of $GOPATH. For developers, it's more intuitive to alter the local codebase than the version stored in $GOPATH. With this modification, the generated .pb.go files will be relative to the current working directory, removing the need to clone this project under $GOPATH/src/github.com/kata-containers. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	711a7ed965	protocol: remove mapping definitions The definitions are already specified in the .proto files using the go_package option. Centralizing them in one location reduces the potential for errors and simplifies the script. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	8db84c1bd2	protocol: force GOPATH to be set Currently, if GOPATH is not set, errors will be raised since protoc is using GOPATH to find packages. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Beraldo Leal	68156d77ac	protocol: breaking lines to improve readability Just a small change to improve the readability of modules before the actual changes. Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-09-08 11:49:01 -04:00
Fabiano Fidêncio	670a8e9c73	kata-deploy: Switch to an alpine image This will make our image smaller, and still ensure its multi-arch support. Fixes: #7861 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 17:39:51 +02:00
Fabiano Fidêncio	0b26a5d053	Merge pull request #7871 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-3 ci: k8s: Add clean-up-garm argument for gha-run.sh	2023-09-08 17:27:57 +02:00
Fabiano Fidêncio	9d74b7ccc9	k8s: ci: Skip "Pod quota" test with firecracker The test is failing, and an issue has been opened to track it. For now, let's skip it. Issue: https://github.com/kata-containers/kata-containers/issues/7873 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 15:51:46 +02:00
Fabiano Fidêncio	f6cd3930c5	ci: k8s: Remove useless skip statement from tests There's absolutely no need to have the skip check as part of the test itself when it's already done as part of the setup function. We're only touching the files here that were touched in the previous commit. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:25:29 +02:00
Fabiano Fidêncio	3cc20b47a6	ci: k8s: Also check for "fc" (for firecracker) Let's keep both checks for now, but in the future we'll be able to remove the check for "firecracker", as the hypervisor name used as part of the GitHub Actions has to match what's used as part of the kata-deploy stuff, which is `fc` (as in `kata-fc` for the runtime class) instead of `firecracker`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:25:24 +02:00
Fabiano Fidêncio	b5bad3cb0f	ci: k8s: Add clean-up-garm argument for gha-run.sh The tests are failing to finish as the argument is invalid. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 14:04:50 +02:00
Fabiano Fidêncio	05e2e7636e	Merge pull request #7868 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-2 ci: k8s: Second round of fix-ups with the devmapper CI	2023-09-08 11:02:20 +02:00
Fabiano Fidêncio	aaec5a09f3	ci: k8s: devmapper tests should be using ubuntu 20.04 That's what we've been using as part of Jenkins, so let's ensure things will work as they did before, and only after that consider upgrading the base OS used for the tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	27fa7d828d	ci: k8s: Add a kata-deploy-garm target We've been using the `kata-deploy-tdx` target as that also uses k3s as base, but it's better to just have a specific garm target. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	fa62a4c01b	ci: k8s: Export KUBERNETES env var So we have better control over which flavour of kubernetes kata-deploy is expected to be targeting. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	8c9380a798	ci: k8s: Install bats on GARM runners GARM runners do not come with the whole set of tools we need, or are used to when it comes to the GHA runners, so we need to manually install bats on those. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-08 10:09:04 +02:00
Fabiano Fidêncio	3de23034f8	ci: k8s: Wait some time after restarting k3s Let's put a 1 minute sleep, just to make sure everything is back up again. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:46:58 +02:00
David Esparza	adfea55b8f	metrics: fix FIO test initialization This PR changes the order in which the FIO test first cleans the environment and then checks if the environment is indeed clean. Fixes: #7869 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-07 15:41:59 -06:00
Fabiano Fidêncio	2df183fd99	ci: k8s: Append, instead of overwrite, the devmapper config As we were using `tee` without the `-a` (or `--append`) option, the containerd config would be overwritten, leading to a NotReady state of the Node. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	369a8af8f7	ci: k8s: Decrease k3s sleep from 4 to 2 minutes It should be plenty, and worked well in local tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	ada65b988a	ci: k8s: Use vanilla kubectl with k3s Let's download the vanilla kubectl binary into `/usr/bin/`, as we need to avoid hitting issues like: ```sh error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied ``` The issue basically happens because k3s links `/usr/local/bin/kubectl` to `/usr/local/bin/k3s`, and that does extra stuff that vanilla `kubectl` doesn't do. Also, in order to properly use the k3s.yaml config with the vanilla kubectl, we're copying it to ~/.kube/config. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	ad45ab5d33	ci: k8s: Ensure k3s is deployed with --write-kubeconfig-mode=644 Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by users other than root. As --write-kubeconfig-mode is being passed, and that's an option that has to be passed to the `server`, -s is also added to the command line. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
Fabiano Fidêncio	028a97e0d5	ci: k8s: Use the proper command for sleep `wait` waits for a job to complete, not a number of seconds. Not sure how I got that wrong in the first place, but it is what it is. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 23:12:55 +02:00
David Esparza	34f580901f	Merge pull request #7824 from dborquez/fix_memory_usage_initialization metrics: re-enable memory-usage initialization step	2023-09-07 14:24:27 -06:00
Gabriela Cervantes	3a427795ea	metrics: Use TensorFlow optimized image This PR replaces the ubuntu image with one which has TensorFlow optimized for kata metrics. Fixes #7866 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-07 15:38:51 +00:00
Chao Wu	cd8c217ee1	Merge pull request #6879 from openanolis/chao/update_upstream_upcall_feature Dragonball: optimize the placement of dbs-upcall features	2023-09-07 18:07:53 +08:00
Fabiano Fidêncio	dfa1cce916	Merge pull request #7860 from fidencio/topic/ci-add-k8s-devmapper-tests-follow-up-1 ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml	2023-09-07 11:48:30 +02:00
Fabiano Fidêncio	8d99972a8a	ci: k8s: Fix typo in run-k8s-tests-on-garm.yaml integrations -> integration integrtion -> integration Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-07 11:31:30 +02:00
Fabiano Fidêncio	0483d3d16d	Merge pull request #7841 from fidencio/topic/ci-add-k8s-devmapper-tests ci: k8s: Add k8s devmapper tests (part 0)	2023-09-07 10:53:09 +02:00
Jeremi Piotrowski	f6cc01d77c	Merge pull request #7833 from jepio/kata-static-fix-ownership kata-deploy: Create kata-static.tar with correct ownership	2023-09-07 10:16:23 +02:00
Peng Tao	435e890cd9	Merge pull request #7703 from bergwolf/github/nerdctl-fc runtime: run prestart hooks before starting VM for FC	2023-09-07 10:55:31 +08:00
Chao Wu	deed1b927d	Dragonball: optimize the placement of dbs-upcall features Currently, the dbs-upcall features have two problems that need to be fixed: there are redundant dbs-upcall features that need to be removed, and some places that should be controlled by dbs-upcall are not. This commit fixes both problems. fixes: #6878 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-09-07 10:27:29 +08:00
Fabiano Fidêncio	0e8bd50cbb	ci: k8s: Add k8s devmapper tests (part 0) Let's enable the devmapper kubernetes tests to match exactly what's been tested as part of the Jenkins CI. Fixes: #6542 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 23:08:38 +02:00
Fabiano Fidêncio	b28b54df04	ci: k8s: Add a function to configure devmapper for containerd This function right now is completely based on what's part of the tests repo[0], and that's the reason I'm keeping the `Signed-off-by` of all the contributors to that file. This is not perfect, though, as it changes the default snapshotter to devmapper, instead of only doing so for the Kata Containers specific runtime handlers. OTOH, this is exactly what we've always been doing as part of the tests. We'll improve it, soon enough, when we get to also add a way for kata-deploy to set up different snapshotters for different handlers. But, for now, this is as good (or as bad) as it's always been. It's important to note that the devmapper setup doesn't take into consideration a BM machine, and this is not suitable for that. We're really only targeting GHA runners which will be thrown away after the run is over. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com> Signed-off-by: Shiming Zhang <wzshiming@foxmail.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-06 23:08:17 +02:00
Fabiano Fidêncio	54f7117212	ci: k8s: Add a function to deploy k3s One can use different kubernetes flavours for getting a kubernetes cluster up and running. As part of our CI, though, I really would like to avoid contributors spending time maintaining and updating kubernetes dependencies, as done with the tests repo, and which has been proven to be really good at getting things rotten. With this in mind, I'm taking the bullet and using "k3s" as the way to deploy kubernetes for the devmapper related tests, and that's the reason I'm adding a function to do so, and this will be used later on as part of this series. It's important to note that the k3s setup doesn't take into consideration a BM machine, and this is not suitable for that. We're really only targeting GHA runners which will be thrown away after the run is over. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 23:07:41 +02:00
David Esparza	cf258090aa	Merge pull request #7843 from GabyCT/topic/ffiolimit metrics: Add write 95 percentile FIO value	2023-09-06 14:52:00 -06:00
Fabiano Fidêncio	c5e1e7ddc3	Merge pull request #7854 from fidencio/topic/runtime-allow-virtio_fs_extra_args-annotation runtime: Allow virtio_fs_extra_args annotation	2023-09-06 19:20:40 +02:00
Greg Kurz	81536f21af	runtime/qemu: Pass "--xattr" to virtiofsd instead of "-o xattr" The "-o" syntax belongs to the legacy C virtiofsd. It is deprecated with the rust implementation. Signed-off-by: Greg Kurz <groug@kaod.org>	2023-09-06 17:50:35 +02:00
Fabiano Fidêncio	b1dd09a4d3	runtime: Allow virtio_fs_extra_args annotation Some use cases may just require passing extra arguments to virtiofsd, and having this disabled by default makes it impossible to set when using kata-deploy, as changes in the configuration file would be overwritten by the daemon-set. With this in mind, let's allow users to pass whatever they need (and here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg. Fixes: #7853 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-06 17:11:16 +02:00
Hyounggyu Choi	d27fe18167	Merge pull request #7849 from BbolroC/hot-fix-dockerbuild packaging: do not install docker-compose-plugin for s390x\|ppc64le	2023-09-06 13:13:25 +02:00
Hyounggyu Choi	2efda20c77	packaging: do not install docker-compose-plugin for s390x\|ppc64le This PR skips installing docker-compose-plugin while building a `build-kata-deploy` image for s390x\|ppc64le. It is a temporary solution to fix current CI failures for s390x regarding `hash sum mismatch`. Fixes: #7848 Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>	2023-09-06 11:12:03 +02:00
Zhongtao Hu	aa85e0b3ec	Merge pull request #7714 from justxuewei/volumes-cleanup runtime-rs: Fix volumes and rootfs cleanup issues	2023-09-06 10:13:55 +08:00
Gabriela Cervantes	438fbf9669	metrics: Add write 95 percentile for FIO for qemu This PR adds the write 95 percentile for FIO for qemu for checkmetrics for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 22:50:31 +00:00
Gabriela Cervantes	024b4d2ffe	metrics: Add write 95 percentile FIO value This PR adds the write 95 percentile FIO value for checkmetrics for kata metrics. Fixes #7842 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 21:00:05 +00:00
GabyCT	3e3a91fd2c	Merge pull request #7577 from GabyCT/topic/enableiperfm metrics: Enable iperf benchmark on gha for kata metrics	2023-09-05 14:53:47 -06:00
Gabriela Cervantes	e98e5cdea2	metrics: Add checkmetrics to gha run script This PR adds the checkmetrics to gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 17:05:03 +00:00
Gabriela Cervantes	c1edfe5511	metrics: Add checkmetrics value for qemu for iperf This PR adds the checkmetrics value for qemu for iperf benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	6a79ecedf9	metrics: Add jitter value for clh This PR adds jitter value for clh for iperf metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	f609a9a754	metrics: Add test selector to iperf metrics This PR adds test selector to iperf metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Gabriela Cervantes	5b8db30422	metrics: Enable iperf benchmark on gha for kata metrics This PR enables the iperf benchmark to run on the gha for kata metrics. Fixes #7575 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-09-05 16:04:52 +00:00
Jeremi Piotrowski	cf46b056fd	Merge pull request #7839 from openanolis/chao/switch_to_azure CI: switch static-checks-dragonball CI machines to Azure	2023-09-05 10:59:02 +02:00
Chao Wu	60f733d301	CI: switch static-checks-dragonball CI machines to Azure Previously, static-checks-dragonball is using machines from Alibaba Cloud to run all the CI jobs. Currently, we are going through an internal process to apply for the new machines for Dragonball CI. Before the internal process is over, we will temporarily use Azure VM to run static-checks-dragonball jobs. fixes: #7838 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>	2023-09-05 15:19:07 +08:00
alex.lyn	7870b33a2d	runtime-rs: bring hybridVsock devices in manager. Currently, virtio_vsock devices are still outside of the device manager. This causes some management issues, such as the inability to unify PCI address management. Just do some work for hybrid vsock. Fixes: #7655 Signed-off-by: alex.lyn <alex.lyn@antgroup.com>	2023-09-05 08:46:56 +08:00
Jeremi Piotrowski	18c94ebbe3	kata-deploy: Create kata-static.tar with correct ownership Pass --owner and --group to the tar invocation to prevent the GitHub runner user from leaking into release artifacts. Fixes: #7832 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-09-04 17:24:00 +02:00
Fabiano Fidêncio	b663ec21ac	Merge pull request #7803 from GabyCT/topic/readmereportdoc metrics: Add README for kata metrics report	2023-09-03 21:57:13 +02:00
Fabiano Fidêncio	e490b0bc76	Merge pull request #7808 from ManaSugi/fix/remove-manual-chcon osbuilder: Remove chcon operation for guest SELinux	2023-09-03 21:55:02 +02:00
Fabiano Fidêncio	27dab249a0	Merge pull request #7800 from jodh-intel/kata-sys-util-update-tdx-protection-checks kata-sys-util: protection: Update TDX checks	2023-09-02 14:47:51 +02:00
Jiang Liu	d5729e818c	Merge pull request #7819 from jiangliu/storage-cleanup Improve the way to clean up storage devices for sandbox	2023-09-02 17:02:51 +08:00
Jiang Liu	57e7bf14a6	agent: refine StorageDeviceGeneric::cleanup() Refine StorageDeviceGeneric::cleanup() to improve safety. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:22:21 +08:00
Jiang Liu	53edb19374	agent: implement StorageDeviceGeneric::cleanup() Refactor cleanup_sandbox_storage as StorageDeviceGeneric::cleanup(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 14:00:26 +08:00
Jiang Liu	0c63453e28	types: make StorageDevice::cleanup() return possible error code Make StorageDevice::cleanup() return possible error code. Fixes: #7818 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:27:06 +08:00
Jiang Liu	3a3d77b3b5	agent: move StorageDeviceGeneric from kata-types into agent Move StorageDeviceGeneric from kata-types into agent, so we can refactor code later. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-02 13:12:17 +08:00
Jiang Liu	d848126b61	Merge pull request #7821 from jiangliu/storage-leak agent: avoid possible leakage of storage device	2023-09-02 12:40:40 +08:00
Fabiano Fidêncio	4f92e6df90	Merge pull request #7683 from microsoft/danmihai1/policy-tests tests: add policy to existing tests	2023-09-01 23:52:15 +02:00
David Esparza	b151cfd140	metrics: re-enable memory-usage initialization step This PR re-enables the initialization step disabled on `538c965c2b`. Fixes: #7804 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-09-01 14:29:34 -06:00
Fabiano Fidêncio	f3e1a6a94f	osbuilder: alpine: Change mirror As we're hitting a lot of: ``` ERROR: https://dl-5.alpinelinux.org/alpine/v3.18/main: operation timed out ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 16:01:42 +00:00
Fabiano Fidêncio	ac612aef5e	osbuilder: alpine: Match the version on versions.yaml We've switched to 3.18 as part of `82cd14ba39`. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 16:01:33 +00:00
Jiang Liu	9cd706d1c9	agent: avoid possible leakage of storage device When a storage device is used by more than one container, the second and subsequent instances will leak the storage device's reference count, and thus the storage device itself. The reason is: add_storages() increases the reference count of an existing storage device but forgets to add the device to the `mount_list` array, thus leaking the reference count. Fixes: #7820 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-09-01 22:52:42 +08:00
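The leak described in `9cd706d1c9` can be illustrated with a minimal Go sketch. The agent itself is written in Rust; the `sandbox` type and the `addStorages` and `removeStorages` functions below are hypothetical stand-ins, not the agent's API. The point is that every reference taken, including one on an already-existing device, has to be recorded in the returned mount list so that cleanup can drop it again.

```go
package main

import "fmt"

// sandbox is a hypothetical stand-in that tracks a reference count per storage device.
type sandbox struct {
	refs map[string]int
}

// addStorages takes one reference per mount point and returns the list of
// mount points that the caller must later release.
func (s *sandbox) addStorages(mountPoints []string) []string {
	var mountList []string
	for _, mp := range mountPoints {
		s.refs[mp]++ // a new device starts at 1, an existing one is incremented
		// Record the mount point in both cases. Skipping this for an
		// existing device is the kind of omission that leaks a reference.
		mountList = append(mountList, mp)
	}
	return mountList
}

// removeStorages drops one reference per entry and forgets devices that reach zero.
func (s *sandbox) removeStorages(mountList []string) {
	for _, mp := range mountList {
		s.refs[mp]--
		if s.refs[mp] == 0 {
			delete(s.refs, mp)
		}
	}
}

func main() {
	s := &sandbox{refs: map[string]int{}}
	first := s.addStorages([]string{"/mnt/shared"})
	second := s.addStorages([]string{"/mnt/shared"}) // same device, second container
	s.removeStorages(first)
	s.removeStorages(second)
	fmt.Println("devices still referenced:", len(s.refs))
}
```

If the `append` were skipped for devices that already exist, the second container's cleanup would have nothing to release and the count would never return to zero.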
Dan Mihai	bf21411e90	tests: add policy to k8s tests Use AGENT_POLICY=yes when building the Guest images, and add a permissive test policy to the k8s tests for: - CBL-Mariner - SEV - SNP - TDX Also, add an example of policy rejecting ExecProcessRequest. Fixes: #7667 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-01 14:28:08 +00:00
Dan Mihai	d0e0610679	runtime: config: use the SEV initrd for SNP Thanks Unmesh Deodhar! Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-09-01 14:28:08 +00:00
Fabiano Fidêncio	67fed26f18	runtime: Use TDX image with in the qemu-tdx config Let's make sure we use the TDX image as part of the QEMU TDX configuration, which will help us to have the policies tested here. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 14:28:08 +00:00
Fabiano Fidêncio	f65ffb23da	Merge pull request #7814 from fidencio/topic/gha-rebase-prs-atop-of-main-for-the-tests gha: Rebase PR atop of the target branch before testing	2023-09-01 16:26:32 +02:00
Fabiano Fidêncio	ef70aeb6b8	Merge pull request #7817 from fidencio/topic/update-alpine-to-its-latest-release versions: Update alpine to its 3.18 version	2023-09-01 14:51:58 +02:00
Fabiano Fidêncio	ac939c458c	gha: Rebase atop of the target branch We have two scenarios we care about here: `pull_request` and `pull_request_target` events triggering a job. `pull_request` event: When using the checkout action, it'll already provide a "rebased atop of main" repo for us, nothing else is needed, and that's basically what we already have as part of the jobs in our CI. `pull_request_target` event: This one is a little bit tricky, as the checkout action, unless passing a specific repo, gives us the PR checked out rebased atop of the HEAD of the PR branch. Jeremi Piotrowski nicely pointed out that we could use github.event.pull_request.merge_commit_sha instead, which is the result of merging the PR's branch with the official repo's target branch. Now, the only case where the contributor's rebase would still be needed is when the action itself has been changed. Fixes: #7414 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-09-01 11:23:31 +02:00
Jeremi Piotrowski	bde06758b1	Merge pull request #7761 from jepio/iocopy-fix-race runtime: Fix data race in ioCopy	2023-09-01 09:30:54 +02:00
Fabiano Fidêncio	82cd14ba39	versions: Update alpine to its 3.18 version 3.15 will reach end of life 2 months from now. Fixes: #7816 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-31 23:02:54 +02:00
GabyCT	d75c7b5f9c	Merge pull request #7813 from GabyCT/topic/genreport metrics: Add grabdata script for metrics report	2023-08-31 13:33:38 -06:00
Gabriela Cervantes	6668825752	metrics: Add grabdata script for metrics report This PR adds the grabdata script so it can be used for the metrics report for kata metrics. Fixes #7812 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-31 16:17:29 +00:00
James O. D. Hunt	c290eaed8c	kata-sys-util: protection: Update TDX checks Update the protection checking code to detect newer versions of Intel TDX (whose userland interface has now stabilised). > Note: that we don't need to retain the existing behaviour since: > > - We haven't yet landed the TDX feature (#6448). > - Systems wishing to use TDX will need to use the latest available > system components (such as firmware and host kernel). Also added an explicit TDX unit test. Fixes: #7384. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2023-08-31 16:15:15 +01:00
Fabiano Fidêncio	d7a996c686	gha: Update to checkout@v3 action At this point we should always be using the latest checkout action. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-31 16:02:31 +02:00
Jeremi Piotrowski	d7612440b8	Merge pull request #7789 from beraldoleal/tests/amd Fixes tests on AMD machines	2023-08-31 11:23:51 +02:00
Jeremi Piotrowski	c2ba29c15b	runtime: Fix data race in ioCopy IoCopy is a tricky function (I don't claim to fully understand its contract), but here is what I see: The goroutine that runs it spawns 3 goroutines - one for each stream to handle (stdin/stdout/stderr). The goroutine then waits for the stream goroutines to exit. The idea is that when the process exits and is closed, the stdout goroutine will be unblocked and close stdin - this should unblock the stdin goroutine. The stderr goroutine will exit at the same time as the stdout goroutine. The iocopy routine then closes all tty.io streams. The problem is that the stdout goroutine decrements the WaitGroup before closing the stdin stream, which causes the iocopy goroutine to race to close the streams. Move the wg.Done() of the stdout routine past the close so that this race becomes impossible. I can't guarantee that this doesn't affect some unspecified behavior. Fixes: #5031 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-31 10:17:38 +02:00
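The ordering fix in `c2ba29c15b` can be sketched in isolation. This is not the runtime's `ioCopy`; it is a simplified model with hypothetical names (`eventLog`, `copyStreams`) that uses two goroutines instead of three and a channel in place of the real stdin stream. It shows the stdout goroutine calling `wg.Done()` only after it has closed stdin, so the parent cannot start closing streams before that close has happened.

```go
package main

import (
	"fmt"
	"sync"
)

// eventLog records events in the order they happen; safe for concurrent use.
type eventLog struct {
	mu     sync.Mutex
	events []string
}

func (l *eventLog) add(e string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.events = append(l.events, e)
}

// copyStreams models the parent goroutine waiting on its stream goroutines.
func copyStreams() []string {
	log := &eventLog{}
	var wg sync.WaitGroup
	stdinClosed := make(chan struct{}) // stands in for the stdin stream

	wg.Add(2)
	// stdin goroutine: blocked until stdin is closed.
	go func() {
		defer wg.Done()
		<-stdinClosed
		log.add("stdin goroutine exits")
	}()
	// stdout goroutine: the process has exited, so unblock stdin first...
	go func() {
		log.add("stdout goroutine closes stdin")
		close(stdinClosed)
		// ...and only then report completion. Calling wg.Done() before the
		// close would let the parent race with it.
		wg.Done()
	}()

	wg.Wait()
	log.add("parent closes all streams")
	return log.events
}

func main() {
	for _, e := range copyStreams() {
		fmt.Println(e)
	}
}
```

Because the parent's close runs only after `wg.Wait()` returns, and `wg.Done()` follows the stdin close, the parent's step is always the last event recorded.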
Manabu Sugimoto	211de08d9e	osbuilder: Remove chcon operation for guest SELinux Remove the `chcon` operation which adds `container_runtime_exec_t` label to the `kata-agent` binary because the container-selinux package including the `39f83cc74d` commit has been released officially. Ref. https://centos.pkgs.org/9-stream/centos-appstream-x86_64/container-selinux-2.221.0-1.el9.noarch.rpm.html The container-selinux package is installed in a guest rootfs when we create it with `SELinux = yes`, and `restorecon` sets `container_runtime_exec_t` to the `kata-agent`. Fixes: #7807 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-31 16:44:32 +09:00
GabyCT	b467f2ef68	Merge pull request #7772 from GabyCT/topic/fiolimit metrics: Enable FIO limits for kata metrics	2023-08-30 14:49:04 -06:00
Gabriela Cervantes	9f21fa9b39	metrics: Add report generator link to general documentation This PR adds the report generator link to general documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 16:55:14 +00:00
Gabriela Cervantes	c0ed5ea0ad	metrics: Add README for kata metrics report This PR adds the README for kata metrics report. Fixes #7802 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 16:36:08 +00:00
Fabiano Fidêncio	aa2b51a831	Merge pull request #7783 from GabyCT/topic/makereport metrics: Add metrics report script	2023-08-30 17:11:39 +02:00
Gabriela Cervantes	a7b59a5bf9	metrics: Add limit for 90 percentile for qemu value This PR adds the limit for 90 percentile for qemu value for FIO kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
Gabriela Cervantes	99db6568e9	metrics: Add limit for write 90 percentile value for clh This PR adds the limit for write 90 percentile value for clh for FIO metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
Gabriela Cervantes	6e06392c55	metrics: Enable FIO limits for kata metrics This PR enables the FIO limits for kata metrics. Fixes #7771 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-30 13:53:38 +00:00
David Esparza	924d06a7f5	Merge pull request #7787 from GabyCT/topic/fixmemoryinsidelimit metrics: Fix memory inside limits for kata metrics	2023-08-30 07:45:17 -06:00
Peng Tao	2e4c874726	runtime/vc: runPrestartHooks should ignore GetHypervisorPid failure If we are running FC hypervisor, it is not started when prestart hooks are executed. So we should just ignore such error and just go ahead and run the hooks. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 03:06:11 +00:00
Peng Tao	21204caf20	runtime: fail early when starting docker container with FC FC does not support network device hotplug. Let's add a check to fail early when starting containers created by docker. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 02:52:01 +00:00
Peng Tao	32fd013716	runtime: run prestart hooks before starting VM for FC Add a new hypervisor capability to tell if it supports device hotplug. If not, we should run prestart hooks before starting new VMs as nerdctl is using the prestart hooks to set up netns. To make nerdctl + FC to work, we need to run the prestart hooks before starting new VMs. Fixes: #6384 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-30 02:52:01 +00:00
Beraldo Leal	00e7ffd988	tests: check vmx only on Intel machines When running on AMD machines, those tests will fail because there is no vmx flag. Following other tests that check for cpuType, let's adapt them to restrict vmx only on Intel machines. Fixes #7788. Related #5066 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-08-29 20:04:31 -04:00
Gabriela Cervantes	c8dd3c0737	metrics: Fix memory footprint qemu limit This PR fixes the memory footprint qemu limit for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 22:51:21 +00:00
Gabriela Cervantes	8877ec62fb	metrics: Fix memory inside limits for kata metrics This PR fixes the memory inside limit for clh for kata metrics due to the recent changes in the script, which affected the performance measurement. Fixes #7786 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 21:38:18 +00:00
Beraldo Leal	80146f2078	tests: Fixes cpuType check on AMD machines cpuType is not initialized yet, so it gets 0 (Intel) by default, failing on AMD machines. Fixes #7785 Signed-off-by: Beraldo Leal <bleal@redhat.com>	2023-08-29 17:04:07 -04:00
Gabriela Cervantes	7e364716dd	metrics: Add test setup details to metrics report This PR adds test setup details to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:56:53 +00:00
Gabriela Cervantes	17dc1b9760	metrics: Add boot lifecycle times to metrics report This PR adds the boot lifecycle times to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:55:44 +00:00
Gabriela Cervantes	3b0d6538f2	metrics: Add memory inside container to metrics report This PR adds memory inside container to metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:53:17 +00:00
Gabriela Cervantes	79fbb9d243	metrics: Add scaling system footprint in metrics report This PR adds scaling system footprint in metrics report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:51:27 +00:00
Gabriela Cervantes	8e6d4e6f3d	metrics: Add metrics reportgen This PR adds metrics reportgen for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:45:36 +00:00
Gabriela Cervantes	139ffd4f75	metrics: Add report file titles This PR adds report file titles for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 17:43:06 +00:00
GabyCT	8f2dae7b53	Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results metrics: fix parsing issue on memory-usage test	2023-08-29 11:26:13 -06:00
Gabriela Cervantes	878d1a2e7d	metrics: Generate PNGs alongside the PDF report This PR generates the PNGs for the kata metrics PDF report. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:50:32 +00:00
Gabriela Cervantes	fce2487971	metrics: Add metrics report R files This PR adds the metrics report R files. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:45:22 +00:00
Gabriela Cervantes	08812074d1	metrics: Add report dockerfile This PR adds the report dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:28:32 +00:00
Gabriela Cervantes	69781fc027	metrics: Add metrics report script This PR adds metrics report script for kata metrics. Fixes #7782 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-29 16:25:14 +00:00
Chao Wu	e4fb20c74a	Merge pull request #7585 from lifupan/main dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…	2023-08-29 23:39:21 +08:00
Fabiano Fidêncio	50e51bcafe	Merge pull request #7185 from UnmeshDeodhar/add-cc-sev-test tests: Add confidential test	2023-08-29 15:32:25 +02:00
Fabiano Fidêncio	e286e842c1	tests: Expand confidential test to support TDX Let's expand the confidential test to also support TDX. The main difference on the test, though, is that we're not grepping for a string in the `dmesg` output, but rather relying on `cpuid` to detect a TDX guest. Fixes: #7184 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-29 14:10:47 +02:00
Unmesh Deodhar	e31f099be1	tests: Expand confidential test to support SNP Let's expand the confidential test to also support SNP. Fixes: #7184 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-08-29 14:10:47 +02:00
Unmesh Deodhar	c3b9d4945e	tests: Add confidential test for SEV Add a test case for the launch of unencrypted confidential container, verifying that we are running inside a TEE. Right now the test only works with SEV, but it'll be expanded in the coming commits, as part of this very same series. Fixes: #7184 Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-29 14:10:34 +02:00
David Esparza	538c965c2b	metrics: fix parsing issue on memory-usage test This PR fixes an issue in the results parsing stage, by collecting just the n results from the n running containers and discarding irrelevant data. Fixes: #7774 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-28 23:39:46 -06:00
Fabiano Fidêncio	708b0a3052	Merge pull request #7768 from fidencio/topic/update-tdx-to-the-6.2-kernel-based-stack tdx: Update the components needed for using the 6.2 kernel stack	2023-08-28 19:27:15 +02:00
Fabiano Fidêncio	3818bf3311	local-build: Remove $HOME/.docker/buildx/activity/default The file can be removed between builds without causing any issue, and leaving it around has been causing us some headache due to: ``` ERROR: open /home/runner/.docker/buildx/activity/default: permission denied ``` Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Fabiano Fidêncio	d1b54ede29	qemu: tdx: Workaround SMP issue with TDX 1.5 `...,sockets=1,cores=numvcpus,threads=1,...` must be used. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Archana Shinde	1e34220c41	qemu: tdx: Adapt to the TDX 1.5 stack QEMU for TDX 1.5 makes use of private memory map/unmap. Make changes to govmm to support this. Support for private backing fd for memory is added as knob to the qemu config. Userspace's map/unmap operations are done by fallocate() ioctl on the backing store fd. Reference: https://lore.kernel.org/linux-mm/20220519153713.819591-1-chao.p.peng@linux.intel.com/ Fixes: #7770 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com> Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:41:36 +02:00
Fabiano Fidêncio	8115a0522d	versions: tdx: Update Kernel to 6.2 + TDX This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:34 +02:00
Fabiano Fidêncio	ec18180f34	versions: tdx: Update TDVF to the "edk2-stable202302" This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:34 +02:00
Fabiano Fidêncio	9803b24286	versions: tdx: Update QEMU to v7.2 + TDX v1.10 This is the version that's been used and tested inside Intel, and it matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15. Fixes: #7770 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-28 13:11:27 +02:00
Fabiano Fidêncio	02a08c956b	Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment tests: delete k8s deployment at the test's end	2023-08-27 17:52:00 +02:00
Fabiano Fidêncio	98037ced52	Merge pull request #7755 from microsoft/danmihai1/unique-test-name tests: use unique test name	2023-08-27 17:27:40 +02:00
Zhongtao Hu	f0440a9cfe	Merge pull request #7742 from frezcirno/fix-log-forwarder-loop runtime-rs: check peer close in log_forwarder	2023-08-26 10:44:09 +08:00
Fabiano Fidêncio	16a610d788	Merge pull request #7758 from fidencio/topic/gha-avoid-fail-fast-till-everything-is-ultra-stable gha: Avoid "fail-fast" in tests that are known to be flaky	2023-08-25 16:49:26 +02:00
Jiang Liu	91db888d83	Merge pull request #7602 from jiangliu/agent-storage Refine storage device management for kata-agent	2023-08-25 22:20:18 +08:00
Zixuan Tan	dffc16e5b3	runtime-rs: check peer close in log_forwarder The log_forwarder task does not check if the peer has closed, causing a meaningless loop during the period of “kata vm exit”, when the peer closed, and “ShutdownContainer RPC received” that aborts the log forwarder. This patch fixes the problem. Fixes: #7741 Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>	2023-08-25 19:00:07 +08:00
Jiang Liu	aaa5ab1264	agent: simplify storage device by removing StorageDeviceObject Simplify storage device implementation by removing StorageDeviceObject. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-25 17:23:16 +08:00
Fabiano Fidêncio	fb49d5d7ce	gha: Avoid "fail-fast" in tests that are known to be flaky Otherwise we'll have to re-run all the tests due to a flaky behaviour in one of the parts. Fixes: #7757 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-25 10:00:17 +02:00
Dan Mihai	183f51d6f6	tests: use unique test name k8s-pid-ns.bats was already using the test name from k8s-kill-all-process-in-container.bats - probably a copy/paste bug. Fixes: #7753 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-25 03:41:06 +00:00
Dan Mihai	6a974679f2	tests: delete k8s deployment at the test's end At the end of k8s-kill-all-process-in-container.bats, delete the deployment it created. Fixes: #7752 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-25 03:34:37 +00:00
David Esparza	686eb3878b	Merge pull request #7751 from GabyCT/topic/unusednhwc metrics: Remove unused variable in tensorflow nhwc script	2023-08-24 18:34:06 -06:00
Fabiano Fidêncio	f1d8e1f513	Merge pull request #7747 from fidencio/topic/kata-deploy-dont-try-to-remove-opt-kata kata-deploy: Don't try to remove /opt/kata	2023-08-24 18:56:52 +02:00
Gabriela Cervantes	32a778b6da	metrics: Remove unused variable in tensorflow nhwc script This PR removes unused variable in tensorflow nhwc script. Fixes #7750 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-24 15:54:27 +00:00
David Esparza	875a85ee14	Merge pull request #7736 from GabyCT/topic/tensorflowfp32 metrics: Add TensorFlow ResNet50 FP32 benchmark	2023-08-24 08:56:24 -06:00
Fabiano Fidêncio	d8f3ce6497	kata-deploy: Don't try to remove /opt/kata The directory is a host path mount and cannot be removed from within the container. What we actually want to remove is whatever is inside that directory. This may raise errors like: ``` rm: cannot remove '/opt/kata/': Device or resource busy ``` Fixes: #7746 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-24 13:57:36 +02:00
Jeremi Piotrowski	71c90b994a	Merge pull request #7745 from jepio/vfio-part-0 gha: vfio: Run on Ubuntu 23.04 runner	2023-08-24 12:15:19 +02:00
Greg Kurz	9991772b26	Merge pull request #7718 from littlejawa/fix_filemode_when_zero kata-agent: use default filemode for block device when it is set to 0	2023-08-24 11:40:28 +02:00
Jeremi Piotrowski	936e8091a7	gha: vfio: Run on Ubuntu 23.04 runner The vfio test requires nested-nested virtualization: L0 Azure host -> L1 Ubuntu VM -> L2 Fedora VM -> L3 Kata This hits a kernel bug on v5.15 but works quite nicely on the v6.2 kernel included in Ubuntu 23.04. We can switch back to Ubuntu 22.04 when they roll out v6.2. Fixes: #6555 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-24 10:10:02 +02:00
Jiang Liu	0e7248264d	agent: move storage device related code into dedicated files Move storage device related code into dedicated files. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:48:51 +08:00
Xuewei Niu	268e846558	runtime-rs: Fix volumes and rootfs cleanup issues There are several processes for container exit: - Non-detach mode: `Wait` request is sent by containerd, then `wait_process()` will be called eventually. - Detach mode: `Wait` request is not sent, the `wait_process()` won’t be called. - Killed by ctr: For example, a container runs `tail -f /dev/null`, and is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is sent, then `kill_process()` will be called. User executes `sudo ctr c rm <CID>`, `Delete` request is sent, then `delete_process()` will be called. - Exited on its own: For example, a container runs `sleep 1s`. The container’s state goes to `Stopped` after 1 second. User executes the delete command as below. Where do we do container cleanup things? - `wait_process()`: No, because it won’t be called in detach mode. - `delete_process()`: No, because it depends on when the user executes the delete command. - `run_io_wait()`: Yes. A container is considered exited once its IO has ended, and this is always called once a container is launched. Fixes: #7713 Signed-off-by: Jianyong Wu <jianyong.wu@arm.com> Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-24 13:23:47 +08:00
Jiang Liu	8f49ee33b2	agent: refine storage related code a bit Refine storage related code by: - remove the STORAGE_HANDLER_LIST - define type alias - move code near to its caller Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:09:10 +08:00
Jiang Liu	60ca12ccb0	agent: switch to new storage subsystem Switch to new storage subsystem to create a StorageDevice for each storage object. Fixes: #7614 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:09:09 +08:00
Jiang Liu	fcbda0b419	kata-types: introduce StorageDevice and StorageHandlerManager Introduce StorageDevice and StorageHandlerManager, which will be used to refine storage device management for kata-agent. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 13:08:55 +08:00
Jiang Liu	b03b1f6134	agent: simplify the way to manage storage object Simplify the way to manage storage objects, and introduce StorageStateCommon structures for coming extensions. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:58:24 +08:00
Jiang Liu	8392c71bf2	sys-util: support more mount flags in parse_mount_options() Support more mount flags in parse_mount_options(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:39 +08:00
Jiang Liu	c00d8f3d48	agent: use create_mount_destination() from kata-sys-util Use create_mount_destination() from kata-sys-util crate to reduce redundant code. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:38 +08:00
Jiang Liu	5e867f0538	types: add more mount related constants Add more mount related constants. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:36 +08:00
Jiang Liu	880e6c9a76	agent: use function from kata-sys-utils to reduce code Use function get_linux_mount_info() from kata-sys-util crate to share common code. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-24 12:17:34 +08:00
QuanweiZhou	a6921dd837	Merge pull request #7698 from jiangliu/virtual-volume kata-types: introduce KataVirtualVolume to support nydus, direct volume and image pull	2023-08-24 11:50:39 +08:00
Fabiano Fidêncio	7705c5962e	Merge pull request #7728 from ManaSugi/fix/typo-test-toml libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml	2023-08-23 23:55:41 +02:00
GabyCT	c1712e1930	Merge pull request #7737 from jepio/fix-local-build local-build: Remove GID before creating group	2023-08-23 12:26:39 -06:00
Jeremi Piotrowski	3b881fbc0e	local-build: Remove GID before creating group docker install now creates a group with gid 999 which happens to match what we need to get docker-in-docker to work. Remove the group first as we don't need it. Fixes: #7726 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-23 18:58:38 +02:00
David Esparza	ebce5d25a9	Merge pull request #7734 from fidencio/topic/kata-deploy-fix-removal kata-deploy: Avoid failing on content removal	2023-08-23 10:29:57 -06:00
Gabriela Cervantes	959ca49447	metrics: Add TensorFlow ResNet50 fp32 Dockerfile This PR adds the TensorFlow ResNet50 fp32 Dockerfile for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-23 16:24:58 +00:00
Gabriela Cervantes	4b7d72c4a8	metrics: Add TensorFlow ResNet50 FP32 benchmark This PR adds TensorFlow ResNet50 FP32 benchmark for kata metrics. Fixes #7735 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-23 16:21:09 +00:00
Fabiano Fidêncio	e7e4cc2182	Merge pull request #7716 from bergwolf/github/image-initrd-assets runtime: fix image and initrd assets handling	2023-08-23 18:02:15 +02:00
Fabiano Fidêncio	5cba38c175	kata-deploy: Avoid failing on content removal We can simply use `rm -f` all over the place and avoid the container returning any error. Fixes: #7733 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-23 16:49:26 +02:00
Peng Tao	18d42da21e	runtime/fc: fix image/initrd annotation handling Right now if we configure an image annotation and have a config file setting initrd, the initrd config would override the image annotation. Make sure annotations are preferred over config options in image and initrd path handling. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:28 +00:00
Peng Tao	9fda7059a5	runtime/clh: fix image/initrd annotation handling We should make sure annotations are preferred over config options in image and initrd path handling. Fixes: #7705 Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:28 +00:00
Peng Tao	1a0092d631	runtime/qemu: fix image/initrd annotation handling Right now if we configure an image annotation and have a config file setting initrd, the initrd config would override the image annotation. Add a helper function ImageOrInitrdAssetPath to make sure annotations are preferred over config options in image and initrd path handling. Signed-off-by: Peng Tao <bergwolf@hyper.sh>	2023-08-23 03:47:27 +00:00
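The precedence rule described in the three runtime/fc, runtime/clh and runtime/qemu commits above can be sketched as follows. This is a minimal illustration in Rust, not the runtime's Go helper `ImageOrInitrdAssetPath`: the `AssetPaths` and `BootAsset` types, the `pick` and `image_or_initrd` functions, and the choice to prefer an image over an initrd within a single source are all assumptions made for the example.

```rust
/// Hypothetical result type: which guest boot asset was selected.
#[derive(Debug, PartialEq)]
enum BootAsset {
    Image(String),
    Initrd(String),
}

/// Hypothetical holder for the paths coming from one source
/// (either sandbox annotations or the configuration file).
#[derive(Default)]
struct AssetPaths {
    image: Option<String>,
    initrd: Option<String>,
}

/// Picks an asset from a single source.
/// Assumption: an image wins over an initrd inside the same source.
fn pick(paths: &AssetPaths) -> Option<BootAsset> {
    if let Some(image) = &paths.image {
        return Some(BootAsset::Image(image.clone()));
    }
    paths
        .initrd
        .as_ref()
        .map(|initrd| BootAsset::Initrd(initrd.clone()))
}

/// Annotations are consulted first; the configuration file is only
/// a fallback when the annotations name neither an image nor an initrd.
fn image_or_initrd(annotations: &AssetPaths, config: &AssetPaths) -> Option<BootAsset> {
    pick(annotations).or_else(|| pick(config))
}
```

With an image annotation and a configuration file that sets an initrd, `image_or_initrd` returns the annotated image, which is the behaviour the commit messages describe as the fix.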
Manabu Sugimoto	22d8f335d6	libs,tests: fix typo disable_guest_seccomp in configuration-anno-1.toml Change `pdisable_guest_seccomp` to `disable_guest_seccomp` Fixes: #7727 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-23 12:08:18 +09:00
GabyCT	b8990c0490	Merge pull request #7722 from GabyCT/topic/adddiskreadme metrics: Add disk link to README	2023-08-22 12:29:54 -06:00
GabyCT	514d3d42b8	Merge pull request #7712 from GabyCT/topic/fixfiopath metrics: Fix FIO path	2023-08-22 12:28:28 -06:00
Gabriela Cervantes	8afd158cef	metrics: Add disk link to README This PR adds disk link to README documentation for kata metrics. Fixes #7721 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-22 16:20:31 +00:00
Julien Ropé	40914b25d4	kata-agent: use default filemode for block device when it is set to 0 When the FileMode field for the device is unset (0), use a default value instead to allow the use of the device from the container. This behaviour is seen from cri-o typically. Note: this is what runc is doing, which is why regular containers don't have an issue. This change makes sure kata behaves the same as runc. Fixes: #7717 Signed-off-by: Julien Ropé <jrope@redhat.com>	2023-08-22 16:08:14 +02:00
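The fallback described in the commit above might look roughly like this. It is a sketch only: the function name `effective_file_mode` and the constant are hypothetical, and the value `0o666` is an assumed default chosen for illustration rather than one taken from the commit.

```rust
/// Assumed default permission bits for a device node (illustrative value).
const DEFAULT_DEVICE_FILE_MODE: u32 = 0o666;

/// Returns the mode to apply to a device node.
/// An absent or zero mode is treated as "unset" and replaced by the default,
/// so the container can still open the device.
fn effective_file_mode(requested: Option<u32>) -> u32 {
    match requested {
        Some(mode) if mode != 0 => mode,
        _ => DEFAULT_DEVICE_FILE_MODE,
    }
}
```

Treating `Some(0)` and `None` identically mirrors the message's wording that a FileMode of 0 means the field is unset.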
Fabiano Fidêncio	8032797418	Merge pull request #7708 from microsoft/danmihai1/kata-deploy-log gha: capture additional kata-deploy output	2023-08-21 23:43:51 +02:00
David Esparza	d2c130ea69	Merge pull request #7710 from GabyCT/topic/fixpytorch1 metrics: Use function from metrics common in pytorch script	2023-08-21 15:31:24 -06:00
Gabriela Cervantes	eee2ee6eeb	metrics: Fix FIO path This PR fixes the FIO path for the FIO files. Fixes #7711 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-21 21:06:04 +00:00
David Esparza	9347051592	Merge pull request #7666 from dborquez/metrics_improve_fio_test metrics: Enable kata runtime in K8s for FIO test.	2023-08-21 13:51:57 -06:00
Gabriela Cervantes	39bc3488f5	metrics: Use function from metrics common in pytorch script This PR uses a common function in the pytorch script. Fixes #7709 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-21 16:12:35 +00:00
Dan Mihai	400eb88743	gha: capture additional kata-deploy output 10 lines can be insufficient for diagnostics. Fixes: #7707 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-21 15:58:57 +00:00
GabyCT	700759232f	Merge pull request #7690 from GabyCT/topic/fixpytorch metrics: Fix README for pytorch	2023-08-21 09:50:14 -06:00
Jiang Liu	6e038e66e4	Merge pull request #7680 from GabyCT/topic/removetime metrics: Remove unused variable in tensorflow mobilenet script	2023-08-21 23:39:07 +08:00
Jiang Liu	4aee3eade0	kata-types: implement serde methods for KataVirtualVolume Implement serialization/deserialization methods for KataVirtualVolume. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:46:56 +08:00
Jiang Liu	b875e39323	kata-types: validate KataVirtualVolume object Implement method validate() for KataVirtualVolume to validate message format. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:42:07 +08:00
Jiang Liu	fa2fdc1057	kata-types: implement two conversion helpers for KataVirtualVolume Enable conversions from NydusExtraOptions/DirectVolumeMountInfo to KataVirtualVolume. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:35:26 +08:00
Jiang Liu	6326af20e3	kata-types: introduce KataVirtualVolume Introduce structure KataVirtualVolume to encapsulate information for extra mount options and direct volumes, so we could build a common infrastructure to handle these cases. Fixes: #7699 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-21 16:19:47 +08:00
Gabriela Cervantes	c8b43f8b3e	metrics: Fix README for pytorch This PR fixes the pytorch reference in the README file. Fixes #7689 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-18 20:14:49 +00:00
Aurélien	fa34d61805	Merge pull request #7664 from microsoft/danmihai1/agent-init-policy rootfs: agent: Policy support with AGENT_INIT=yes	2023-08-18 10:51:55 -07:00
Fabiano Fidêncio	7e66d1f6b5	Merge pull request #7649 from fidencio/topic/k8s-tests-remove-kata-deploy-tests gha: k8s: kata-deploy: Move kata-deploy specific tests from integration/kubernetes to functional/kata-deploy	2023-08-18 07:47:26 +02:00
David Esparza	fb571f8be9	metrics: Enable kata runtime in K8s for FIO test. This PR configures the corresponding kata runtime in K8s based on the tested hypervisor. This PR also enables FIO metrics test in the kata metrics-ci. Fixes: #7665 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-17 17:11:27 -06:00
Dan Mihai	cb056f8cb3	rootfs: agent: Policy support with AGENT_INIT=yes When building with AGENT_POLICY=yes and AGENT_INIT=yes: 1. Include OPA and the Policy settings in rootfs. 2. Start OPA from the kata agent. Before these changes, building with both AGENT_POLICY=yes and AGENT_INIT=yes was unsupported. Starting OPA from systemd (when AGENT_INIT=no) was already supported. Fixes: #7615 Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-17 22:37:58 +00:00
GabyCT	c358056a3f	Merge pull request #7685 from GabyCT/topic/changename metrics: Fix check results for tensorflow benchmark	2023-08-17 15:39:43 -06:00
Gabriela Cervantes	85c02828e1	metrics: Update tensorflow name in gha run script This PR updates tensorflow name in gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 20:17:48 +00:00
Gabriela Cervantes	e8a5119343	metrics: Fix check results for tensorflow benchmark This PR fixes the check results for tensorflow benchmark now that we change the name of the test. Fixes #7684 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 19:52:45 +00:00
Fabiano Fidêncio	2d896ad12f	gha: kata-deploy: Do the runtime class cleanup as part of the cleanup Instead of doing this as part of the test itself, let's ensure it's done before running the tests and during the tests cleanup. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 18:54:46 +02:00
Fabiano Fidêncio	4ffc2c86f3	gha: kata-deploy: Add the first kata-deploy test This test, at least for now, only checks whether the runtimeclasses have been properly created. This is just a migration from a test we had as part of the k8s suite. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 18:54:46 +02:00
GabyCT	4ba684e6e4	Merge pull request #7653 from GabyCT/topic/tensorflowfp32 metrics: Add Tensorflow ResNet50 int8 benchmark	2023-08-17 10:44:25 -06:00
Gabriela Cervantes	8616c050ae	metrics: Remove unused variable in tensorflow mobilenet script This PR removes unused variable in tensorflow mobilenet script. Fixes #7679 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-17 16:04:18 +00:00
Fabiano Fidêncio	285e616b5e	tests: common: Ensure test_type is used as part of the cluster's name By doing this we can make sure there won't be any clash on the cluster name created for either the k8s or the kata-deploy tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 14:22:16 +02:00
Fabiano Fidêncio	790bd3548d	tests: common: Don't fail if yq is not part of the cache This may happen on external runners. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 14:22:14 +02:00
Fabiano Fidêncio	ce6adecd0a	gha: kata-deploy: Add run-kata-deploy-tests.sh This will have the same function as run-k8s-tests.sh has, but for kata-deploy. Right now it doesn't have any tests, and the command to actually run the tests is commented out, but right now this is just a placeholder that will be populated sooner than later. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 09:49:03 +02:00
Fabiano Fidêncio	cfc29c11a3	gha: k8s: Stop running kata-deploy tests as part of the k8s suite In a follow-up series, we'll add a whole suite for the kata-deploy tests. With this in mind, let's already get rid of this one and avoid more kata-deploy tests to land here. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-17 09:48:54 +02:00
Fabiano Fidêncio	e470a650e0	Merge pull request #7654 from sprt/ci-fixes kata-deploy: Properly create default runtime class	2023-08-17 09:43:34 +02:00
Wedson Almeida Filho	962378606e	Merge pull request #7627 from wedsonaf/error-conv agent: simplify error handling	2023-08-16 21:02:38 -03:00
Aurélien Bombo	f4dd152863	tests: k8s: Call ensure_yq() in setup.sh It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing the yq error so let's try this instead. Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-08-16 14:13:56 -07:00
GabyCT	3d0cfc88c9	Merge pull request #7662 from GabyCT/topic/fixhelptensorflow metrics: Fix MobileNet help me description	2023-08-16 14:13:39 -06:00
Aurélien Bombo	339569b69c	kata-deploy: Properly create default runtime class The default `kata` runtime class would get created with the `kata` handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong hypervisor and broke CI. Fixes: #7663 Signed-off-by: Aurélien Bombo <abombo@microsoft.com>	2023-08-16 11:04:44 -07:00
Gabriela Cervantes	2a491e9b1f	metrics: Fix MobileNet help me description This PR fixes MobileNet help me description in the tensorflow script. Fixes #7661 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-16 15:25:39 +00:00
Fabiano Fidêncio	606e419fac	Merge pull request #7660 from fidencio/topic/add-kata-deploy-tests-as-part-of-the-ci gha: ci: Start running kata-deploy tests	2023-08-16 16:44:08 +02:00
Fabiano Fidêncio	d19a75e80c	gha: ci: Start running kata-deploy tests Let's add the tests as part of the ci.yaml, so they can be triggered as part of each PR. For this PR those tests won't be triggered, courtesy of the `pull_request_target` event we rely on. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-16 16:08:05 +02:00
Fabiano Fidêncio	4adcf2192e	Merge pull request #7651 from ManaSugi/runk/containerd-test runk: Modify kill command's error message for containerd tests	2023-08-16 15:37:48 +02:00
Zhongtao Hu	5c8a61a4c8	Merge pull request #7558 from openanolis/fix/driver_option runtime-rs: add driver option	2023-08-16 13:56:29 +08:00
Zhongtao Hu	d90f7ac689	runtime-rs: add unit test for block driver add unit test for block driver Fixes:#7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:45:27 +08:00
Zhongtao Hu	e44919f0da	runtime-rs: add load_test_config for unit test add load_test_config for unit test Fixes:#7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:32:56 +08:00
Zhongtao Hu	7f48a69379	runtime-rs: add driver option add driver option when handle linux devices Fixes:#7539 Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>	2023-08-16 11:32:49 +08:00
Gabriela Cervantes	bade6a5c3b	docs: Fix TensorFlow word across the document This PR fixes the TensorFlow word across the document to have uniformity across all the document. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 20:13:05 +00:00
Fabiano Fidêncio	0bc48eab60	Merge pull request #7640 from fidencio/topic/gha-cri-containerd-enable-tests gha: cri-containerd: Enable tests	2023-08-15 21:18:28 +02:00
Gabriela Cervantes	1a1b207760	docs: Add Tensorflow Resnet50 documentation This PR adds the Tensorflow Resnet50 documentation. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:46:44 +00:00
Gabriela Cervantes	24baededc0	metrics: Add Dockerfile for ResNet50 int8 This PR adds the dockerfile for ResNet50 int8 benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:38:26 +00:00
Gabriela Cervantes	6d971ba8df	metrics: Add Tensorflow ResNet50 int8 benchmark This PR adds the Tensorflow ResNet50 int8 script for kata metrics. Fixes #7652 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-15 17:30:22 +00:00
Manabu Sugimoto	25d151bd1b	runk: Modify kill command's error message for containerd tests The error message when the kill command is executed with the container's state == Stopped should be "container not running" because the containerd tests expect that OCI runtimes return the error message and compare it. If the error message is different from the expected one, the tests fail. Fixes: #7650 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-16 00:39:50 +09:00
GabyCT	0bbabeaaf8	Merge pull request #7644 from GabyCT/topic/renametensorflow metrics: Rename tensorflow scripts	2023-08-15 09:23:24 -06:00
Fabiano Fidêncio	46d25d908d	Merge pull request #7643 from fidencio/topic/add-functional-kata-deploy-tests gha: tests: Add kata-deploy functional tests -- Part 1	2023-08-15 15:23:48 +02:00
Fabiano Fidêncio	b3592ab25c	gha: cri-containerd: Enable tests As the cri-containerd tests have been fully migrated to GHA, let's make sure we get them running. Fixes: #6543 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:32:42 +02:00
Fabiano Fidêncio	84dd02e0f9	gha: cri-containerd: Add timeout to the crictl calls on testContainerStop As part of the runners, we're hitting a timeout that I cannot reproduce, at all, when allocating the same instance and running the tests manually. The default timeout to connect to the server is 2s when using `crictl`. Let's increase this to 20s. It's fairly important to mention that in the first tests I used a timeout of 10s, and that helped but we still hit issues every now and then. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	b29782984a	gha: cri-containerd: Show pod before deleting it It'll help us to debug failures with the pod stop / pod delete. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	ae0930824a	gha: cri-containerd: Print kata logs in case of error We need this to fully understand what are the issues we're facing. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	6c8b2ffa60	gha: cri-containerd: Group containerd logs This improves readability in case of failures by a lot. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Fabiano Fidêncio	9e898701f5	gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account Short commit log says it all. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-15 14:31:54 +02:00
Wedson Almeida Filho	76dac8f22c	agent: simplify error handling We extend the `Result` and `Option` types with associated types that allow converting a `Result<T, E>` and `Option<T>` into `ttrpc::Result<T>`. This allows the elimination of many `match` statements in favor of calling the map function plus the `?` operator. This transformation simplifies the code. Fixes: #7624 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-15 06:55:27 -03:00
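The general technique named in the commit above, converting `Result<T, E>` and `Option<T>` into an RPC result so that `?` can replace `match`, can be sketched with extension traits. Everything here is a stand-in: `RpcError` and `RpcResult` substitute for the ttrpc types, the trait and method names (`ResultExt::map_rpc_err`, `OptionExt::ok_or_rpc_err`) and the numeric codes are hypothetical, and the agent's actual traits may be shaped differently.

```rust
use std::fmt::Display;

/// Stand-in for an RPC error type (not the real ttrpc type).
#[derive(Debug, PartialEq)]
struct RpcError {
    code: u32,
    message: String,
}

/// Stand-in for `ttrpc::Result<T>`.
type RpcResult<T> = Result<T, RpcError>;

// Illustrative status codes, assumed for this sketch.
const NOT_FOUND: u32 = 5;
const INTERNAL: u32 = 13;

/// Extension trait: turn any displayable error into an `RpcError`.
trait ResultExt<T> {
    fn map_rpc_err(self, code: u32) -> RpcResult<T>;
}

impl<T, E: Display> ResultExt<T> for Result<T, E> {
    fn map_rpc_err(self, code: u32) -> RpcResult<T> {
        self.map_err(|e| RpcError { code, message: e.to_string() })
    }
}

/// Extension trait: turn a missing value into an `RpcError`.
trait OptionExt<T> {
    fn ok_or_rpc_err(self, code: u32, message: &str) -> RpcResult<T>;
}

impl<T> OptionExt<T> for Option<T> {
    fn ok_or_rpc_err(self, code: u32, message: &str) -> RpcResult<T> {
        self.ok_or_else(|| RpcError { code, message: message.to_string() })
    }
}

/// Example handler body: no `match` statements, only conversions plus `?`.
fn parse_port(raw: Option<&str>) -> RpcResult<u16> {
    let text = raw.ok_or_rpc_err(NOT_FOUND, "port missing")?;
    text.parse::<u16>().map_rpc_err(INTERNAL)
}
```

Calling `parse_port(Some("8080"))` yields `Ok(8080)`, while a missing or unparsable value surfaces as an `RpcError` carrying the chosen code.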
Fabiano Fidêncio	e107d1d94e	Merge pull request #7574 from microsoft/danmihai1/policy agent: runtime: add Agent Policy feature	2023-08-15 11:29:13 +02:00
Bin Liu	ea81eb6c2e	Merge pull request #7169 from chethanah/runk/support-no-pid-ns runk: Support without pid ns	2023-08-15 13:00:40 +08:00
Gabriela Cervantes	18a7fd8e4e	metrics: Rename tensorflow scripts This PR renames the tensorflow scripts to include the data format that is being used as we will have multiple tests with different data and model formats for tensorflow so this will help us to distinguish them. Fixes #7645 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-14 20:40:35 +00:00
GabyCT	a740c80251	Merge pull request #7626 from GabyCT/topic/cassandrak metrics: Add Cassandra Kubernetes benchmark for kata metrics	2023-08-14 14:22:52 -06:00
GabyCT	4e5e39e8b3	Merge pull request #7618 from GabyCT/topic/addfunctionscommon metrics: Add common functions to the common script	2023-08-14 14:22:30 -06:00
GabyCT	a19d471c01	Merge pull request #7629 from dborquez/metrics_improve_stopping_kata_components metrics: fix the loop used to stop kata components	2023-08-14 14:22:06 -06:00
Fabiano Fidêncio	e55fa93db9	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-tdx This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build on top of that in an upcoming series. Fixes: #7642 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 21:38:00 +02:00
Fabiano Fidêncio	d9ee17aaec	tests: kata-deploy: Add placeholder for kata-deploy-tests-on-aks This will not be tested as part of the PR, thanks to the `pull_request_target` event, but we want it to be added so we can build on top of that in an upcoming series. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 21:37:52 +02:00
Chelsea Mafrica	22465d22f0	Merge pull request #7638 from ManaSugi/fix/virtcontainers-doc docs: Remove installation step in virtcontainers doc	2023-08-14 10:21:57 -07:00
Dan Mihai	ab829d1038	agent: runtime: add the Agent Policy feature Fixes: #7573 To enable this feature, build your rootfs using AGENT_POLICY=yes. The default is AGENT_POLICY=no. Building rootfs using AGENT_POLICY=yes has the following effects: 1. The kata-opa service gets included in the Guest image. 2. The agent gets built using AGENT_POLICY=yes. After this patch, the shim calls SetPolicy if and only if a Policy annotation is attached to the sandbox/pod. When creating a sandbox/pod that doesn't have an attached Policy annotation: 1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses the default agent settings, that might include a default Policy too. 2. If the agent was built using AGENT_POLICY=no, the new sandbox is executed the same way as before this patch. Any SetPolicy calls from the shim to the agent fail if the agent was built using AGENT_POLICY=no. If the agent was built using AGENT_POLICY=yes: 1. The agent reads the contents of a default policy file during sandbox start-up. 2. The agent then connects to the OPA service on localhost and sends the default policy to OPA. 3. If the shim calls SetPolicy: a. The agent checks if SetPolicy is allowed by the current policy (the current policy is typically the default policy mentioned above). b. If SetPolicy is allowed, the agent deletes the current policy from OPA and replaces it with the new policy it received from the shim. A typical new policy from the shim doesn't allow any future SetPolicy calls. 4. For every agent rpc API call, the agent asks OPA if that call should be allowed. OPA allows or denies a call based on the current policy, the name of the agent API, and the API call's inputs. The agent rejects any calls that are rejected by OPA. When building using AGENT_POLICY_DEBUG=yes, additional Policy logging gets enabled in the agent. In particular, information about the inputs for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM. These inputs can be useful for investigating API calls that might have been rejected by the Policy. Examples: 1. Load a failing policy file test1.rego on a different machine: opa run --server --addr 127.0.0.1:8181 test1.rego 2. Collect the API inputs from Guest's /tmp/policy.txt and test on the machine where the failing policy has been loaded: curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \ --data-binary @test1-inputs.json Signed-off-by: Dan Mihai <dmihai@microsoft.com>	2023-08-14 17:07:35 +00:00
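The SetPolicy gating described in the commit above can be modelled in a few lines. This is a heavily simplified sketch under stated assumptions: the `Policy` and `Agent` types and their methods are hypothetical, a policy is reduced to a set of allowed API names (real policies are Rego documents evaluated by OPA and can also inspect each call's inputs), and no OPA service is contacted.

```rust
use std::collections::HashSet;

/// Stand-in for a policy: only the set of API names it allows.
#[derive(Clone, Debug, PartialEq)]
struct Policy {
    allowed: HashSet<String>,
}

impl Policy {
    fn allowing(names: &[&str]) -> Self {
        Policy {
            allowed: names.iter().map(|n| n.to_string()).collect(),
        }
    }
}

/// Hypothetical agent state: the policy currently in force.
struct Agent {
    current: Policy,
}

impl Agent {
    /// The agent starts with the default policy read at sandbox start-up.
    fn new(default_policy: Policy) -> Self {
        Agent { current: default_policy }
    }

    /// Every API call is checked against the current policy.
    fn handle(&self, api: &str) -> Result<(), String> {
        if self.current.allowed.contains(api) {
            Ok(())
        } else {
            Err(format!("{api} rejected by policy"))
        }
    }

    /// SetPolicy is itself an API call: it must be allowed by the current
    /// policy before the new policy replaces the old one.
    fn set_policy(&mut self, new_policy: Policy) -> Result<(), String> {
        self.handle("SetPolicy")?;
        self.current = new_policy;
        Ok(())
    }
}
```

If the default policy allows `SetPolicy` and the policy supplied by the shim does not, the first `set_policy` call succeeds and every later one is rejected, which matches the message's note that a typical new policy blocks future SetPolicy calls.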
Fabiano Fidêncio	831e73ff91	tests: kata-deploy: Add functional/kata-deploy/gha-run.sh placeholder Right now this file does nothing, as it's not even called by any GHA. However, it'll be populated later on as part of a different series, where we'll have kata-deploy specific tests running here. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 17:46:10 +02:00
Fabiano Fidêncio	af1b46bbf2	tests: Add gha-run-k8s-common.sh Let's split a good portion of `tests/integration/kubernetes/gha-run.sh` out, and put them in a place where they can be used by the soon-to-come kata-deploy specific tests. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-14 17:45:58 +02:00
Jeremi Piotrowski	a57e7ffe14	Merge pull request #7211 from stevenhorsman/propogate-secrets Propagate secrets, config maps etc into guest if sharedFS not available	2023-08-14 11:24:47 +02:00
Manabu Sugimoto	416445e7eb	docs: Remove installation step in virtcontainers doc Remove the installation step in the virtcontainers doc because the virtcontainers install/uninstall targets have been removed by `86723b51ae` and they are not used anymore. Fixes: #7637 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-14 15:15:24 +09:00
Fabiano Fidêncio	b975c27793	Merge pull request #7547 from stevefan1999-personal/patch-k0s kata-deploy: Preliminary k0s support	2023-08-12 14:28:13 +02:00
Fabiano Fidêncio	6ed57d1e9a	Merge pull request #7447 from fidencio/topic/gha-move-static-jenkins-to-azure-instances gha: static-checks: Move to the Azure instances	2023-08-12 13:31:54 +02:00
Steve Fan	72cbcf040b	kata-deploy: Add k0s support Add k0s support to kata-deploy, in the very same way kata-containers already supports k3s, and rke2. k0s support requires v1.27.1, which is noted as part of the kata-deploy documentation, as it's the way to use dynamic configuration on containerd CRI runtimes. This support will only be part of the `main` branch, as it's not a bug fix that can be backported to the `stable-3.2` branch, and this is also noted as part of the documentation. Fixes: #7548 Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>	2023-08-11 21:17:23 +02:00
David Esparza	767434d50a	metrics: fix the loop used to stop kata components #7629 This PR fixed the loop that stops the kata-shim and the hypervisors used in metrics checks. Fixes: #7628 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-11 12:32:41 -06:00
Gabriela Cervantes	5d0f0d43c7	metrics: Add cassandra statefulset yaml This PR adds cassandra statefulset yaml for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:39 +00:00
Gabriela Cervantes	c1dcc1396f	metrics: Add cassandra service yaml This PR adds the cassandra service yaml for the benchmark. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:36 +00:00
Gabriela Cervantes	2297a0d1c5	metrics: Add block loop pvc yaml for cassandra This PR adds block loop pvc yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:33 +00:00
Gabriela Cervantes	e3d511946f	metrics: Add block loop pv yaml for cassandra test This PR adds the block loop pv yaml for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:29 +00:00
Gabriela Cervantes	9890271594	metrics: Add block loop pvc for cassandra test This PR adds the block loop pvc for cassandra test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:22:19 +00:00
Gabriela Cervantes	349b89969a	metrics: Add Cassandra Kubernetes benchmark for kata metrics This PR adds Cassandra Kubernetes benchmark for kata metrics tests. Fixes #7625 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-11 17:21:48 +00:00
Fabiano Fidêncio	c52d090522	gha: static-checks: Move to the Azure instances The GHA runners are not exactly powerful, which makes the static-checks take way too long (almost an hour). Let's give it a try and move those to the same size of Azure instances used as part of our CI, and probably have this time reduced. Fixes: #7446 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-11 18:47:47 +02:00
stevenhorsman	8815ed0665	runtime: Remove config warnings Remove configuration file shared_fs = none warnings now that there is a solution to updating configMaps, secrets etc Fixes: #7210 Signed-off-by: stevenhorsman <steven@uk.ibm.com>	2023-08-11 16:31:08 +01:00
Yohei Ueda	afe1a6ac5a	agent: support copying of directories and symlinks This patch allows copying of directories and symlinks when static file copying is used between host and guest. This change is necessary to support recursive file copying between shim and agent. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> (cherry picked from commit `de232b8030`)	2023-08-11 16:31:08 +01:00
Pradipta Banerjee	ab13ef87ee	runtime: propagate configmap/secrets etc changes for remote-hyp For remote hypervisor, the configmap, secrets, downward-api or project-volumes are copied from host to guest. This patch watches for changes to the host files and copies the changes to the guest. Note that configmap updates take significantly longer than updates via downward-api. This is similar across runc and Kata runtimes. Fixes: #7210 Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com> Signed-off-by: Julien Ropé <jrope@redhat.com> (cherry picked from commit `3081cd5f8e`) (cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)	2023-08-11 16:31:08 +01:00
Yohei Ueda	c074ec4df1	runtime: Copy shared files recursively This patch enables recursive file copying when filesystem sharing is not used. Signed-off-by: Yohei Ueda <yohei@jp.ibm.com> Co-authored-by: stevenhorsman <steven@uk.ibm.com> (cherry picked from commit `5422a056f2`) (cherry picked from commit 16055ce040bbd724be2916bc518d89b69c9e0ca5) Fixes: #7210	2023-08-11 16:16:52 +01:00
Peng Tao	a39fd6c066	Merge pull request #7611 from ManaSugi/fix/fc-version versions: Update firecracker version to 1.4.0	2023-08-11 16:43:37 +08:00
Chao Wu	7031b5db07	Merge pull request #7535 from ManaSugi/fix/allow-redundant-clone agent: Allow clippy::redundant_clone in the unit tests	2023-08-11 14:17:56 +08:00
Gabriela Cervantes	fdcd52ff78	metrics: Add check containers are running in tensorflow mobilenet This PR adds check containers are running in tensorflow mobilenet that is being defined in common script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:17:20 +00:00
Gabriela Cervantes	36337ee146	metrics: Add check containers are up in tensorflow script This PR adds the check containers are up function from common in tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:15:18 +00:00
Gabriela Cervantes	f700f9b0ba	metrics: Remove unused variable in tensorflow script This PR removes an unused variable in tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:13:37 +00:00
Gabriela Cervantes	833cf7a684	metrics: Add check containers are running function This PR adds the check containers are running function to the common metrics script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:12:22 +00:00
Gabriela Cervantes	918c783084	metrics: Add check containers are up in tensorflow mobilenet script This PR adds the check containers are up in the common script in the tensorflow mobilenet script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 20:06:40 +00:00
Gabriela Cervantes	9d57a1fab4	metrics: Use check containers are up in tensorflow script This PR uses the check containers are up from the common script in the tensorflow script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:42:09 +00:00
Gabriela Cervantes	1c84680d8c	metrics: Add check containers are up in common script This PR adds check containers are up in common script for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:39:24 +00:00
Gabriela Cervantes	d3e57cf454	metrics: Use collect_results function in tensorflow mobilenet test This PR uses the collect results function defined in common for the tensorflow mobilenet test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:34:30 +00:00
Gabriela Cervantes	286de046af	metrics: Remove collect results function definition This PR removes the collect results function from tensorflow script as it is going to be referenced in the common metrics script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:31:23 +00:00
Gabriela Cervantes	9879709aae	metrics: Add common functions to the common script This PR adds the collect results function to the common metrics script. Fixes #7617 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-10 17:27:11 +00:00
Fabiano Fidêncio	a89c9cd620	Merge pull request #7557 from wedsonaf/no-new-vecs agent: avoid creating new `Vec` instances when easily avoidable	2023-08-10 18:43:46 +02:00
Manabu Sugimoto	4746fa3daa	docs: Specify supported Firecracker version using `versions.yaml` Specify the supported version of Firecracker using our `versions.yaml` to improve the maintainability of the documentation. Fixes: #7610 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-10 16:49:45 +09:00
Manabu Sugimoto	cc922be5ec	versions: Update firecracker version to 1.4.0 This patch upgrades Firecracker version from v1.1.0 to v1.4.0. * Generate swagger models for v1.4.0 (from `firecracker.yaml`) - The version of go-swagger used is v0.30.0 * The firecracker v1.4.0 includes the following changes. - Added * Added support for custom CPU templates allowing users to adjust vCPU features exposed to the guest via CPUID, MSRs and ARM registers. * Introduced V1N1 static CPU template for ARM to represent Neoverse V1 CPU as Neoverse N1. * Added support for the virtio-rng entropy device. The device is optional. A single device can be enabled per VM using the /entropy endpoint. * Added a cpu-template-helper tool for assisting with creating and managing custom CPU templates. - Changed * Set FDP_EXCPTN_ONLY bit (CPUID.7h.0:EBX[6]) and ZERO_FCS_FDS bit (CPUID.7h.0:EBX[13]) in Intel's CPUID normalization process. - Fixed * Fixed feature flags in T2S CPU template on Intel Ice Lake. * Fixed CPUID leaf 0xb to be exposed to guests running on AMD host. * Fixed a performance regression in the jailer logic for closing open file descriptors. * A race condition that has been identified between the API thread and the VMM thread due to a misconfiguration of the api_event_fd. * Fixed CPUID leaf 0x1 to disable perfmon and debug feature on x86 host. * Fixed passing through cache information from host in CPUID leaf 0x80000006. * Fixed the T2S CPU template to set the RRSBA bit of the IA32_ARCH_CAPABILITIES MSR to 1 in accordance with an Intel microcode update. * Fixed the T2CL CPU template to pass through the RSBA and RRSBA bits of the IA32_ARCH_CAPABILITIES MSR from the host in accordance with an Intel microcode update. * Fixed passing through cache information from host in CPUID leaf 0x80000005. * Fixed the T2A CPU template to disable SVM (nested virtualization). 
* Fixed the T2A CPU template to set EferLmsleUnsupported bit (CPUID.80000008h:EBX[20]), which indicates that EFER[LMSLE] is not supported. Fixes: #7610 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-10 16:48:13 +09:00
Fupan Li	39e67b06e9	dragonball: vsock add fifo/pipe stream support for passed fd hybridStream Since the passed fd through unix socket would be any stream fd such as pipe/fifo fd or any other socket fd, thus we should deal with it as a normal hybrid stream instead of a unix stream. Fixes:#7584 Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>	2023-08-10 11:07:10 +08:00
David Esparza	7bf994827d	Merge pull request #7609 from dborquez/tensorflow_check_completion metrics: compute tensorflow statistics	2023-08-09 18:47:47 -06:00
David Esparza	dcdb3b067f	Merge pull request #7606 from GabyCT/topic/nginx metrics: Add network nginx benchmark	2023-08-09 16:14:13 -06:00
David Esparza	2defdcc598	Merge pull request #7579 from dborquez/simplify_gha_metrics_workflow metrics: install kata once and run multiple checks	2023-08-09 14:45:09 -06:00
David Esparza	473b0d3a31	metrics: compute tensorflow statistics This PR computes average results for TF bench. Additionally, it improves the data parsing from all running containers. Fixes: #7603 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-09 14:42:30 -06:00
Fabiano Fidêncio	0a8208c670	Merge pull request #7608 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-3 ci: unencrypted-image: Fix build context	2023-08-09 21:00:46 +02:00
Fabiano Fidêncio	03d1fa67b1	ci: unencrypted-image: Fix build context The build context should be the folder where the Dockerfile is present, otherwise the files copied into the image won't be found. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 20:32:36 +02:00
Fabiano Fidêncio	eb463b38ec	ci: unencrypted-image: Don't fail to build on s390x Let's make sure that we don't fail in case we're building non x86_64. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 20:32:36 +02:00
Fabiano Fidêncio	ebc86091d1	Merge pull request #7607 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-2 ci: create-confidential-image: Add dependent actions	2023-08-09 19:53:49 +02:00
Fabiano Fidêncio	a2d731ad26	ci: create-confidential-image: Add dependent actions Following the example on https://github.com/docker/build-push-action, it's clear that the actions to "Set up QEMU" and "Set up Docker Buildx" are missing. Let's add them, and also take the advantage to bump the build-push-action to its v4, which, by the way, had a typo on its name (build-and-push-action does NOT exist, build-push-action does). Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 18:36:51 +02:00
Gabriela Cervantes	d1a6296221	metrics: Add nginx documentation to network README This PR adds nginx documentation to network README for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:17:46 +00:00
Gabriela Cervantes	498f7c0549	metrics: Add nginx kubernetes yaml This PR adds the nginx kubernetes yaml. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:14:04 +00:00
Gabriela Cervantes	f8a5255cf7	metrics: Add network nginx benchmark This PR adds the network nginx benchmark for kata metrics. Fixes #7605 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-09 16:12:21 +00:00
Fabiano Fidêncio	86f705d98b	Merge pull request #7604 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests-follow-up-1 Follow up fixes for https://github.com/kata-containers/kata-containers/pull/7596	2023-08-09 18:05:46 +02:00
Fabiano Fidêncio	43fe5d1b90	ci: k8s: tees: Ensure PR_NUMBER is exported Right now this is not being used, but it will be, as the images generated for the confidential tests have it as part of their tag. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 17:45:42 +02:00
Fabiano Fidêncio	54f6a78500	ci: {{ pr-number }} should be {{ inputs.pr-number }} One of the joys of relying on `pull_request_target` is only being able to catch such errors after they are merged. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 17:41:07 +02:00
Fabiano Fidêncio	5cdf981a2b	Merge pull request #7596 from fidencio/topic/create-image-to-be-used-by-the-confidential-tests tests: Create image that will be used in the unencrypted confidential tests	2023-08-09 17:06:07 +02:00
Fabiano Fidêncio	c932369f42	Merge pull request #7492 from fidencio/topic/adapt-tests-to-the-new-kata-deploy-env-vars kata-deploy: Ensure we cover SHIMS / DEFAULT_SHIM as part of our tests	2023-08-09 12:55:03 +02:00
Fabiano Fidêncio	034d7aab87	tests: k8s: Ensure the runtime classes are properly created With these 2 simple checks we can ensure that we do not regress on the behaviour of allowing the runtime classes / default runtime class to be created by the kata-deploy payload. Fixes: #7491 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:46:04 +02:00
Fabiano Fidêncio	fac8ccf5cd	ci: Add build-and-publish-tee-confidential-unencrypted-image This will be done before running TEE tests, and it's a hard dependency for them. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:36:10 +02:00
Fabiano Fidêncio	ab5f603ffa	ci: k8s: Add the image used for unencrypted confidential tests Let's add here the image we'll be using for unencrypted confidential tests. Later on, we'll make sure to build and use this image as part of our CI. The image can easily be built as a multi-arch image, and has `cpuid` installed in case of `x86_64` build, so it can be used to detect whether we're running on a TEE guest without having to rely on `dmesg \| grep ...`. Fixes: #7595 Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:33:18 +02:00
Fabiano Fidêncio	36d53dd2af	Merge pull request #7598 from UnmeshDeodhar/upgrade-bats-version tests: upgrade bats version	2023-08-09 11:18:56 +02:00
Fabiano Fidêncio	1e8fe131bd	k8s: tests: Take advantage of `SHIMS` and `DEFAULT_SHIM` env vars We no longer need any sed to replace the runtimeclass being used once we start taking advantage of the `DEFAULT_SHIM` environment variable exposed in the previous commits. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-09 11:15:34 +02:00
Wedson Almeida Filho	729b2dd611	agent: avoid creating new `Vec` instances when easily avoidable There are many places where the code currently creates new `Vec` instances when it's not really needed. The result is a perf hit because it allocates memory, copies all elements, then frees the memory; in some cases, copying elements also involves extra allocations (e.g., when elements are strings, or structs containing strings). This patch addresses a number of these cases. Fixes: #7203 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-09 02:38:36 -03:00
Jiang Liu	311671abb5	Merge pull request #7552 from jiangliu/agent-r1 Fix minor bugs and improve coding style of agent rpc/sandbox/mount	2023-08-09 13:19:02 +08:00
Unmesh Deodhar	aeaec9dae9	tests: upgrade bats version Instead of using the package manager to install bats, build it from source. This gives us the updated version of bats which supports functions such as setup_file and teardown_file. We can use these functions in our current tests. Fixes: #7597 Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>	2023-08-08 18:16:39 -05:00
David Esparza	e664969862	metrics: install kata once and run multiple checks This PR changes the metrics workflow in order to just install kata once, and run the checks for multiple hypervisor variations. In this way we save time avoiding installing kata for each hypervisor to be tested. Fixes: #7578 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-08 10:25:13 -06:00
Jiang Liu	baabfa9f1f	agent: refine implementation of mount related code Refine implementation of mount by: - log message with `path.display()` instead of `{:?}` - add prefix "_" to unused variables - pass by reference instead of by value to avoid creating redundant array - exactly matching prefix "fsgid=" instead of "fsgid" - avoid redundant clone() operations Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:03 +08:00
Jiang Liu	98ba211a34	agent: fix a bug in update_ephemeral_mounts() There's a bug in function update_ephemeral_mounts() which only handles the first storage object and ignores all other storage objects. Fixes: #7551 Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:02 +08:00
Jiang Liu	5333618d70	agent: make add_storage() take &[Storage] instead of Vec<Storage> Simplify add_storage() by taking &[Storage] instead of Vec<Storage>. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:01 +08:00
Jiang Liu	37f34781d1	agent: simplify function online_cpu_memory() Simplify function online_cpu_memory() by only calling update_cpuset_path() for containers with cpuset configured. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:03:00 +08:00
Jiang Liu	d3c5422379	agent: refine style of code related to sandbox Refine style of code related to sandbox by: - remove unnecessary comments for caller to take lock, we have already taken `&mut self`. - change "count < 1 " to "count == 0", `count` is type of u32. - make remove_sandbox_storage() to take `&mut self` instead of `&self`. - group related function to each others - avoid search the map twice in function find_process() - avoid unwrap() in function run_oom_event_monitor() - avoid unwrap() in online_resources() Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:59 +08:00
Jiang Liu	71a9f67781	agent: avoid unwrap() in function do_remove_container() Avoid unwrap() in function do_remove_container(), and also make implementation symmetric for both timeout and non-timeout cases. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:58 +08:00
Jiang Liu	84badd89d7	agent: avoid clone objects when possible Optimize agent rpc implementation by: - avoid clone objects when possible - avoid unwrap() when possible - explicitly drop object to ensure order Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>	2023-08-08 18:02:56 +08:00
Chao Wu	b098960442	Merge pull request #7581 from justxuewei/bump-versions deps: Bump dependent crate versions	2023-08-08 15:16:57 +08:00
Chao Wu	24bf637835	Merge pull request #7500 from pmores/fix-queue-num-in-dragonball-share-fs fix number of queues handling in dragonball share fs device	2023-08-08 12:07:25 +08:00
Xuewei Niu	b23c5ed155	deps: Bump dependent crate versions This pull request is mainly for updating vm-memory and vmm-sys-util. The affected crates include: - vm-memory: from 0.9.0 to 0.10.0 - vmm-sys-util: from 0.10.0 to 0.11.0 - virtio-queue: from 0.6.0 to 0.7.0 - fuse-backend-rs: from 0.10.4 to 0.10.5 - linux-loader: from 0.6.0 to 0.8.0 - nydus-api: from 0.3.0 to 0.3.1 - nydus-rafs: from 0.3.1 to 0.3.2 - nydus-storage: from 0.6.3 to 0.6.4 Fixes: #0000 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-08 11:54:09 +08:00
Fupan Li	5a20d8dcaf	Merge pull request #7383 from justxuewei/dan runtime-rs: Introduce directly attachable network	2023-08-08 09:54:28 +08:00
Chelsea Mafrica	553fd79ea9	Merge pull request #7572 from GabyCT/topic/resnet50fp32 metrics: General improvements to mobilenet tensorflow test	2023-08-07 13:33:28 -07:00
GabyCT	194120b679	Merge pull request #7540 from GabyCT/topic/enableiperf gha: Add iperf network metrics	2023-08-07 13:40:02 -06:00
Gabriela Cervantes	863283716d	metrics: General improvements to mobilenet tensorflow test This PR renames the mobilenet tensorflow test to have a more specific tensorflow name mainly because tensorflow has different configurations and we will add more tensorflow tests so we want to distinguish each tensorflow test. Fixes #7571 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:50:00 +00:00
Gabriela Cervantes	3c319d8d4c	metrics: Add iperf to gha run script This PR adds iperf to gha run script. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:20:00 +00:00
Gabriela Cervantes	5b5caf8908	gha: Add iperf network metrics This PR adds the iperf network metrics to the github actions for kata metrics. Fixes #7535 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-07 16:20:00 +00:00
Chelsea Mafrica	4559caf619	Merge pull request #7467 from ManaSugi/doc/use-k8-control-plane docs: Use control-plane term instead of master	2023-08-06 23:40:51 -07:00
Fabiano Fidêncio	b365bef570	Merge pull request #7191 from wedsonaf/avoid-clones agent: avoid unnecessary calls to `Arc::clone`	2023-08-06 15:34:07 +02:00
GabyCT	7144acb2a5	Merge pull request #7527 from GabyCT/topic/latency metrics: Add network latency test	2023-08-04 15:54:07 -06:00
Gabriela Cervantes	66db5b5350	metrics: Add latency test to network README This PR adds latency test to network README for kata metrics. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-04 20:27:27 +00:00
Wedson Almeida Filho	c36572418f	agent: avoid unnecessary calls to `Arc::clone` These calls cause two extra atomic instructions each time they're used, one to increment and another one to decrement the refcount. Since we don't need them because the referred value is guaranteed to outlive the function, remove the calls. Fixes: #7190 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 20:53:05 -03:00
Fabiano Fidêncio	8c03deac3a	Merge pull request #7106 from wedsonaf/image-pulling Image pulling on the host	2023-08-04 01:08:42 +02:00
Wedson Almeida Filho	4fbe0a3a53	runtime: bind-mount mounted block device into container When the mounted block device isn't a layer, we want to mount it into containers, but since it's already mounted with the correct fs (e.g., tar, ext4, etc.) in the pod, we just bind-mount it into the container. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	7e1b1949d4	runtime: add support for kata overlays When at least one `io.katacontainers.fs-opt.layer` option is added to the rootfs, it gets inserted into the VM as a layer, and the file system is mounted as an overlay of all layers using the overlayfs driver. Additionally, if the `io.katacontainers.fs-opt.block_device=file` option is present in a layer, it is mounted as a block device backed by a file on the host. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	6c867d9e86	agent: add io.katacontainers.fs-opt.overlay-rw option This causes the overlay-fs driver to add the `upperdir` and `workdir` options to an overlay-fs mount so that the mount becomes writable using a discardable directory under the container id. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Wedson Almeida Filho	6163c35657	agent: skip mount options that start with "io.katacontainers." This is so that file systems don't fail when we pass kata-specific options from the snapshotter to kata. Fixes: #7536 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 17:58:39 -03:00
Fabiano Fidêncio	fa35afa982	Merge pull request #7542 from wedsonaf/ci-fix Use version 0.10.4 of `fuse-backend-rs`	2023-08-03 22:50:11 +02:00
Wedson Almeida Filho	b2ff97aa01	dragonball: use version 0.10.4 of `fuse-backend-rs` Version 0.10.5, which was just released, breaks `nydus-storage`. This is a workaround to fix the CI which is blocking other PRs. Fixes: #7541 Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>	2023-08-03 14:15:17 -03:00
Fabiano Fidêncio	ebdae7cfdf	Merge pull request #7520 from jepio/host-systemctl kata-deploy: Use host's systemctl	2023-08-03 13:53:28 +02:00
Manabu Sugimoto	845eeb4d7b	agent: Allow clippy::redundant_clone in the unit tests Allow `clippy::redundant_clone` in the agent's unit tests because rustc>=1.70 reports these errors as false positives. These `clone()` calls are required because the code that follows refers to the variable, but clippy flags them by mistake due to its conservative and limited analysis. Ref. https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_clone Fixes: #7534 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-03 19:07:40 +09:00
Fabiano Fidêncio	e2755a47b8	Merge pull request #7524 from fidencio/revert-kata-deploy-changes-after-3.2.0-rc0-release release: Revert kata-deploy changes after 3.2.0-rc0 release	2023-08-03 11:28:43 +02:00
Fabiano Fidêncio	1163fc9de2	release: Revert kata-deploy changes after 3.2.0-rc0 release As 3.2.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup tags back to "latest", and re-add the kata-deploy-stable and the kata-cleanup-stable files. Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>	2023-08-03 10:08:20 +02:00
Xuewei Niu	3958a39d07	runtime-rs: Introduce directly attachable network Kata containers as VM-based containers are allowed to run in the host netns. That is, the network is able to isolate in the L2. The network performance will benefit from this architecture, which eliminates as many hops as possible. We called it a Directly Attachable Network (DAN for short). The network devices are placed at the host netns by the CNI plugins. The configs are saved at {dan_conf}/{sandbox_id}.json in the format of JSON, including device name, type, and network info. At the very beginning stage, the DAN only supports host tap devices. More devices, like the DPDK, will be supported in later versions. The format of file looks like as below: ```json { "netns": "/path/to/netns", "devices": [{ "name": "eth0", "guest_mac": "xx:xx:xx:xx:xx", "device": { "type": "vhost-user", "path": "/tmp/test", "queue_num": 1, "queue_size": 1 }, "network_info": { "interface": { "ip_addresses": ["192.168.0.1/24"], "mtu": 1500, "ntype": "tuntap", "flags": 0 }, "routes": [{ "dest": "172.18.0.0/16", "source": "172.18.0.1", "gateway": "172.18.31.1", "scope": 0, "flags": 0 }], "neighbors": [{ "ip_address": "192.168.0.3/16", "device": "", "state": 0, "flags": 0, "hardware_addr": "xx:xx:xx:xx:xx" }] } }] } ``` Fixes: #1922 Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>	2023-08-03 15:33:34 +08:00
David Esparza	7d1c48c881	Merge pull request #7530 from dborquez/fix_check_running_processes metrics: stop kata components before start a metric test.	2023-08-02 23:51:27 -06:00
Zhongtao Hu	e719423262	Merge pull request #7127 from cmaf/runtime-rs-ch-blk-2 runtime-rs: Add block device handling for cloud hypervisor	2023-08-03 09:46:32 +08:00
David Esparza	1e15369e59	metrics: Improve naming testing containers in launch times test This commit provides a new way to name the containers used in the launch-times-test in this form: 'kata_launch_times_RANDOM_NUMBER', where RANDOM_NUMBER is in the 0-1000 range. Fixes: #7529 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-02 17:04:55 -06:00
David Esparza	5dbe88330f	metrics: Clean kata components before start a metric test. This PR kills all kata components before start a new metric test. Fixes: #7528 Signed-off-by: David Esparza <david.esparza.borquez@intel.com>	2023-08-02 17:04:51 -06:00
Gabriela Cervantes	3b45060b61	metrics: Add latency server yaml This PR adds latency server yaml for kubernetes test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:52:17 +00:00
Gabriela Cervantes	9bb8451df5	metrics: Add latency client yaml This PR adds latency client yaml for the kubernetes test. Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:50:51 +00:00
Gabriela Cervantes	64fdb98704	metrics: Add network latency test This PR adds network latency test for kata metrics. Fixes #7526 Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>	2023-08-02 16:46:48 +00:00
Chelsea Mafrica	a81ad3b587	runtime-rs: Add block device handling in cloud hypervisor Add functions for adding a block device to a container for CH. Fixes #6690 Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>	2023-08-02 09:18:48 -07:00
Jeremi Piotrowski	3230dec950	kata-deploy: Use host's systemctl when interacting with systemd. We have occasionally faced issues with compatibility between the systemctl version used inside the kata-deploy container and the systemd version on the host. Instead of using a containerized systemctl with bind mounted sockets, nsenter the host and run systemctl from there. This provides less coupling between the kata-deploy container and the host. Fixes: #7511 Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>	2023-08-02 15:32:01 +02:00
Manabu Sugimoto	1b21a46246	docs: Use control-plane term instead of master Replace `master` with `control-plane` in the context of K8s because `master` is a legacy term and is no longer used. Ref. https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint Fixes: #7466 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-08-01 17:41:40 +09:00
Pavel Mores	28e5e9c86e	runtime-rs: fix number of queues handling in dragonball share fs device Looks like a copy/paste error... Fixes #7501 Signed-off-by: Pavel Mores <pmores@redhat.com>	2023-07-31 17:25:47 +02:00
Manabu Sugimoto	f1d8de9be6	runk: Allow runk to launch a container without pid namespace Allow runk to launch a container even though users don't specify the pid namespace in `config.json` because general container runtimes such as runc also can launch a container without the namespace. On the other hand, Kata Containers doesn't allow it due to security issue so this feature should be enabled in only runk. Fixes: #7168 Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>	2023-07-16 23:31:14 +05:30
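
The container naming scheme described in commit `1e15369e59` above (`kata_launch_times_RANDOM_NUMBER`, with the suffix in the 0-1000 range) could be sketched roughly as follows. The function name `gen_container_name` is hypothetical and the modulo approach is an assumption; the actual launch-times script may derive the suffix differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not taken from the kata-containers tree.
gen_container_name() {
	# bash's $RANDOM is an integer in 0..32767; the modulo maps it into 0..1000.
	echo "kata_launch_times_$(( RANDOM % 1001 ))"
}

gen_container_name
```

With only 1001 possible suffixes, two names can still collide, so a caller that needs uniqueness would have to check for an existing container with the same name.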

5434 changed files with 929697 additions and 371685 deletions

									
.github/actionlint.yaml (new file, 25 lines changed)
				@@ -0,0 +1,25 @@

				# Copyright (c) 2024 Red Hat

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# Configuration file with rules for the actionlint tool.

				#

				self-hosted-runner:

				  # Labels of self-hosted runner that linter should ignore

				  labels:

				    - arm64-k8s

				    - ubuntu-22.04-arm

				    - garm-ubuntu-2004

				    - garm-ubuntu-2004-smaller

				    - garm-ubuntu-2204

				    - garm-ubuntu-2304

				    - garm-ubuntu-2304-smaller

				    - garm-ubuntu-2204-smaller

				    - k8s-ppc64le

				    - metrics

				    - ppc64le

				    - riscv-builder

				    - sev-snp

				    - s390x

				    - s390x-large

				    - tdx

									
.github/cargo-deny-composite-action/cargo-deny-generator.sh (2 lines changed)
				@@ -8,7 +8,7 @@

				script_dir=$(dirname "$(readlink -f "$0")")

				parent_dir=$(realpath "${script_dir}/../..")

				cidir="${parent_dir}/ci"

				-source "${cidir}/lib.sh"

				+source "${cidir}/../tests/common.bash"

				cargo_deny_file="${script_dir}/action.yaml"
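
The hunk above locates everything relative to the script's own path. As a minimal sketch of that idiom, the function below applies the same `readlink -f` / `realpath` steps to an arbitrary script path; `resolve_dirs` is a hypothetical name introduced here for illustration and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the path idiom used in cargo-deny-generator.sh.
resolve_dirs() {
	local script_path="$1"
	local script_dir parent_dir
	# Directory containing the script, with symlinks resolved.
	script_dir=$(dirname "$(readlink -f "${script_path}")")
	# Two levels up from the script directory (the repository root in the hunk above).
	parent_dir=$(realpath "${script_dir}/../..")
	echo "${script_dir} ${parent_dir}"
}
```

Resolving through `readlink -f` first means the result does not depend on the caller's working directory or on the script being invoked through a symlink.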

									
.github/cargo-deny-composite-action/cargo-deny-skeleton.yaml.in (2 lines changed)
				@@ -21,7 +21,7 @@ runs:

				        override: true

				    - name: Cache

				-      uses: Swatinem/rust-cache@v2

				+      uses: Swatinem/rust-cache@f0deed1e0edfc6a9be95417288c0e1099b1eeec3 # v2.7.7

				    - name: Install Cargo deny

				      shell: bash

									
.github/dependabot.yml (new file, 90 lines changed)
				@@ -0,0 +1,90 @@

				---

				version: 2

				updates:

				  - package-ecosystem: "cargo"

				    directories:

				      - "/src/agent"

				      - "/src/dragonball"

				      - "/src/libs"

				      - "/src/mem-agent"

				      - "/src/mem-agent/example"

				      - "/src/runtime-rs"

				      - "/src/tools/agent-ctl"

				      - "/src/tools/genpolicy"

				      - "/src/tools/kata-ctl"

				      - "/src/tools/runk"

				      - "/src/tools/trace-forwarder"

				    schedule:

				      interval: "daily"

				    ignore:

				    # rust-vmm repos might cause incompatibilities on patch versions, so

				    # lets handle them manually for now.

				      - dependency-name: "event-manager"

				      - dependency-name: "kvm-bindings"

				      - dependency-name: "kvm-ioctls"

				      - dependency-name: "linux-loader"

				      - dependency-name: "seccompiler"

				      - dependency-name: "vfio-bindings"

				      - dependency-name: "vfio-ioctls"

				      - dependency-name: "virtio-bindings"

				      - dependency-name: "virtio-queue"

				      - dependency-name: "vm-fdt"

				      - dependency-name: "vm-memory"

				      - dependency-name: "vm-superio"

				      - dependency-name: "vmm-sys-util"

				    # As we often have up to 8/9 components that need the same versions bumps

				    # create groups for common dependencies, so they can all go in a single PR

				    # We can extend this as we see more frequent groups

				    groups:

				      bit-vec:

				        patterns:

				          - bit-vec

				      bumpalo:

				        patterns:

				          - bumpalo

				      clap:

				        patterns:

				          - clap

				      crossbeam:

				        patterns:

				          - crossbeam

				      h2:

				        patterns:

				          - h2

				      idna:

				        patterns:

				          - idna

				      openssl:

				        patterns:

				          - openssl

				      protobuf:

				        patterns:

				          - protobuf

				      rsa:

				        patterns:

				          - rsa

				      rustix:

				        patterns:

				          - rustix

				      time:

				        patterns:

				          - time

				      tokio:

				        patterns:

				          - tokio

				      tracing:

				        patterns:

				          - tracing

				  - package-ecosystem: "gomod"

				    directories:

				      - "src/runtime"

				      - "tools/testing/kata-webhook"

				      - "src/tools/csi-kata-directvolume"

				    schedule:

				      interval: "daily"

				  - package-ecosystem: "github-actions"

				    directory: "/"

				    schedule:

				      interval: "monthly"

									
.github/workflows/PR-wip-checks.yaml (8 lines changed)
				@@ -9,18 +9,20 @@ on:

				      - labeled

				      - unlabeled

				+permissions:

				+  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  pr_wip_check:

				-    runs-on: ubuntu-latest

				+    runs-on: ubuntu-22.04

				    name: WIP Check

				    steps:

				    - name: WIP Check

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				-      uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755

				+      uses: tim-actions/wip-check@1c2a1ca6c110026b3e2297bb2ef39e1747b5a755 # master (2021-06-10)

				      with:

				        labels: '["do-not-merge", "wip", "rfc"]'

				        keywords: '["WIP", "wip", "RFC", "rfc", "dnm", "DNM", "do-not-merge"]'

									
.github/workflows/actionlint.yaml (new file, 37 lines changed)
				@@ -0,0 +1,37 @@

				name: Lint GHA workflows

				on:

				  workflow_dispatch:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				    paths:

				      - '.github/workflows/**'

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  run-actionlint:

				    env:

				      GH_TOKEN: ${{ github.token }}

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install actionlint gh extension

				        run: gh extension install https://github.com/cschleiden/gh-actionlint

				      - name: Run actionlint

				        run:  gh actionlint

									
.github/workflows/add-backport-label.yaml
											
				@@ -1,104 +0,0 @@

				name: Add backport label

				on:

				  pull_request:

				    types:

				      - opened

				      - synchronize

				      - reopened

				      - edited

				      - labeled

				      - unlabeled

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  check-issues:

				    if: ${{ github.event.label.name != 'auto-backport' }}

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout code to allow hub to communicate with the project

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        uses: actions/checkout@v3

				      - name: Install hub extension script

				        run: |

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install hub-util.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Determine whether to add label 

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        env:

				          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				          CONTAINS_AUTO_BACKPORT: ${{ contains(github.event.pull_request.labels.*.name, 'auto-backport') }}

				        id: add_label

				        run: |

				          pr=${{ github.event.pull_request.number }}

				          linked_issue_urls=$(hub-util.sh \

				            list-issues-for-pr "$pr" |\

				            grep -v "^\#"  |\

				            cut -d';' -f3 || true)

				          [ -z "$linked_issue_urls" ] && {

				            echo "::error::No linked issues for PR $pr"

				            exit 1

				          }

				          has_bug=false

				          for issue_url in $(echo "$linked_issue_urls")

				          do

				            issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)

				            [ -z "$issue" ] && {

				              echo "::error::Cannot determine issue number from $issue_url for PR $pr"

				              exit 1

				            }

				            labels=$(hub-util.sh list-labels-for-issue "$issue")

				            label_names=$(echo $labels | jq -r '.[].name' || true)

				            if [[ "$label_names" =~ "bug" ]]; then

				              has_bug=true

				              break

				            fi

				          done

				          has_backport_needed_label=${{ contains(github.event.pull_request.labels.*.name, 'needs-backport') }}

				          has_no_backport_needed_label=${{ contains(github.event.pull_request.labels.*.name, 'no-backport-needed') }}

				          echo "add_backport_label=false" >> $GITHUB_OUTPUT

				          if [ $has_backport_needed_label  = true ] || [ $has_bug  = true ]; then

				            if [[ $has_no_backport_needed_label = false ]]; then

				              echo "add_backport_label=true" >> $GITHUB_OUTPUT

				            fi

				          fi

				          # Do not spam comment, only if auto-backport label is going to be newly added.

				          echo "auto_backport_added=$CONTAINS_AUTO_BACKPORT" >> $GITHUB_OUTPUT

				      - name: Add comment

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && steps.add_label.outputs.add_backport_label == 'true' && steps.add_label.outputs.auto_backport_added == 'false' }}

				        uses: actions/github-script@v6

				        with:

				          script: |

				            github.rest.issues.createComment({

				              issue_number: context.issue.number,

				              owner: context.repo.owner,

				              repo: context.repo.repo,

				              body: 'This issue has been marked for auto-backporting. Add label(s) backport-to-BRANCHNAME to backport to them'

				            })

				      # Allow label to be removed by adding no-backport-needed label

				      - name: Remove auto-backport label

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && steps.add_label.outputs.add_backport_label == 'false' }}

				        uses: andymckay/labeler@e6c4322d0397f3240f0e7e30a33b5c5df2d39e90

				        with:

				          remove-labels: "auto-backport"

				          repo-token: ${{ secrets.GITHUB_TOKEN }}

				      - name: Add auto-backport label

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && steps.add_label.outputs.add_backport_label == 'true' }}

				        uses: andymckay/labeler@e6c4322d0397f3240f0e7e30a33b5c5df2d39e90

				        with:

				          add-labels: "auto-backport"

				          repo-token: ${{ secrets.GITHUB_TOKEN }}

									
.github/workflows/add-issues-to-project.yaml
											
				@@ -1,59 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Add newly created issues to the backlog project

				on:

				  issues:

				    types:

				      - opened

				      - reopened

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  add-new-issues-to-backlog:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Install hub extension script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install hub-util.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Checkout code to allow hub to communicate with the project

				        uses: actions/checkout@v2

				      - name: Add issue to issue backlog

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          issue=${{ github.event.issue.number }}

				          project_name="Issue backlog"

				          project_type="org"

				          project_column="To do"

				          hub-util.sh \

				            add-issue \

				            "$issue" \

				            "$project_name" \

				            "$project_type" \

				            "$project_column"

									
.github/workflows/add-pr-sizing-label.yaml
											
				@@ -1,44 +0,0 @@

				# Copyright (c) 2022 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Add PR sizing label

				on:

				  pull_request_target:

				    types:

				      - opened

				      - reopened

				      - synchronize

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  add-pr-size-label:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@v1

				      - name: Install PR sizing label script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install pr-add-size-label.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Add PR sizing label

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_PR_SIZE_TOKEN }}

				        run: |

				          pr=${{ github.event.number }}

				          # Removing man-db, workflow kept failing, fixes: #4480

				          sudo apt -y remove --purge man-db

				          sudo apt -y install diffstat patchutils

				          pr-add-size-label.sh -p "$pr"

									
.github/workflows/auto-backport.yaml
											
				@@ -1,33 +0,0 @@

				on:

				  pull_request_target:

				    types: ["labeled", "closed"]

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  backport:

				    name: Backport PR

				    runs-on: ubuntu-latest

				    if: |

				      github.event.pull_request.merged == true

				      && contains(github.event.pull_request.labels.*.name, 'auto-backport')

				      && (

				        (github.event.action == 'labeled' && github.event.label.name == 'auto-backport')

				        || (github.event.action == 'closed')

				      )

				    steps:

				      - name: Backport Action

				        uses: sqren/backport-github-action@v8.9.2

				        with:

				          github_token: ${{ secrets.GITHUB_TOKEN }}

				          auto_backport_label_prefix: backport-to-

				      - name: Info log

				        if: ${{ success() }}

				        run: cat /home/runner/.backport/backport.info.log

				      - name: Debug log

				        if: ${{ failure() }}

				        run: cat /home/runner/.backport/backport.debug.log

									
.github/workflows/basic-ci-amd64.yaml
												
				@@ -0,0 +1,412 @@

				name: CI | Basic amd64 tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-containerd-sandboxapi:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['active']

				        vmm: ['dragonball', 'cloud-hypervisor', 'qemu-runtime-rs']

				    # TODO: enable me when https://github.com/containerd/containerd/issues/11640 is fixed

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "shim"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-sandboxapi tests

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run

				  run-containerd-stability:

				    strategy:

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'cloud-hypervisor', 'dragonball', 'qemu', 'stratovirt']

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "podsandbox"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/stability/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/stability/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-stability tests

				        timeout-minutes: 15

				        run: bash tests/stability/gha-run.sh run

				  run-nydus:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'qemu', 'dragonball', 'stratovirt']

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/nydus/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts

				      - name: Run nydus tests

				        timeout-minutes: 10

				        run: bash tests/integration/nydus/gha-run.sh run

				  run-runk:

				    # Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: lts

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/runk/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts

				      - name: Run runk tests

				        timeout-minutes: 10

				        run: bash tests/integration/runk/gha-run.sh run

				  run-tracing:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh # cloud-hypervisor

				          - qemu

				    # TODO: enable me when https://github.com/kata-containers/kata-containers/issues/9763 is fixed

				    # TODO: Transition to free runner (see #9940).

				    if: false

				    runs-on: garm-ubuntu-2204-smaller

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/tracing/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/functional/tracing/gha-run.sh install-kata kata-artifacts

				      - name: Run tracing tests

				        timeout-minutes: 15

				        run: bash tests/functional/tracing/gha-run.sh run

				  run-vfio:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh

				          - qemu

				    # TODO: enable with clh when https://github.com/kata-containers/kata-containers/issues/9764 is fixed

				    # TODO: enable with qemu when https://github.com/kata-containers/kata-containers/issues/9851 is fixed

				    # TODO: Transition to free runner (see #9940).

				    if: false

				    runs-on: garm-ubuntu-2304

				    env:

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/vfio/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Run vfio tests

				        timeout-minutes: 15

				        run: bash tests/functional/vfio/gha-run.sh run

				  run-docker-tests:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail them

				      # all due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh

				          - qemu

				          - dragonball

				          - cloud-hypervisor

				    runs-on: ubuntu-22.04

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/docker/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/docker/gha-run.sh install-kata kata-artifacts

				      - name: Run docker smoke test

				        timeout-minutes: 5

				        run: bash tests/integration/docker/gha-run.sh run

				  run-nerdctl-tests:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail them

				      # all due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh

				          - dragonball

				          - qemu

				          - cloud-hypervisor

				    runs-on: ubuntu-22.04

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        env:

				          GITHUB_API_TOKEN: ${{ github.token }}

				          GH_TOKEN: ${{ github.token }}

				        run: bash tests/integration/nerdctl/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/nerdctl/gha-run.sh install-kata kata-artifacts

				      - name: Run nerdctl smoke test

				        timeout-minutes: 5

				        run: bash tests/integration/nerdctl/gha-run.sh run

				      - name: Collect artifacts ${{ matrix.vmm }}

				        if: always()

				        run: bash tests/integration/nerdctl/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: nerdctl-tests-garm-${{ matrix.vmm }}

				          path: /tmp/artifacts

				          retention-days: 1

				  run-kata-agent-apis:

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/kata-agent-apis/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/functional/kata-agent-apis/gha-run.sh install-kata kata-artifacts

				      - name: Run kata agent api tests with agent-ctl

				        run: bash tests/functional/kata-agent-apis/gha-run.sh run

									
.github/workflows/basic-ci-s390x.yaml
												
				@@ -0,0 +1,147 @@

				name: CI | Basic s390x tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-containerd-sandboxapi:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        containerd_version: ['active']

				        vmm: ['qemu-runtime-rs']

				    # TODO: enable me when https://github.com/containerd/containerd/issues/11640 is fixed

				    if: false

				    runs-on: s390x-large

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "shim"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-sandboxapi tests

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run

				  run-containerd-stability:

				    strategy:

				      fail-fast: false

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['qemu']

				    runs-on: s390x-large

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      SANDBOXER: "podsandbox"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/stability/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/stability/gha-run.sh install-kata kata-artifacts

				      - name: Run containerd-stability tests

				        timeout-minutes: 15

				        run: bash tests/stability/gha-run.sh run

				  run-docker-tests:

				    strategy:

				      # We can set this to true whenever we're 100% sure that

				      # all the tests are not flaky, otherwise we'll fail them

				      # all due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm: ['qemu']

				    runs-on: s390x-large

				    env:

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/docker/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/docker/gha-run.sh install-kata kata-artifacts

				      - name: Run docker smoke test

				        timeout-minutes: 5

				        run: bash tests/integration/docker/gha-run.sh run

									
.github/workflows/build-checks-preview-riscv64.yaml
												
				@@ -0,0 +1,132 @@

				# This yaml is designed to be used until all components listed in

				# `build-checks.yaml` are supported

				on:

				  workflow_dispatch:

				    inputs:

				      instance:

				        default: "riscv-builder"

				        description: "Default instance when manually triggering"

				  workflow_call:

				    inputs:

				      instance:

				        required: true

				        type: string

				permissions:

				  contents: read

				name: Build checks preview riscv64

				jobs:

				  check:

				    runs-on: ${{ inputs.instance }}

				    strategy:

				      fail-fast: false

				      matrix:

				        command:

				          - "make vendor"

				          - "make check"

				          - "make test"

				          - "sudo -E PATH=\"$PATH\" make test"

				        component:

				          - name: agent

				            path: src/agent

				            needs:

				              - rust

				              - libdevmapper

				              - libseccomp

				              - protobuf-compiler

				              - clang

				          - name: agent-ctl

				            path: src/tools/agent-ctl

				            needs:

				              - rust

				              - musl-tools

				              - protobuf-compiler

				              - clang

				          - name: trace-forwarder

				            path: src/tools/trace-forwarder

				            needs:

				              - rust

				              - musl-tools

				          - name: genpolicy

				            path: src/tools/genpolicy

				            needs:

				              - rust

				              - musl-tools

				              - protobuf-compiler

				          - name: runtime

				            path: src/runtime

				            needs:

				              - golang

				              - XDG_RUNTIME_DIR

				          - name: runtime-rs

				            path: src/runtime-rs

				            needs:

				              - rust

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"

				          sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }

				          sudo rm -f /tmp/kata_hybrid*  # Sometimes leftovers remain from test_setup_hvsock_failed()

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install yq

				        run: |

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        if: contains(matrix.component.needs, 'golang')

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Setup rust

				        if: contains(matrix.component.needs, 'rust')

				        run: |

				          ./tests/install_rust.sh

				          echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"

				          if [ "$(uname -m)" == "x86_64" ] || [ "$(uname -m)" == "aarch64" ]; then

				            sudo apt-get update && sudo apt-get -y install musl-tools

				          fi

				      - name: Install devicemapper

				        if: contains(matrix.component.needs, 'libdevmapper') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install libdevmapper-dev

				      - name: Install libseccomp

				        if: contains(matrix.component.needs, 'libseccomp') && matrix.command != 'make vendor' && matrix.command != 'make check'

				        run: |

				          libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				          gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				          ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				          echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				          echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"

				          echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"

				      - name: Install protobuf-compiler

				        if: contains(matrix.component.needs, 'protobuf-compiler') && matrix.command != 'make vendor'

				        run: sudo apt-get update && sudo apt-get -y install protobuf-compiler

				      - name: Install clang

				        if: contains(matrix.component.needs, 'clang') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install clang

				      - name: Setup XDG_RUNTIME_DIR

				        if: contains(matrix.component.needs, 'XDG_RUNTIME_DIR') && matrix.command != 'make check'

				        run: |

				          XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))

				          echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"

				      - name: Skip tests that depend on virtualization capable runners when needed

				        if: inputs.instance == 'riscv-builder'

				        run: |

				          echo "GITHUB_RUNNER_CI_NON_VIRT=true" >> "$GITHUB_ENV"

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component.name }}

				        run: |

				          cd ${{ matrix.component.path }}

				          ${{ matrix.command }}

				        env:

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				          SKIP_GO_VERSION_CHECK: "1"
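The `if:` conditions on the install steps above combine a membership test on `matrix.component.needs` with a comparison on `matrix.command`. The following is a rough shell sketch of the libseccomp condition, useful for reasoning about which matrix cells run that step. The function names `needs_contains` and `should_install_libseccomp` are hypothetical and not part of the workflow, and the sketch compares strings case-sensitively, which GitHub expressions do not.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the space-separated list in $1 contains the word $2.
needs_contains() {
    for item in $1; do
        [ "$item" = "$2" ] && return 0
    done
    return 1
}

# Mirrors: contains(needs, 'libseccomp') && command != 'make vendor' && command != 'make check'
should_install_libseccomp() {
    needs_contains "$1" libseccomp &&
        [ "$2" != "make vendor" ] &&
        [ "$2" != "make check" ]
}

# The "needs" list of the agent component, flattened to a string for this sketch.
agent_needs="rust libdevmapper libseccomp protobuf-compiler clang"
for cmd in "make vendor" "make check" "make test"; do
    if should_install_libseccomp "$agent_needs" "$cmd"; then
        echo "$cmd: install libseccomp"
    else
        echo "$cmd: skip libseccomp"
    fi
done
```

Under this reading, libseccomp is only installed for the two `make test` variants of a component that lists it in `needs`.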

									
.github/workflows/build-checks.yaml
												
				@@ -0,0 +1,130 @@

				on:

				  workflow_call:

				    inputs:

				      instance:

				        required: true

				        type: string

				permissions:

				  contents: read

				name: Build checks

				jobs:

				  check:

				    runs-on: ${{ inputs.instance }}

				    strategy:

				      fail-fast: false

				      matrix:

				        command:

				          - "make vendor"

				          - "make check"

				          - "make test"

				          - "sudo -E PATH=\"$PATH\" make test"

				        component:

				          - name: agent

				            path: src/agent

				            needs:

				              - rust

				              - libdevmapper

				              - libseccomp

				              - protobuf-compiler

				              - clang

				          - name: dragonball

				            path: src/dragonball

				            needs:

				              - rust

				          - name: runtime

				            path: src/runtime

				            needs:

				              - golang

				              - XDG_RUNTIME_DIR

				          - name: runtime-rs

				            path: src/runtime-rs

				            needs:

				              - rust

				          - name: agent-ctl

				            path: src/tools/agent-ctl

				            needs:

				              - rust

				              - protobuf-compiler

				              - clang

				          - name: kata-ctl

				            path: src/tools/kata-ctl

				            needs:

				              - rust

				          - name: trace-forwarder

				            path: src/tools/trace-forwarder

				            needs:

				              - rust

				          - name: genpolicy

				            path: src/tools/genpolicy

				            needs:

				              - rust

				              - protobuf-compiler

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE" "$HOME"

				          sudo rm -rf "$GITHUB_WORKSPACE"/* || { sleep 10 && sudo rm -rf "$GITHUB_WORKSPACE"/*; }

				          sudo rm -f /tmp/kata_hybrid*  # Sometimes leftovers remain from test_setup_hvsock_failed()

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install yq

				        run: |

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        if: contains(matrix.component.needs, 'golang')

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Setup rust

				        if: contains(matrix.component.needs, 'rust')

				        run: |

				          ./tests/install_rust.sh

				          echo "${HOME}/.cargo/bin" >> "$GITHUB_PATH"

				          if [ "$(uname -m)" == "x86_64" ] || [ "$(uname -m)" == "aarch64" ]; then

				            sudo apt-get update && sudo apt-get -y install musl-tools

				          fi

				      - name: Install devicemapper

				        if: contains(matrix.component.needs, 'libdevmapper') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install libdevmapper-dev

				      - name: Install libseccomp

				        if: contains(matrix.component.needs, 'libseccomp') && matrix.command != 'make vendor' && matrix.command != 'make check'

				        run: |

				          libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				          gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				          ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				          echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				          echo "LIBSECCOMP_LINK_TYPE=static" >> "$GITHUB_ENV"

				          echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> "$GITHUB_ENV"

				      - name: Install protobuf-compiler

				        if: contains(matrix.component.needs, 'protobuf-compiler') && matrix.command != 'make vendor'

				        run: sudo apt-get update && sudo apt-get -y install protobuf-compiler

				      - name: Install clang

				        if: contains(matrix.component.needs, 'clang') && matrix.command == 'make check'

				        run: sudo apt-get update && sudo apt-get -y install clang

				      - name: Setup XDG_RUNTIME_DIR

				        if: contains(matrix.component.needs, 'XDG_RUNTIME_DIR') && matrix.command != 'make check'

				        run: |

				          XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-$USER.XXX" | tee >(xargs chmod 0700))

				          echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"

				      - name: Skip tests that depend on virtualization capable runners when needed

				        if: ${{ endsWith(inputs.instance, '-arm') }}

				        run: |

				          echo "GITHUB_RUNNER_CI_NON_VIRT=true" >> "$GITHUB_ENV"

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component.name }}

				        run: |

				          cd ${{ matrix.component.path }}

				          ${{ matrix.command }}

				        env:

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				          SKIP_GO_VERSION_CHECK: "1"
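The `Setup XDG_RUNTIME_DIR` step creates a private temporary directory and exports it to later steps through `$GITHUB_ENV`. Below is a minimal sketch of the same idea that can be run outside Actions. It assumes a scratch file as a stand-in for `GITHUB_ENV` and a placeholder user name when `USER` is unset, purely so the snippet is self-contained, and it calls `chmod` directly rather than through `tee >(xargs ...)`.

```shell
#!/bin/sh
# Outside GitHub Actions GITHUB_ENV is unset, so fall back to a scratch file (assumption).
GITHUB_ENV="${GITHUB_ENV:-$(mktemp)}"

# Create the directory, then restrict it to the owner explicitly.
XDG_RUNTIME_DIR=$(mktemp -d "/tmp/kata-tests-${USER:-runner}.XXXXXX")
chmod 0700 "$XDG_RUNTIME_DIR"

# Later steps see the variable because the runner reads KEY=VALUE lines from this file.
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR}" >> "$GITHUB_ENV"
```

Calling `chmod` synchronously means the mode is set before the next line runs, which is easier to reason about than a process substitution.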

									
.github/workflows/build-kata-static-tarball-amd64.yaml
												
				@@ -16,94 +16,339 @@ on:

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: false

				permissions:

				  contents: read

				jobs:

				  build-asset:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - agent-ctl

				          - busybox

				          - cloud-hypervisor

				          - cloud-hypervisor-glibc

				          - coco-guest-components

				          - csi-kata-directvolume

				          - firecracker

				          - genpolicy

				          - kata-ctl

				          - kata-manager

				          - kernel

				          - kernel-sev

				          - kernel-confidential

				          - kernel-dragonball-experimental

				          - kernel-tdx-experimental

				          - kernel-nvidia-gpu

				          - kernel-nvidia-gpu-snp

				          - kernel-nvidia-gpu-tdx-experimental

				          - kernel-nvidia-gpu-confidential

				          - nydus

				          - ovmf

				          - ovmf-sev

				          - pause-image

				          - qemu

				          - qemu-snp-experimental

				          - qemu-tdx-experimental

				          - rootfs-image

				          - rootfs-image-tdx

				          - rootfs-initrd

				          - rootfs-initrd-mariner

				          - rootfs-initrd-sev

				          - shim-v2

				          - tdvf

				          - stratovirt

				          - trace-forwarder

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				        exclude:

				          - asset: cloud-hypervisor-glibc

				            stage: release

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          sudo cp -r "${build_dir}" "kata-build"

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        run: |

				          oci_image="$(<"build/${{ matrix.asset }}-oci-image")"

				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"

				      - uses: oras-project/setup-oras@5c0b487ce3fe0ce3ab0d034e63669e426e294e4d # v1.2.2

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          version: "1.2.0"

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@v3

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64${{ inputs.tarball-suffix }}

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 1

				          retention-days: 15

				          if-no-files-found: error

				      - name: store-extratarballs-artifact ${{ matrix.asset }}

				        if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    runs-on: ubuntu-22.04

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-image-confidential

				          - rootfs-image-mariner

				          - rootfs-initrd

				          - rootfs-initrd-confidential

				          - rootfs-initrd-nvidia-gpu

				          - rootfs-initrd-nvidia-gpu-confidential

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  # We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs

				  remove-rootfs-binary-artifacts:

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - busybox

				          - coco-guest-components

				          - kernel-nvidia-gpu-headers

				          - kernel-nvidia-gpu-confidential-headers

				          - pause-image

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  # We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs

				  remove-rootfs-binary-artifacts-for-release:

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-amd64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    runs-on: ubuntu-22.04

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          MEASURED_ROOTFS: yes

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-amd64-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

				    runs-on: ubuntu-latest

				    needs: build-asset

				    runs-on: ubuntu-22.04

				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@v3

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-artifacts-amd64${{ inputs.tarball-suffix }}

				          pattern: kata-artifacts-amd64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@v3

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-static.tar.xz

				          retention-days: 1

				          retention-days: 15

				          if-no-files-found: error
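The `Parse OCI image name and digest` step in this workflow splits a `name@digest` reference with shell parameter expansion. Here is a small sketch of that split using a made-up reference; the registry path and digest below are placeholders, not real build output.

```shell
#!/bin/sh
# Placeholder reference in the form <name>@<digest>; not a real image.
oci_image="ghcr.io/example/agent@sha256:0123abcd"

# %@* removes the shortest suffix starting at '@', leaving the name.
oci_name="${oci_image%@*}"
# #*@ removes the shortest prefix ending at '@', leaving the digest.
oci_digest="${oci_image#*@}"

echo "oci-name=${oci_name}"
echo "oci-digest=${oci_digest}"
```

In the workflow the two `echo` lines are appended to `$GITHUB_OUTPUT`, so the attestation step can read them as `steps.parse-oci-segments.outputs.oci-name` and `steps.parse-oci-segments.outputs.oci-digest`.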

									
.github/workflows/build-kata-static-tarball-arm64.yaml
												
				@@ -16,84 +16,309 @@ on:

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: false

				permissions:

				  contents: read

				jobs:

				  build-asset:

				    runs-on: arm64

				    runs-on: ubuntu-22.04-arm

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - busybox

				          - cloud-hypervisor

				          - firecracker

				          - kernel

				          - kernel-dragonball-experimental

				          - kernel-nvidia-gpu

				          - nydus

				          - ovmf

				          - qemu

				          - rootfs-image

				          - rootfs-initrd

				          - shim-v2

				          - stratovirt

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          sudo cp -r "${build_dir}" "kata-build"

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        run: |

				          oci_image="$(<"build/${{ matrix.asset }}-oci-image")"

				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"

				      - uses: oras-project/setup-oras@5c0b487ce3fe0ce3ab0d034e63669e426e294e4d # v1.2.2

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          version: "1.2.0"

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@v3

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64${{ inputs.tarball-suffix }}

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 1

				          retention-days: 15

				          if-no-files-found: error

				      - name: store-extratarballs-artifact ${{ matrix.asset }}

				        if: ${{ startsWith(matrix.asset, 'kernel-nvidia-gpu') }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}-headers${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}-headers.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    runs-on: ubuntu-22.04-arm

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-initrd

				          - rootfs-initrd-nvidia-gpu

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  # We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs

				  remove-rootfs-binary-artifacts:

				    runs-on: ubuntu-22.04-arm

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - busybox

				          - kernel-nvidia-gpu-headers

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset}}${{ inputs.tarball-suffix }}

				  # We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs

				  remove-rootfs-binary-artifacts-for-release:

				    runs-on: ubuntu-22.04-arm

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-arm64-${{ matrix.asset}}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    runs-on: ubuntu-22.04-arm

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts, remove-rootfs-binary-artifacts-for-release]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-arm64-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

-				    runs-on: arm64

-				    needs: build-asset

+				    runs-on: ubuntu-22.04-arm

+				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

-				      - uses: actions/checkout@v3

+				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

-				        uses: actions/download-artifact@v3

+				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

-				          name: kata-artifacts-arm64${{ inputs.tarball-suffix }}

+				          pattern: kata-artifacts-arm64-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

-				        uses: actions/upload-artifact@v3

+				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-arm64${{ inputs.tarball-suffix }}

				          path: kata-static.tar.xz

-				          retention-days: 1

+				          retention-days: 15

				          if-no-files-found: error
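
The two parameter expansions in the `parse-oci-segments` step of the `build-asset` job split an image reference of the form `name@digest` at the `@`. A minimal sketch of that behaviour in isolation follows. The reference `ghcr.io/example/agent@sha256:0123abcd` is made up for illustration (the workflow reads the real value from `build/<asset>-oci-image`), and `split_oci_ref` is a hypothetical helper name that does not exist in the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the two key=value lines that the workflow
# step appends to "$GITHUB_OUTPUT".
split_oci_ref() {
  oci_image="$1"
  # "%@*" strips the shortest suffix that starts at "@": the image name remains.
  echo "oci-name=${oci_image%@*}"
  # "#*@" strips the shortest prefix that ends at "@": the digest remains.
  echo "oci-digest=${oci_image#*@}"
}

# Illustrative reference only, not a real image.
split_oci_ref "ghcr.io/example/agent@sha256:0123abcd"
```

In the workflow the same two lines are redirected into `$GITHUB_OUTPUT`, which is how the later attestation step can read them as `steps.parse-oci-segments.outputs.oci-name` and `steps.parse-oci-segments.outputs.oci-digest`.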

									
.github/workflows/build-kata-static-tarball-ppc64le.yaml
												
				@@ -0,0 +1,267 @@

				name: CI | Build kata-static tarball for ppc64le

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-asset:

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ppc64le

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - kernel

				          - qemu

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 1

				          if-no-files-found: error

				  build-asset-rootfs:

				    runs-on: ppc64le

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-initrd

				        stage:

				          - ${{ inputs.stage }}

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 1

				          if-no-files-found: error

				  # We don't need the binaries installed in the rootfs as part of the release tarball, so can delete them now we've built the rootfs

				  remove-rootfs-binary-artifacts:

				    runs-on: ubuntu-22.04

				    needs: build-asset-rootfs

				    strategy:

				      matrix:

				        asset:

				          - agent

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-ppc64le-${{ matrix.asset}}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    runs-on: ppc64le

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-ppc64le-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.xz

				          retention-days: 1

				          if-no-files-found: error

				  create-kata-tarball:

				    runs-on: ppc64le

				    needs: [build-asset, build-asset-rootfs, build-asset-shim-v2]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R "$USER":"$USER" "$GITHUB_WORKSPACE"

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-ppc64le-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-ppc64le${{ inputs.tarball-suffix }}

				          path: kata-static.tar.xz

				          retention-days: 1

				          if-no-files-found: error
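
Each build step in these workflows resolves `build` with `readlink -f` and copies the tarballs into a plain `kata-build` directory because, per the inline comment, the artifact upload does not work with a symlink. A sketch of that pattern in isolation follows, assuming `build` is a symlink. The temporary directory, the `real-build` target, and the `stage_tarballs` helper are hypothetical names used only for this demo.

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the copy step from the workflow.
stage_tarballs() {
  asset="$1"
  # Resolve the "build" symlink to the real build directory.
  build_dir="$(readlink -f build)"
  # Copy the matching tarballs into a regular directory for upload.
  mkdir -p kata-build
  cp "${build_dir}"/kata-static-"${asset}"*.tar.* kata-build/
}

# Demo layout (assumed): "build" is a symlink to "real-build".
demo_dir="$(mktemp -d)"
cd "${demo_dir}"
mkdir real-build
ln -s real-build build
touch real-build/kata-static-agent.tar.xz

stage_tarballs agent
ls kata-build
```

After the copy, `kata-build/` holds regular files, so the upload step's `path:` points at a real file instead of something reached through a link.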

									
.github/workflows/build-kata-static-tarball-riscv64.yaml
												
				@@ -0,0 +1,86 @@

				name: CI | Build kata-static tarball for riscv64

				on:

				  workflow_call:

				    inputs:

				      stage:

				        required: false

				        type: string

				        default: test

				      tarball-suffix:

				        required: false

				        type: string

				      push-to-registry:

				        required: false

				        type: string

				        default: no

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-asset:

				    runs-on: riscv-builder

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - kernel

				          - virtiofsd

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-riscv64-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 15

				          if-no-files-found: error

									
.github/workflows/build-kata-static-tarball-s390x.yaml
												
				@@ -16,81 +16,338 @@ on:

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      CI_HKD_PATH:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-asset:

				    runs-on: s390x

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - coco-guest-components

				          - kernel

				          - kernel-confidential

				          - pause-image

				          - qemu

				          - rootfs-image

				          - rootfs-initrd

				          - shim-v2

				          - virtiofsd

				        stage:

				          - ${{ inputs.stage }}

				    env:

				      PERFORM_ATTESTATION: ${{ matrix.asset == 'agent' && inputs.push-to-registry == 'yes' && 'yes' || 'no' }}

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

-				        uses: docker/login-action@v2

+				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

-				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

+				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

-				      - uses: actions/checkout@v3

+				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

-				          sudo cp -r "${build_dir}" "kata-build"

-				          sudo chown -R $(id -u):$(id -g) "kata-build"

+				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: Parse OCI image name and digest

				        id: parse-oci-segments

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        run: |

				          oci_image="$(<"build/${{ matrix.asset }}-oci-image")"

				          echo "oci-name=${oci_image%@*}" >> "$GITHUB_OUTPUT"

				          echo "oci-digest=${oci_image#*@}" >> "$GITHUB_OUTPUT"

				      # for pushing attestations to the registry

				      - uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - uses: actions/attest-build-provenance@ef244123eb79f2f7a7e75d99086184180e6d0018 # v1.4.4

				        if: ${{ env.PERFORM_ATTESTATION == 'yes' }}

				        with:

				          subject-name: ${{ steps.parse-oci-segments.outputs.oci-name }}

				          subject-digest: ${{ steps.parse-oci-segments.outputs.oci-digest }}

				          push-to-registry: true

				      - name: store-artifact ${{ matrix.asset }}

-				        uses: actions/upload-artifact@v3

+				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-rootfs:

				    runs-on: s390x

				    needs: build-asset

				    permissions:

				      contents: read

				      packages: write

				    strategy:

				      matrix:

				        asset:

				          - rootfs-image

				          - rootfs-image-confidential

				          - rootfs-initrd

				          - rootfs-initrd-confidential

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build ${{ matrix.asset }}

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: ${{ matrix.asset }}

				          TAR_OUTPUT: ${{ matrix.asset }}.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifact ${{ matrix.asset }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset }}${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  build-asset-boot-image-se:

				    runs-on: s390x

				    needs: [build-asset, build-asset-rootfs]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Place a host key document

				        run: |

				          mkdir -p "host-key-document"

				          cp "${CI_HKD_PATH}" "host-key-document"

				        env:

				          CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      - name: Build boot-image-se

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "boot-image-se"

				          make boot-image-se-tarball

				          build_dir=$(readlink -f build)

				          sudo cp -r "${build_dir}" "kata-build"

				          sudo chown -R "$(id -u)":"$(id -g)" "kata-build"

				        env:

				          HKD_PATH: "host-key-document"

				      - name: store-artifact boot-image-se

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-${{ matrix.asset }}.tar.xz

				          path: kata-build/kata-static-boot-image-se.tar.xz

				          retention-days: 1

				          if-no-files-found: error

				  # The binaries installed in the rootfs are not needed in the release tarball, so we can delete them now that the rootfs is built

				  remove-rootfs-binary-artifacts:

				    runs-on: ubuntu-22.04

				    needs: [build-asset-rootfs, build-asset-boot-image-se]

				    strategy:

				      matrix:

				        asset:

				          - agent

				          - coco-guest-components

				          - pause-image

				    steps:

				      - uses: geekyeggo/delete-artifact@f275313e70c08f6120db482d7a6b98377786765b # v5.1.0

				        if: ${{ inputs.stage == 'release' }}

				        with:

				          name: kata-artifacts-s390x-${{ matrix.asset}}${{ inputs.tarball-suffix }}

				  build-asset-shim-v2:

				    runs-on: s390x

				    needs: [build-asset, build-asset-rootfs, remove-rootfs-binary-artifacts]

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.push-to-registry == 'yes' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0 # This is needed in order to keep the commit ids history

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: Build shim-v2

				        id: build

				        run: |

				          ./tests/gha-adjust-to-use-prebuilt-components.sh kata-artifacts "${KATA_ASSET}"

				          make "${KATA_ASSET}-tarball"

				          build_dir=$(readlink -f build)

				          # store-artifact does not work with symlink

				          mkdir -p kata-build && cp "${build_dir}"/kata-static-"${KATA_ASSET}"*.tar.* kata-build/.

				        env:

				          KATA_ASSET: shim-v2

				          TAR_OUTPUT: shim-v2.tar.gz

				          PUSH_TO_REGISTRY: ${{ inputs.push-to-registry }}

				          ARTEFACT_REGISTRY: ghcr.io

				          ARTEFACT_REGISTRY_USERNAME: ${{ github.actor }}

				          ARTEFACT_REGISTRY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				          MEASURED_ROOTFS: no

				      - name: store-artifact shim-v2

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-artifacts-s390x-shim-v2${{ inputs.tarball-suffix }}

				          path: kata-build/kata-static-shim-v2.tar.xz

				          retention-days: 15

				          if-no-files-found: error

				  create-kata-tarball:

				    runs-on: s390x

				    needs: build-asset

				    needs:

				      - build-asset

				      - build-asset-rootfs

				      - build-asset-boot-image-se

				      - build-asset-shim-v2

				    permissions:

				      contents: read

				      packages: write

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          fetch-tags: true

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-artifacts

				        uses: actions/download-artifact@v3

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-artifacts-s390x${{ inputs.tarball-suffix }}

				          pattern: kata-artifacts-s390x-*${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				          merge-multiple: true

				      - name: merge-artifacts

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-merge-builds.sh kata-artifacts versions.yaml

				        env:

				          RELEASE: ${{ inputs.stage == 'release' && 'yes' || 'no' }}

				      - name: store-artifacts

				        uses: actions/upload-artifact@v3

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				          path: kata-static.tar.xz

				          retention-days: 1

				          retention-days: 15

				          if-no-files-found: error

									
.github/workflows/cargo-deny-runner.yaml (15 lines changed)
				@@ -6,26 +6,27 @@ on:

				      - edited

				      - reopened

				      - synchronize

				    paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  cargo-deny-runner:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout Code

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        uses: actions/checkout@v3

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Generate Action

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: bash cargo-deny-generator.sh

				        working-directory: ./.github/cargo-deny-composite-action/

				        env:

				          GOPATH: ${{ runner.workspace }}/kata-containers

				          GOPATH: ${{ github.workspace }}/kata-containers

				      - name: Run Action

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        uses: ./.github/cargo-deny-composite-action

									
.github/workflows/ci-coco-stability.yaml (new file, 33 lines changed)
				@@ -0,0 +1,33 @@

				name: Kata Containers CoCo Stability Tests Weekly

				on:

				  # Note: This workload is not currently maintained, so we are skipping its scheduled runs

				  # schedule:

				  #   - cron: '0 0 * * 0'

				  workflow_dispatch:

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci-weekly.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "weekly"

				      tag: ${{ github.sha }}-weekly

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

									
.github/workflows/ci-devel.yaml (new file, 34 lines changed)
				@@ -0,0 +1,34 @@

				name: Kata Containers CI (manually triggered)

				on:

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "dev"

				      tag: ${{ github.sha }}-dev

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-checks:

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ubuntu-22.04

									
.github/workflows/ci-nightly-s390x.yaml (new file, 26 lines changed)
				@@ -0,0 +1,26 @@

				on:

				  schedule:

				    - cron: '0 5 * * *'

				name: Nightly CI for s390x

				permissions:

				  contents: read

				jobs:

				  check-internal-test-result:

				    runs-on: s390x

				    strategy:

				      fail-fast: false

				      matrix:

				        test_title:

				          - kata-vfio-ap-e2e-tests

				          - cc-vfio-ap-e2e-tests

				          - cc-se-e2e-tests

				    steps:

				    - name: Fetch a test result for ${{ matrix.test_title }}

				      run: |

				        file_name="${TEST_TITLE}-$(date +%Y-%m-%d).log"

				        "/home/${USER}/script/handle_test_log.sh" download "$file_name"

				      env:

				        TEST_TITLE: ${{ matrix.test_title }}

									
.github/workflows/ci-nightly.yaml (19 lines changed)
				@@ -2,17 +2,32 @@ name: Kata Containers Nightly CI

				on:

				  schedule:

				    - cron: '0 0 * * *'

				  workflow_dispatch:

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  kata-containers-ci-on-push:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      pr-number: "nightly"

				      tag: ${{ github.sha }}-nightly

				    secrets: inherit

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

									
.github/workflows/ci-on-push.yaml (31 lines changed)
				@@ -13,19 +13,42 @@ on:

				      - synchronize

				      - reopened

				      - labeled

				    paths-ignore:

				      - 'docs/**'

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  kata-containers-ci-on-push:

				  skipper:

				    if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  kata-containers-ci-on-push:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_build != 'yes' }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/ci.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      pr-number: ${{ github.event.pull_request.number }}

				      tag: ${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}

				    secrets: inherit

				      target-branch: ${{ github.event.pull_request.base.ref }}

				      skip-test: ${{ needs.skipper.outputs.skip_test }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

									
.github/workflows/ci-weekly.yaml (new file, 124 lines changed)
				@@ -0,0 +1,124 @@

				name: Run the CoCo Kata Containers Stability CI

				on:

				  workflow_call:

				    inputs:

				      commit-hash:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-kata-static-tarball-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-publish-tee-confidential-unencrypted-image:

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set up QEMU

				        uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}

				          push: true

				          context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/

				          platforms: linux/amd64

				          file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile

				  run-kata-coco-stability-tests:

				    needs: [publish-kata-deploy-payload-amd64, build-and-publish-tee-confidential-unencrypted-image]

				    uses: ./.github/workflows/run-kata-coco-stability-tests.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				      tarball-suffix: -${{ inputs.tag }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				    permissions:

				      contents: read

				      id-token: write

									
.github/workflows/ci.yaml (469 lines changed)
				@@ -11,87 +11,494 @@ on:

				      tag:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      skip-test:

				        required: false

				        type: string

				        default: no

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      CI_HKD_PATH:

				        required: true

				      ITA_KEY:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  build-kata-static-tarball-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/publish-kata-deploy-payload-amd64.yaml

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				    secrets: inherit

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-arm64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  publish-kata-deploy-payload-arm64:

				    needs: build-kata-static-tarball-arm64

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-arm64

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04-arm

				      arch: arm64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-s390x:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-ppc64le:

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-kata-static-tarball-riscv64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-riscv64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-s390x:

				    needs: build-kata-static-tarball-s390x

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-s390x

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: s390x

				      arch: s390x

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-ppc64le:

				    needs: build-kata-static-tarball-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-ppc64le

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ppc64le

				      arch: ppc64le

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-publish-tee-confidential-unencrypted-image:

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set up QEMU

				        uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/test-images:unencrypted-${{ inputs.pr-number }}

				          push: true

				          context: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/

				          platforms: linux/amd64, linux/s390x

				          file: tests/integration/kubernetes/runtimeclass_workloads/confidential/unencrypted/Dockerfile

				  publish-csi-driver-amd64:

				    needs: build-kata-static-tarball-amd64

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64-${{ inputs.tag }}

				          path: kata-artifacts

				      - name: Install tools

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts

				      - name: Copy binary into Docker context

				        run: |

				          # Copy to the location where the Dockerfile expects the binary.

				          mkdir -p src/tools/csi-kata-directvolume/bin/

				          cp /opt/kata/bin/csi-kata-directvolume src/tools/csi-kata-directvolume/bin/directvolplugin

				      - name: Set up Docker Buildx

				        uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Docker build and push

				        uses: docker/build-push-action@ca052bb54ab0790a636c9b5f226502c73d547a25 # v5.4.0

				        with:

				          tags: ghcr.io/kata-containers/csi-kata-directvolume:${{ inputs.pr-number }}

				          push: true

				          context: src/tools/csi-kata-directvolume/

				          platforms: linux/amd64

				          file: src/tools/csi-kata-directvolume/Dockerfile

				  run-kata-monitor-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/run-kata-monitor-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-k8s-tests-on-aks:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-aks.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				  run-k8s-tests-on-amd64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-amd64.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				    secrets: inherit

				      target-branch: ${{ inputs.target-branch }}

				  run-k8s-tests-on-sev:

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-sev.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				  run-k8s-tests-on-snp:

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-snp.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				  run-k8s-tests-on-tdx:

				    needs: publish-kata-deploy-payload-amd64

				    uses: ./.github/workflows/run-k8s-tests-on-tdx.yaml

				  run-k8s-tests-on-arm64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-arm64

				    uses: ./.github/workflows/run-k8s-tests-on-arm64.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-arm64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-kata-coco-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs:

				     - publish-kata-deploy-payload-amd64

				     - build-and-publish-tee-confidential-unencrypted-image

				     - publish-csi-driver-amd64

				    uses: ./.github/workflows/run-kata-coco-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AZ_APPID: ${{ secrets.AZ_APPID }}

				      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				  run-k8s-tests-on-zvsi:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: [publish-kata-deploy-payload-s390x, build-and-publish-tee-confidential-unencrypted-image]

				    uses: ./.github/workflows/run-k8s-tests-on-zvsi.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-s390x

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				  run-k8s-tests-on-ppc64le:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: publish-kata-deploy-payload-ppc64le

				    uses: ./.github/workflows/run-k8s-tests-on-ppc64le.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-ppc64le

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-kata-deploy-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: [publish-kata-deploy-payload-amd64]

				    uses: ./.github/workflows/run-kata-deploy-tests.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-metrics-tests:

				    # Skip metrics tests whilst runner is broken

				    if: false

				    # if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/run-metrics.yaml

				    with:

				      registry: ghcr.io

				      repo: ${{ github.repository_owner }}/kata-deploy-ci

				      tag: ${{ inputs.tag }}-amd64

				      commit-hash: ${{ inputs.commit-hash }}

				      pr-number: ${{ inputs.pr-number }}

				      target-branch: ${{ inputs.target-branch }}

				  run-basic-amd64-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/basic-ci-amd64.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-cri-containerd-tests:

				  run-basic-s390x-tests:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-s390x

				    uses: ./.github/workflows/basic-ci-s390x.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				  run-cri-containerd-amd64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-amd64

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: lts,    vmm: clh              },

				          { containerd_version: lts,    vmm: dragonball       },

				          { containerd_version: lts,    vmm: qemu             },

				          { containerd_version: lts,    vmm: stratovirt       },

				          { containerd_version: lts,    vmm: cloud-hypervisor },

				          { containerd_version: lts,    vmm: qemu-runtime-rs  },

				          { containerd_version: active, vmm: clh              },

				          { containerd_version: active, vmm: dragonball       },

				          { containerd_version: active, vmm: qemu             },

				          { containerd_version: active, vmm: stratovirt       },

				          { containerd_version: active, vmm: cloud-hypervisor },

				          { containerd_version: active, vmm: qemu-runtime-rs  },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ubuntu-22.04

				      arch: amd64

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-nydus-tests:

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/run-nydus-tests.yaml

				  run-cri-containerd-s390x:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-s390x

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu            },

				          { containerd_version: active, vmm: qemu-runtime-rs },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: s390x-large

				      arch: s390x

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-vfio-tests:

				    needs: build-kata-static-tarball-amd64

				    uses: ./.github/workflows/run-vfio-tests.yaml

				  run-cri-containerd-tests-ppc64le:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-ppc64le

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: ppc64le

				      arch: ppc64le

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

				  run-cri-containerd-tests-arm64:

				    if: ${{ inputs.skip-test != 'yes' }}

				    needs: build-kata-static-tarball-arm64

				    strategy:

				      fail-fast: false

				      matrix:

				        params: [

				          { containerd_version: active, vmm: qemu },

				         ]

				    uses: ./.github/workflows/run-cri-containerd-tests.yaml

				    with:

				      tarball-suffix: -${{ inputs.tag }}

				      commit-hash: ${{ inputs.commit-hash }}

				      target-branch: ${{ inputs.target-branch }}

				      runner: arm64-non-k8s

				      arch: arm64

				      containerd_version: ${{ matrix.params.containerd_version }}

				      vmm: ${{ matrix.params.vmm }}

									
.github/workflows/cleanup-resources.yaml
												
				@@ -0,0 +1,37 @@

				name: Cleanup dangling Azure resources

				on:

				  schedule:

				    - cron: "0 0 * * *"

				  workflow_dispatch:

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  cleanup-resources:

				    runs-on: ubuntu-22.04

				    environment: ci

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Log into Azure

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Install Python dependencies

				        run: |

				          pip3 install --user --upgrade \

				            azure-identity==1.16.0 \

				            azure-mgmt-resource==23.0.1

				      - name: Cleanup resources

				        env:

				          AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				          CLEANUP_AFTER_HOURS: 24 # Clean up resources created more than this many hours ago.

				        run: python3 tests/cleanup_resources.py
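The `CLEANUP_AFTER_HOURS` setting above describes an age cutoff. The contents of `tests/cleanup_resources.py` are not shown in this diff, so the following is only a minimal sketch of how such a cutoff could be evaluated; the function name `is_stale` and its parameters are hypothetical and not taken from the repository.

```python
from datetime import datetime, timedelta, timezone


def is_stale(created_at: datetime, now: datetime, cleanup_after_hours: int = 24) -> bool:
    """Return True if a resource was created more than `cleanup_after_hours` ago.

    Hypothetical helper: it illustrates the cutoff described by the
    CLEANUP_AFTER_HOURS comment, not the real cleanup script.
    """
    return now - created_at > timedelta(hours=cleanup_after_hours)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # A resource created 30 hours ago is past the 24-hour cutoff.
    print(is_stale(now - timedelta(hours=30), now))
```

Passing `now` explicitly keeps the comparison deterministic, which makes the cutoff easy to check without depending on the wall clock.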

									
.github/workflows/codeql.yml
												
				@@ -0,0 +1,100 @@

				# For most projects, this workflow file will not need changing; you simply need

				# to commit it to your repository.

				#

				# You may wish to alter this file to override the set of languages analyzed,

				# or to provide custom queries or build logic.

				#

				# ******** NOTE ********

				# We have attempted to detect the languages in your repository. Please check

				# the `language` matrix defined below to confirm you have the correct set of

				# supported CodeQL languages.

				#

				name: "CodeQL Advanced"

				on:

				  push:

				    branches: [ "main" ]

				  pull_request:

				    branches: [ "main" ]

				  schedule:

				    - cron: '45 0 * * 1'

				permissions:

				  contents: read

				jobs:

				  analyze:

				    name: Analyze (${{ matrix.language }})

				    # Runner size impacts CodeQL analysis time. To learn more, please see:

				    #   - https://gh.io/recommended-hardware-resources-for-running-codeql

				    #   - https://gh.io/supported-runners-and-hardware-resources

				    #   - https://gh.io/using-larger-runners (GitHub.com only)

				    # Consider using larger runners or machines with greater resources for possible analysis time improvements.

				    runs-on: ubuntu-24.04

				    permissions:

				      # required for all workflows

				      security-events: write

				      # required to fetch internal or private CodeQL packs

				      packages: read

				      # only required for workflows in private repositories

				      actions: read

				      contents: read

				    strategy:

				      fail-fast: false

				      matrix:

				        include:

				        - language: go

				          build-mode: manual

				        - language: python

				          build-mode: none

				        # CodeQL supports the following values for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'

				        # Use `c-cpp` to analyze code written in C, C++ or both

				        # Use 'java-kotlin' to analyze code written in Java, Kotlin or both

				        # Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both

				        # To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,

				        # see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.

				        # If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how

				        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages

				    steps:

				    - name: Checkout repository

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    # Add any setup steps before running the `github/codeql-action/init` action.

				    # This includes steps like installing compilers or runtimes (`actions/setup-node`

				    # or others). This is typically only required for manual builds.

				    # - name: Setup runtime (example)

				    #   uses: actions/setup-example@v1

				    # Initializes the CodeQL tools for scanning.

				    - name: Initialize CodeQL

				      uses: github/codeql-action/init@v3

				      with:

				        languages: ${{ matrix.language }}

				        build-mode: ${{ matrix.build-mode }}

				        # If you wish to specify custom queries, you can do so here or in a config file.

				        # By default, queries listed here will override any specified in a config file.

				        # Prefix the list here with "+" to use these queries and those in the config file.

				        # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs

				        # queries: security-extended,security-and-quality

				    # If the analyze step fails for one of the languages you are analyzing with

				    # "We were unable to automatically build your code", modify the matrix above

				    # to set the build mode to "manual" for that language. Then modify this step

				    # to build your code.

				    # ℹ️ Command-line programs to run using the OS shell.

				    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

				    - if: matrix.build-mode == 'manual' && matrix.language == 'go'

				      shell: bash

				      run: |

				        make -C src/runtime

				    - name: Perform CodeQL Analysis

				      uses: github/codeql-action/analyze@v3

				      with:

				        category: "/language:${{matrix.language}}"

									
.github/workflows/commit-message-check.yaml
												
				@@ -6,6 +6,9 @@ on:

				      - reopened

				      - synchronize

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				@@ -18,13 +21,14 @@ env:

				jobs:

				  commit-message-check:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    env:

				      PR_AUTHOR: ${{ github.event.pull_request.user.login }}

				    name: Commit Message Check

				    steps:

				    - name: Get PR Commits

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      id: 'get-pr-commits'

				      uses: tim-actions/get-pr-commits@v1.2.0

				      uses: tim-actions/get-pr-commits@c64db31d359214d244884dd68f971a110b29ab83 # v1.2.0

				      with:

				        token: ${{ secrets.GITHUB_TOKEN }}

				        # Filter out revert commits

				@@ -32,23 +36,25 @@ jobs:

				        #

				        # Revert "<original-subject-line>"

				        #

				        filter_out_pattern: '^Revert "'

				        # The format of a re-re-vert commit is as follows:

				        #

				        # Reapply "<original-subject-line>"

				        filter_out_pattern: '^Revert "|^Reapply "'

				    - name: DCO Check

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20

				      uses: tim-actions/dco@2fd0504dc0d27b33f542867c300c60840c6dcb20 # master (2020-04-28)

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				    - name: Commit Body Missing Check

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}

				      uses: tim-actions/commit-body-check@v1.0.2

				      if: ${{ success() || failure() }}

				      uses: tim-actions/commit-body-check@d2e0e8e1f0332b3281c98867c42a2fbe25ad3f15 # v1.0.2

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				    - name: Check Subject Line Length

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '^.{0,75}(\n.*)*$'

				@@ -56,8 +62,8 @@ jobs:

				        post_error: ${{ env.error_msg }}

				    - name: Check Body Line Length

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        # Notes:

				@@ -86,20 +92,9 @@ jobs:

				        error: 'Body line too long (max 150)'

				        post_error: ${{ env.error_msg }}

				    - name: Check Fixes

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '\s*Fixes\s*:?\s*(#\d+|github\.com\/kata-containers\/[a-z-.]*#\d+)|^\s*release\s*:'

				        flags: 'i'

				        error: 'No "Fixes" found'

				        post_error: ${{ env.error_msg }}

				        one_pass_all_pass: 'true'

				    - name: Check Subsystem

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@v0.3.1

				      if: ${{ (env.PR_AUTHOR != 'dependabot[bot]') && ( success() || failure() ) }}

				      uses: tim-actions/commit-message-checker-with-regex@d6d9770051dd6460679d1cab1dcaa8cffc5c2bbd # v0.3.1

				      with:

				        commits: ${{ steps.get-pr-commits.outputs.commits }}

				        pattern: '^[\s\t]*[^:\s\t]+[\s\t]*:'
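To illustrate what the patterns in this workflow accept, here is a rough sketch that applies three of them to sample commit messages. The pattern strings are copied from the diff above; everything else (the function name `check_commit` and the failure labels) is an assumption made for illustration. The real checks run inside the `tim-actions` actions, which evaluate the patterns as JavaScript regular expressions, so Python's `re` is only an approximation here.

```python
import re

# Patterns copied from the workflow; Python's `re` stands in for the
# JavaScript regex engine that the actions actually use.
FILTER_OUT = re.compile(r'^Revert "|^Reapply "')
SUBJECT_LENGTH = re.compile(r'^.{0,75}(\n.*)*$')
SUBSYSTEM = re.compile(r'^[\s\t]*[^:\s\t]+[\s\t]*:')


def check_commit(message: str) -> list[str]:
    """Return hypothetical labels for the checks that `message` fails."""
    if FILTER_OUT.search(message):
        # Revert/Reapply commits are filtered out before any check runs.
        return []
    failures = []
    if not SUBJECT_LENGTH.search(message):
        failures.append("subject-length")
    if not SUBSYSTEM.search(message):
        failures.append("subsystem")
    return failures


if __name__ == "__main__":
    print(check_commit("runtime: generate proto files\n\nBody text."))
    print(check_commit("fix bug"))
```

The subject-length pattern only bounds the first line: `.` does not match a newline, so `.{0,75}` must consume the whole subject before the optional `(\n.*)*` body lines begin.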

									
.github/workflows/darwin-tests.yaml
												
				@@ -5,7 +5,9 @@ on:

				      - edited

				      - reopened

				      - synchronize

				    paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				@@ -17,10 +19,12 @@ jobs:

				    runs-on: macos-latest

				    steps:

				    - name: Install Go

				      uses: actions/setup-go@v2

				      uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0

				      with:

				        go-version: 1.19.3

				        go-version: 1.23.10

				    - name: Checkout code

				      uses: actions/checkout@v2

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    - name: Build utils

				      run: ./ci/darwin-test.sh

									
.github/workflows/docs-url-alive-check.yaml
												
				@@ -2,36 +2,35 @@ on:

				  schedule:

				    - cron:  '0 23 * * 0'

				permissions:

				  contents: read

				name: Docs URL Alive Check

				jobs:

				  test:

				    runs-on: ubuntu-20.04

				    runs-on: ubuntu-22.04

				    # don't run this action on forks

				    if: github.repository_owner == 'kata-containers'

				    env:

				      target_branch: ${{ github.base_ref }}

				    steps:

				    - name: Install Go

				      uses: actions/setup-go@v2

				      uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0

				      with:

				        go-version: 1.19.3

				        go-version: 1.23.10

				      env:

				        GOPATH: ${{ runner.workspace }}/kata-containers

				        GOPATH: ${{ github.workspace }}/kata-containers

				    - name: Set env

				      run: |

				        echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV

				        echo "${{ github.workspace }}/bin" >> $GITHUB_PATH

				        echo "GOPATH=${{ github.workspace }}" >> "$GITHUB_ENV"

				        echo "${{ github.workspace }}/bin" >> "$GITHUB_PATH"

				    - name: Checkout code

				      uses: actions/checkout@v2

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        fetch-depth: 0

				        persist-credentials: false

				        path: ./src/github.com/${{ github.repository }}

				    - name: Setup

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh

				      env:

				        GOPATH: ${{ runner.workspace }}/kata-containers

				    # docs url alive check

				    - name: Docs URL Alive Check

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && make docs-url-alive-check

				        cd "${GOPATH}/src/github.com/${{ github.repository }}" && make docs-url-alive-check

									
.github/workflows/gatekeeper-skipper.yaml
												
				@@ -0,0 +1,55 @@

				name: Skipper

				# This workflow sets various "skip_*" output values that can be used to

				# determine what workflows/jobs are expected to be executed. Sample usage:

				#

				#   skipper:

				#     uses: ./.github/workflows/gatekeeper-skipper.yaml

				#     with:

				#       commit-hash: ${{ github.event.pull_request.head.sha }}

				#       target-branch: ${{ github.event.pull_request.base.ref }}

				#

				#   your-workflow:

				#     needs: skipper

				#     if: ${{ needs.skipper.outputs.skip_build != 'yes' }}

				on:

				  workflow_call:

				    inputs:

				      commit-hash:

				        required: true

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    outputs:

				      skip_build:

				        value: ${{ jobs.skipper.outputs.skip_build }}

				      skip_test:

				        value: ${{ jobs.skipper.outputs.skip_test }}

				      skip_static:

				        value: ${{ jobs.skipper.outputs.skip_static }}

				permissions:

				  contents: read

				jobs:

				  skipper:

				    runs-on: ubuntu-22.04

				    outputs:

				      skip_build: ${{ steps.skipper.outputs.skip_build }}

				      skip_test: ${{ steps.skipper.outputs.skip_test }}

				      skip_static: ${{ steps.skipper.outputs.skip_static }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - id: skipper

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				        run: |

				          python3 tools/testing/gatekeeper/skips.py | tee -a "$GITHUB_OUTPUT"

				        shell: /usr/bin/bash -x {0}
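The skipper step pipes the script's standard output into `$GITHUB_OUTPUT`, which GitHub Actions reads as `name=value` lines. The source of `tools/testing/gatekeeper/skips.py` is not part of this diff, so the sketch below only shows the general shape such output could take. The function `format_outputs` and the `no` value are assumptions; the workflow itself only shows comparisons against `'yes'`.

```python
def format_outputs(skips: dict[str, bool]) -> str:
    """Render skip decisions as `name=value` lines suitable for $GITHUB_OUTPUT.

    Hypothetical helper: the real skips.py may format its output differently.
    """
    return "\n".join(
        f"{name}={'yes' if skip else 'no'}" for name, skip in skips.items()
    )


if __name__ == "__main__":
    # Hypothetical decision: skip build and test, but still run static checks.
    print(format_outputs({"skip_build": True, "skip_test": True, "skip_static": False}))
```

Each printed line then becomes one step output, for example `steps.skipper.outputs.skip_build`, which the job-level `outputs` mapping above forwards to callers.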

									
.github/workflows/gatekeeper.yaml
												
				@@ -0,0 +1,53 @@

				name: Gatekeeper

				# Gatekeeper uses "skips.py" to determine which job names/regexps are

				# required for a given PR, waits for them to either complete or fail,

				# and reports the status.

				on:

				  pull_request_target:

				    types:

				      - opened

				      - synchronize

				      - reopened

				      - labeled

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  gatekeeper:

				    runs-on: ubuntu-22.04

				    permissions:

				      actions: read

				      contents: read

				      issues: read

				      pull-requests: read

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ github.event.pull_request.head.sha }}

				          fetch-depth: 0

				          persist-credentials: false

				      - id: gatekeeper

				        env:

				          TARGET_BRANCH: ${{ github.event.pull_request.base.ref }}

				          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

				          COMMIT_HASH: ${{ github.event.pull_request.head.sha }}

				          GH_PR_NUMBER: ${{ github.event.pull_request.number }}

				        run: |

				          #!/usr/bin/env bash -x

				          mapfile -t lines < <(python3 tools/testing/gatekeeper/skips.py -t)

				          export REQUIRED_JOBS="${lines[0]}"

				          export REQUIRED_REGEXPS="${lines[1]}"

				          export REQUIRED_LABELS="${lines[2]}"

				          echo "REQUIRED_JOBS: $REQUIRED_JOBS"

				          echo "REQUIRED_REGEXPS: $REQUIRED_REGEXPS"

				          echo "REQUIRED_LABELS: $REQUIRED_LABELS"

				          python3 tools/testing/gatekeeper/jobs.py

				          exit $?

				        shell: /usr/bin/bash -x {0}
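The `mapfile` call in the step above reads the first three output lines of `skips.py -t` into `REQUIRED_JOBS`, `REQUIRED_REGEXPS`, and `REQUIRED_LABELS`. As a rough illustration of that mapping only, the same split can be written in Python. The function `parse_required` is hypothetical, and the format of each individual line is not shown in this diff, so the sample values are placeholders.

```python
def parse_required(output: str) -> dict[str, str]:
    """Map the first three output lines to the variables used by the workflow."""
    lines = output.splitlines()
    # Pad missing trailing lines with empty strings, mirroring how an unset
    # bash array element expands to an empty string (without `set -u`).
    lines += [""] * (3 - len(lines))
    return {
        "REQUIRED_JOBS": lines[0],
        "REQUIRED_REGEXPS": lines[1],
        "REQUIRED_LABELS": lines[2],
    }


if __name__ == "__main__":
    # Placeholder values; the real line contents come from skips.py.
    print(parse_required("job-a;job-b\nregexp-1\nlabel-x\n"))
```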

									
.github/workflows/govulncheck.yaml
												
				@@ -0,0 +1,50 @@

				on:

				  workflow_call:

				name: Govulncheck

				permissions:

				  contents: read

				jobs:

				  govulncheck:

				    runs-on: ubuntu-22.04

				    strategy:

				      matrix:

				        include:

				          - binary: "kata-runtime"

				            make_target: "runtime"

				          - binary: "containerd-shim-kata-v2" 

				            make_target: "containerd-shim-v2"

				          - binary: "kata-monitor"

				            make_target: "monitor"

				      fail-fast: false

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install golang

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "${GITHUB_PATH}"

				      - name: Install govulncheck

				        run: |

				          go install golang.org/x/vuln/cmd/govulncheck@latest

				          echo "${HOME}/go/bin" >> "${GITHUB_PATH}"

				      - name: Build runtime binaries

				        run: |

				          cd src/runtime

				          make ${{ matrix.make_target }}

				        env:

				          SKIP_GO_VERSION_CHECK: "1"

				      - name: Run govulncheck on ${{ matrix.binary }}

				        run: |

				          cd src/runtime

				          bash ../../tests/govulncheck-runner.sh "./${{ matrix.binary }}"

									
.github/workflows/kata-runtime-classes-sync.yaml
												
				@@ -6,23 +6,28 @@ on:

				      - reopened

				      - synchronize

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  kata-deploy-runtime-classes-check:

				    runs-on: ubuntu-latest

				    runs-on: ubuntu-22.04

				    steps:

				    - name: Checkout code

				      uses: actions/checkout@v3

				      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				      with:

				        persist-credentials: false

				    - name: Ensure the split out runtime classes match the all-in-one file

				      run: |

				        pushd tools/packaging/kata-deploy/runtimeclasses/

				        echo "::group::Combine runtime classes"

				        for runtimeClass in `find . -type f \( -name "*.yaml" -and -not -name "kata-runtimeClasses.yaml" \) | sort`; do

				        for runtimeClass in $(find . -type f \( -name "*.yaml" -and -not -name "kata-runtimeClasses.yaml" \) | sort); do

				            echo "Adding ${runtimeClass} to the resultingRuntimeClasses.yaml"

				            cat ${runtimeClass} >> resultingRuntimeClasses.yaml;

				            cat "${runtimeClass}" >> resultingRuntimeClasses.yaml;

				        done

				        echo "::endgroup::"

				        echo "::group::Displaying the content of resultingRuntimeClasses.yaml"

									
.github/workflows/move-issues-to-in-progress.yaml
											
				@@ -1,82 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Move issues to "In progress" in backlog project when referenced by a PR

				on:

				  pull_request_target:

				    types:

				      - opened

				      - reopened

				jobs:

				  move-linked-issues-to-in-progress:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Install hub extension script

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install hub-util.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Checkout code to allow hub to communicate with the project

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        uses: actions/checkout@v2

				      - name: Move issue to "In progress"

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          pr=${{ github.event.pull_request.number }}

				          linked_issue_urls=$(hub-util.sh \

				            list-issues-for-pr "$pr" |\

				            grep -v "^\#"  |\

				            cut -d';' -f3 || true)

				          # PR doesn't have any linked issues

				          # (it should, but maybe a new user forgot to add a "Fixes: #XXX" commit).

				          [ -z "$linked_issue_urls" ] && {

				            echo "::error::No linked issues for PR $pr"

				            exit 1

				          }

				          project_name="Issue backlog"

				          project_type="org"

				          project_column="In progress"

				          for issue_url in $(echo "$linked_issue_urls")

				          do

				            issue=$(echo "$issue_url"| awk -F\/ '{print $NF}' || true)

				            [ -z "$issue" ] && {

				              echo "::error::Cannot determine issue number from $issue_url for PR $pr"

				              exit 1

				            }

				            # Move the issue to the correct column on the project board

				            hub-util.sh \

				              move-issue \

				              "$issue" \

				              "$project_name" \

				              "$project_type" \

				              "$project_column"

				          done
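The loop above takes the issue number to be the last path component of each linked issue URL. A minimal sketch of that extraction in isolation, assuming URLs of the form `https://github.com/<org>/<repo>/issues/<number>`. The helper name `issue_number_from_url` and the sample URLs are illustrative and not part of the workflow:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last '/'-separated field of a URL.
issue_number_from_url() {
    local url="$1"
    echo "$url" | awk -F/ '{print $NF}'
}

# Sample input (assumed shape): whitespace-separated URLs, as the loop expects.
linked_issue_urls="https://github.com/kata-containers/kata-containers/issues/11631
https://github.com/kata-containers/kata-containers/issues/123"

for issue_url in $linked_issue_urls; do
    issue=$(issue_number_from_url "$issue_url")
    echo "issue=$issue"
done
```

A URL that ends in a trailing slash yields an empty result, which is the case the workflow's `[ -z "$issue" ]` check guards against.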

									
.github/workflows/osv-scanner.yaml
												
				@@ -0,0 +1,41 @@

				# A sample workflow which sets up periodic OSV-Scanner scanning for vulnerabilities,

				# in addition to a PR check which fails if new vulnerabilities are introduced.

				#

				# For more examples and options, including how to ignore specific vulnerabilities,

				# see https://google.github.io/osv-scanner/github-action/

				name: OSV-Scanner

				on:

				  workflow_dispatch:

				  pull_request:

				    branches: [ "main" ]

				  schedule:

				    - cron: '0 1 * * 0'

				  push:

				    branches: [ "main" ]

				jobs:

				  scan-scheduled:

				    permissions:

				      actions: read # Required to upload SARIF file to CodeQL

				      contents: read  # Read commit contents

				      security-events: write  # Required to write security events and upload the SARIF file to the security tab

				    if: ${{ github.event_name == 'push' || github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}

				    uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0

				    with:

				      scan-args: |-

				        -r

				        ./

				  scan-pr:

				    permissions:

				      actions: read # Required to upload SARIF file to CodeQL

				      contents: read  # Read commit contents

				      security-events: write  # Required to write security events and upload the SARIF file to the security tab

				    if: ${{ github.event_name == 'pull_request' }}

				    uses: "google/osv-scanner-action/.github/workflows/osv-scanner-reusable-pr.yml@b00f71e051ddddc6e46a193c31c8c0bf283bf9e6" # v2.1.0

				    with:

				      # Example of specifying custom arguments

				      scan-args: |-

				        -r

				        ./

									
.github/workflows/payload-after-push.yaml
												
				@@ -3,82 +3,160 @@ on:

				  push:

				    branches:

				      - main

				      - stable-*

				  workflow_dispatch:

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  build-assets-amd64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				    secrets: inherit

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-assets-arm64:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				    secrets: inherit

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-assets-s390x:

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				    secrets: inherit

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-assets-ppc64le:

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      push-to-registry: yes

				      target-branch: ${{ github.ref_name }}

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-amd64:

				    needs: build-assets-amd64

				    uses: ./.github/workflows/publish-kata-deploy-payload-amd64.yaml

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-amd64

				    secrets: inherit

				      tag: kata-containers-latest-amd64

				      target-branch: ${{ github.ref_name }}

				      runner: ubuntu-22.04

				      arch: amd64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-arm64:

				    needs: build-assets-arm64

				    uses: ./.github/workflows/publish-kata-deploy-payload-arm64.yaml

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-arm64

				    secrets: inherit

				      tag: kata-containers-latest-arm64

				      target-branch: ${{ github.ref_name }}

				      runner: ubuntu-22.04-arm

				      arch: arm64

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-s390x:

				    needs: build-assets-s390x

				    uses: ./.github/workflows/publish-kata-deploy-payload-s390x.yaml

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-s390x

				    secrets: inherit

				      tag: kata-containers-latest-s390x

				      target-branch: ${{ github.ref_name }}

				      runner: s390x

				      arch: s390x

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-kata-deploy-payload-ppc64le:

				    needs: build-assets-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    uses: ./.github/workflows/publish-kata-deploy-payload.yaml

				    with:

				      commit-hash: ${{ github.sha }}

				      registry: quay.io

				      repo: kata-containers/kata-deploy-ci

				      tag: kata-containers-latest-ppc64le

				      target-branch: ${{ github.ref_name }}

				      runner: ppc64le

				      arch: ppc64le

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-manifest:

				    runs-on: ubuntu-latest

				    needs: [publish-kata-deploy-payload-amd64, publish-kata-deploy-payload-arm64, publish-kata-deploy-payload-s390x]

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      packages: write

				    needs: [publish-kata-deploy-payload-amd64, publish-kata-deploy-payload-arm64, publish-kata-deploy-payload-s390x, publish-kata-deploy-payload-ppc64le]

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@v3

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Push multi-arch manifest

				        run: |

				          docker manifest create quay.io/kata-containers/kata-deploy-ci:kata-containers-latest \

				          --amend quay.io/kata-containers/kata-deploy-ci:kata-containers-amd64 \

				          --amend quay.io/kata-containers/kata-deploy-ci:kata-containers-arm64 \

				          --amend quay.io/kata-containers/kata-deploy-ci:kata-containers-s390x

				          docker manifest push quay.io/kata-containers/kata-deploy-ci:kata-containers-latest

				          ./tools/packaging/release/release.sh publish-multiarch-manifest

				        env:

				          KATA_DEPLOY_IMAGE_TAGS: "kata-containers-latest"

				          KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy-ci"
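The inline `docker manifest create` commands are replaced here by a call to `release.sh publish-multiarch-manifest`, configured through `KATA_DEPLOY_IMAGE_TAGS` and `KATA_DEPLOY_REGISTRIES`. The following dry-run sketch only prints which per-architecture images such a manifest would combine, given the `<tag>-<arch>` naming used by the publish jobs above. It is not the implementation in `release.sh`; treating both variables as space-separated lists, the `ARCHES` variable, and the function name are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Dry-run sketch: list the image references a multi-arch manifest would
# combine. Nothing is created or pushed.
KATA_DEPLOY_IMAGE_TAGS="${KATA_DEPLOY_IMAGE_TAGS:-kata-containers-latest}"
KATA_DEPLOY_REGISTRIES="${KATA_DEPLOY_REGISTRIES:-quay.io/kata-containers/kata-deploy-ci}"
# Assumed architecture list, taken from the jobs named in `needs:`.
ARCHES="amd64 arm64 s390x ppc64le"

list_manifest_members() {
    local registry tag arch
    for registry in $KATA_DEPLOY_REGISTRIES; do
        for tag in $KATA_DEPLOY_IMAGE_TAGS; do
            # The manifest list itself.
            echo "${registry}:${tag}"
            # One member image per architecture.
            for arch in $ARCHES; do
                echo "  ${registry}:${tag}-${arch}"
            done
        done
    done
}

list_manifest_members
```

With the values from the `env:` block, the members match the `kata-containers-latest-<arch>` tags that the four publish jobs push.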

									
.github/workflows/publish-kata-deploy-payload-amd64.yaml
											
				@@ -1,55 +0,0 @@

				name: CI | Publish kata-deploy payload for amd64

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  kata-payload:

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.registry == 'quay.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Login to Kata Containers ghcr.io

				        if: ${{ inputs.registry == 'ghcr.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: build-and-push-kata-payload

				        id: build-and-push-kata-payload

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				          $(pwd)/kata-static.tar.xz \

				          ${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

									
.github/workflows/publish-kata-deploy-payload-arm64.yaml
											
				@@ -1,60 +0,0 @@

				name: CI | Publish kata-deploy payload for arm64

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  kata-payload:

				    runs-on: arm64

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-arm64${{ inputs.tarball-suffix }}

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.registry == 'quay.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Login to Kata Containers ghcr.io

				        if: ${{ inputs.registry == 'ghcr.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: build-and-push-kata-payload

				        id: build-and-push-kata-payload

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				          $(pwd)/kata-static.tar.xz \

				          ${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

									
.github/workflows/publish-kata-deploy-payload-s390x.yaml
											
				@@ -1,59 +0,0 @@

				name: CI | Publish kata-deploy payload for s390x

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  kata-payload:

				    runs-on: s390x

				    steps:

				      - name: Adjust a permission for repo

				        run: |

				          sudo chown -R $USER:$USER $GITHUB_WORKSPACE

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-s390x${{ inputs.tarball-suffix }}

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.registry == 'quay.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Login to Kata Containers ghcr.io

				        if: ${{ inputs.registry == 'ghcr.io' }}

				        uses: docker/login-action@v2

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: build-and-push-kata-payload

				        id: build-and-push-kata-payload

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				          $(pwd)/kata-static.tar.xz \

				          ${{ inputs.registry }}/${{ inputs.repo }} ${{ inputs.tag }}

									
.github/workflows/publish-kata-deploy-payload.yaml
												
				@@ -0,0 +1,90 @@

				name: CI | Publish kata-deploy payload

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      runner:

				        default: 'ubuntu-22.04'

				        description: The runner to execute the workflow on. Defaults to 'ubuntu-22.04'.

				        required: false

				        type: string

				      arch:

				        description: The arch of the tarball.

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  kata-payload:

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ${{ inputs.runner }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball for ${{ inputs.arch }}

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-${{ inputs.arch}}${{ inputs.tarball-suffix }}

				      - name: Login to Kata Containers quay.io

				        if: ${{ inputs.registry == 'quay.io' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Login to Kata Containers ghcr.io

				        if: ${{ inputs.registry == 'ghcr.io' }}

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: build-and-push-kata-payload for ${{ inputs.arch }}

				        id: build-and-push-kata-payload

				        env:

				          REGISTRY: ${{ inputs.registry }}

				          REPO: ${{ inputs.repo }}

				          TAG: ${{ inputs.tag }}

				        run: |

				          ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				          "$(pwd)/kata-static.tar.xz" \

				          "${REGISTRY}/${REPO}" \

				          "${TAG}"
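The unified workflow hands `inputs.registry`, `inputs.repo`, and `inputs.tag` to the script through environment variables and quotes each expansion, whereas the deleted per-arch workflows interpolated the `${{ }}` expressions directly into the `run:` body. A likely motivation is that a value read from a quoted variable stays a single argument and is never parsed as shell syntax. A small sketch of that property, where `count_args` and the sample values are illustrative only:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it received.
count_args() {
    echo "$#"
}

REGISTRY="quay.io"
REPO="kata-containers/kata-deploy-ci"
# A deliberately awkward value containing spaces and a shell metacharacter.
TAG='latest; echo injected'

# Quoted expansions: the tag stays one argument and the ';' is plain data.
count_args "${REGISTRY}/${REPO}" "${TAG}"

# Unquoted expansion: word splitting turns the tag into three arguments.
# shellcheck disable=SC2086
count_args "${REGISTRY}/${REPO}" ${TAG}
```

In neither call is `echo injected` executed, because the shell does not re-parse the contents of a variable; text pasted into the script by expression interpolation gets no such protection.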

									
.github/workflows/release-amd64.yaml
												
				@@ -5,49 +5,75 @@ on:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-kata-static-tarball-amd64:

				    uses: ./.github/workflows/build-kata-static-tarball-amd64.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    needs: build-kata-static-tarball-amd64

				    runs-on: ubuntu-latest

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Login to Kata Containers docker.io

				        uses: docker/login-action@v2

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.DOCKER_USERNAME }}

				          password: ${{ secrets.DOCKER_PASSWORD }}

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64

				      - name: build-and-push-kata-deploy-ci-amd64

				        id: build-and-push-kata-deploy-ci-amd64

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # We need to do such trick here as the format of the $GITHUB_REF 

				          # We need to do such trick here as the format of the $GITHUB_REF

				          # is "refs/tags/<tag>"

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tags=($tag)

				          tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))

				          for tag in ${tags[@]}; do

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done
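The updated step derives the image tags from `$GITHUB_REF`: a run on `main` publishes the release version plus `latest`, while any other ref publishes only its own name. A minimal sketch of that branching, assuming refs of the form `refs/heads/<branch>` or `refs/tags/<tag>`. The `release_version` function is a hypothetical stand-in for `./tools/packaging/release/release.sh release-version`, and the version strings are made up:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the release.sh call; the value is illustrative.
release_version() {
    echo "0.0.0-example"
}

# Print the image tags (space-separated) for a given Git ref.
tags_for_ref() {
    local ref="$1" tag
    # Keep everything after the second '/', e.g. "refs/tags/<tag>" -> "<tag>".
    tag=$(echo "$ref" | cut -d/ -f3-)
    if [ "${tag}" = "main" ]; then
        tag=$(release_version)
        echo "${tag} latest"
    else
        echo "${tag}"
    fi
}

tags_for_ref "refs/heads/main"
tags_for_ref "refs/tags/1.2.3"
```

In the workflow each resulting tag is then suffixed with `-${TARGET_ARCH}` and pushed to both `ghcr.io` and `quay.io`.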

									
.github/workflows/release-arm64.yaml
												
				@@ -5,49 +5,75 @@ on:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-kata-static-tarball-arm64:

				    uses: ./.github/workflows/build-kata-static-tarball-arm64.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    needs: build-kata-static-tarball-arm64

				    runs-on: arm64

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ubuntu-22.04-arm

				    steps:

				      - name: Login to Kata Containers docker.io

				        uses: docker/login-action@v2

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.DOCKER_USERNAME }}

				          password: ${{ secrets.DOCKER_PASSWORD }}

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-arm64

				      - name: build-and-push-kata-deploy-ci-arm64

				        id: build-and-push-kata-deploy-ci-arm64

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # We need to do such trick here as the format of the $GITHUB_REF 

				          # We need to do such trick here as the format of the $GITHUB_REF

				          # is "refs/tags/<tag>"

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tags=($tag)

				          tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))

				          for tag in ${tags[@]}; do

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done

									
.github/workflows/release-ppc64le.yaml
												
				@@ -0,0 +1,79 @@

				name: Publish Kata release artifacts for ppc64le

				on:

				  workflow_call:

				    inputs:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-kata-static-tarball-ppc64le:

				    uses: ./.github/workflows/build-kata-static-tarball-ppc64le.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    needs: build-kata-static-tarball-ppc64le

				    permissions:

				      contents: read

				      packages: write

				    runs-on: ppc64le

				    steps:

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-ppc64le

				      - name: build-and-push-kata-deploy-ci-ppc64le

				        id: build-and-push-kata-deploy-ci-ppc64le

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # We need to do such trick here as the format of the $GITHUB_REF

				          # is "refs/tags/<tag>"

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.xz "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  "$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done

									
.github/workflows/release-s390x.yaml
												
				@@ -5,49 +5,79 @@ on:

				      target-arch:

				        required: true

				        type: string

				    secrets:

				      CI_HKD_PATH:

				        required: true

				      QUAY_DEPLOYER_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  build-kata-static-tarball-s390x:

				    uses: ./.github/workflows/build-kata-static-tarball-s390x.yaml

				    with:

				      push-to-registry: yes

				      stage: release

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				  kata-deploy:

				    needs: build-kata-static-tarball-s390x

				    permissions:

				      contents: read

				      packages: write

				    runs-on: s390x

				    steps:

				      - name: Login to Kata Containers docker.io

				        uses: docker/login-action@v2

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          username: ${{ secrets.DOCKER_USERNAME }}

				          password: ${{ secrets.DOCKER_PASSWORD }}

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x

				      - name: build-and-push-kata-deploy-ci-s390x

				        id: build-and-push-kata-deploy-ci-s390x

				        env:

				          TARGET_ARCH: ${{ inputs.target-arch }}

				        run: |

				          # We need to do such trick here as the format of the $GITHUB_REF 

				          # We need to do such trick here as the format of the $GITHUB_REF

				          # is "refs/tags/<tag>"

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tags=($tag)

				          tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))

				          for tag in ${tags[@]}; do

				          tag=$(echo "$GITHUB_REF" | cut -d/ -f3-)

				          if [ "${tag}" = "main" ]; then

				              tag=$(./tools/packaging/release/release.sh release-version)

				              tags=("${tag}" "latest")

				          else

				              tags=("${tag}")

				          fi

				          for tag in "${tags[@]}"; do

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "docker.io/katadocker/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "ghcr.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				              ./tools/packaging/kata-deploy/local-build/kata-deploy-build-and-upload-payload.sh \

				                  $(pwd)/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${{ inputs.target-arch }}"

				                  "$(pwd)"/kata-static.tar.xz "quay.io/kata-containers/kata-deploy" \

				                  "${tag}-${TARGET_ARCH}"

				          done
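
The new version of the step above derives the image tags from `$GITHUB_REF`: it drops the first two path components with `cut -d/ -f3-` and, when the result is `main`, substitutes the release version and adds `latest`. As a rough sketch only, the same logic can be written as a standalone function. The function name `derive_tags` and the version numbers are hypothetical; in the workflow the version comes from `./tools/packaging/release/release.sh release-version`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): derive image tags from
# a Git ref in the same way as the workflow step.
#   $1: a ref such as "refs/tags/3.19.0" or "refs/heads/main"
#   $2: version to use when the ref resolves to "main" (assumed to be the
#       output of release.sh release-version in the real workflow)
derive_tags() {
    ref=$1
    release_version=$2
    # Drop the first two "/"-separated fields ("refs" plus "tags" or "heads").
    tag=$(printf '%s\n' "$ref" | cut -d/ -f3-)
    if [ "$tag" = "main" ]; then
        # Builds from main are published under the release version and "latest".
        printf '%s latest\n' "$release_version"
    else
        printf '%s\n' "$tag"
    fi
}

# Illustrative calls with made-up version numbers.
derive_tags "refs/tags/3.19.0" "unused"
derive_tags "refs/heads/main" "3.20.0"
```

Because `-f3-` keeps everything from the third field onwards, a tag that itself contains a slash is preserved whole.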

									

.github/workflows/release.yaml (349 lines changed)

				@@ -1,179 +1,282 @@

				name: Publish Kata release artifacts

				name: Release Kata Containers

				on:

				  push:

				    tags:

				      - '[0-9]+.[0-9]+.[0-9]+*'

				  workflow_dispatch

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				permissions:

				  contents: read

				jobs:

				  release:

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release create` command

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Create a new release

				        run: |

				          ./tools/packaging/release/release.sh create-new-release

				        env:

				          GH_TOKEN: ${{ github.token }}

				  build-and-push-assets-amd64:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-amd64.yaml

				    with:

				      target-arch: amd64

				    secrets: inherit

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-push-assets-arm64:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-arm64.yaml

				    with:

				      target-arch: arm64

				    secrets: inherit

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-push-assets-s390x:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-s390x.yaml

				    with:

				      target-arch: s390x

				    secrets: inherit

				    secrets:

				      CI_HKD_PATH: ${{ secrets.CI_HKD_PATH }}

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  build-and-push-assets-ppc64le:

				    needs: release

				    permissions:

				      contents: read

				      packages: write

				      id-token: write

				      attestations: write

				    uses: ./.github/workflows/release-ppc64le.yaml

				    with:

				      target-arch: ppc64le

				    secrets:

				      QUAY_DEPLOYER_PASSWORD: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				  publish-multi-arch-images:

				    runs-on: ubuntu-latest

				    needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x]

				    runs-on: ubuntu-22.04

				    needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]

				    permissions:

				      contents: write # needed for the `gh release` commands

				      packages: write # needed to push the multi-arch manifest to ghcr.io

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@v3

				      - name: Login to Kata Containers docker.io

				        uses: docker/login-action@v2

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          username: ${{ secrets.DOCKER_USERNAME }}

				          password: ${{ secrets.DOCKER_PASSWORD }}

				          persist-credentials: false

				      - name: Login to Kata Containers ghcr.io

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: ghcr.io

				          username: ${{ github.actor }}

				          password: ${{ secrets.GITHUB_TOKEN }}

				      - name: Login to Kata Containers quay.io

				        uses: docker/login-action@v2

				        uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0

				        with:

				          registry: quay.io

				          username: ${{ secrets.QUAY_DEPLOYER_USERNAME }}

				          username: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          password: ${{ secrets.QUAY_DEPLOYER_PASSWORD }}

				      - name: Push multi-arch manifest

				      - name: Get the image tags

				        run: |

				          # tag the container image we created and push to DockerHub

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tags=($tag)

				          tags+=($([[ "$tag" =~ "alpha"|"rc" ]] && echo "latest" || echo "stable"))

				          # push to quay.io and docker.io

				          for tag in ${tags[@]}; do

				            docker manifest create quay.io/kata-containers/kata-deploy:${tag} \

				              --amend quay.io/kata-containers/kata-deploy:${tag}-amd64 \

				              --amend quay.io/kata-containers/kata-deploy:${tag}-arm64 \

				              --amend quay.io/kata-containers/kata-deploy:${tag}-s390x

				          release_version=$(./tools/packaging/release/release.sh release-version)

				          echo "KATA_DEPLOY_IMAGE_TAGS=$release_version latest" >> "$GITHUB_ENV"

				            docker manifest create docker.io/katadocker/kata-deploy:${tag} \

				              --amend docker.io/katadocker/kata-deploy:${tag}-amd64 \

				              --amend docker.io/katadocker/kata-deploy:${tag}-arm64 \

				              --amend docker.io/katadocker/kata-deploy:${tag}-s390x

				            docker manifest push quay.io/kata-containers/kata-deploy:${tag}

				            docker manifest push docker.io/katadocker/kata-deploy:${tag}

				          done

				      - name: Publish multi-arch manifest on quay.io & ghcr.io

				        run: |

				          ./tools/packaging/release/release.sh publish-multiarch-manifest

				        env:

				          KATA_DEPLOY_REGISTRIES: "quay.io/kata-containers/kata-deploy ghcr.io/kata-containers/kata-deploy"

				  upload-multi-arch-static-tarball:

				    needs: publish-multi-arch-images

				    runs-on: ubuntu-latest

				    needs: [build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le]

				    permissions:

				      contents: write # needed for the `gh release` commands

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/checkout@v3

				      - name: install hub

				        run: |

				          wget -q -O- https://github.com/mislav/hub/releases/download/v2.14.2/hub-linux-amd64-2.14.2.tgz | \

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && sudo mv hub /usr/local/bin/hub

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: download-artifacts-amd64

				        uses: actions/download-artifact@v3

				      - name: Set KATA_STATIC_TARBALL env var

				        run: |

				          tarball=$(pwd)/kata-static.tar.xz

				          echo "KATA_STATIC_TARBALL=${tarball}" >> "$GITHUB_ENV"

				      - name: Download amd64 artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64

				      - name: push amd64 static tarball to github

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-static-$tag-amd64.tar.xz"

				          mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"

				          pushd $GITHUB_WORKSPACE

				          echo "uploading asset '${tarball}' for tag: ${tag}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"

				          popd

				      - name: download-artifacts-arm64

				        uses: actions/download-artifact@v3

				      - name: Upload amd64 static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: amd64

				      - name: Download arm64 artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-arm64

				      - name: push arm64 static tarball to github

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-static-$tag-arm64.tar.xz"

				          mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"

				          pushd $GITHUB_WORKSPACE

				          echo "uploading asset '${tarball}' for tag: ${tag}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"

				          popd

				      - name: download-artifacts-s390x

				        uses: actions/download-artifact@v3

				      - name: Upload arm64 static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: arm64

				      - name: Download s390x artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-s390x

				      - name: push s390x static tarball to github

				      - name: Upload s390x static tarball to GitHub

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-static-$tag-s390x.tar.xz"

				          mv kata-static.tar.xz "$GITHUB_WORKSPACE/${tarball}"

				          pushd $GITHUB_WORKSPACE

				          echo "uploading asset '${tarball}' for tag: ${tag}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"

				          popd

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: s390x

				      - name: Download ppc64le artifacts

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-ppc64le

				      - name: Upload ppc64le static tarball to GitHub

				        run: |

				          ./tools/packaging/release/release.sh upload-kata-static-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				          ARCHITECTURE: ppc64le

				  upload-versions-yaml:

				    runs-on: ubuntu-latest

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - uses: actions/checkout@v3

				      - name: upload versions.yaml

				        env:

				          GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Upload versions.yaml to GitHub

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          pushd $GITHUB_WORKSPACE

				          versions_file="kata-containers-$tag-versions.yaml"

				          cp versions.yaml ${versions_file}

				          hub release edit -m "" -a "${versions_file}" "${tag}"

				          popd

				          ./tools/packaging/release/release.sh upload-versions-yaml-file

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-cargo-vendored-tarball:

				    needs: upload-multi-arch-static-tarball

				    runs-on: ubuntu-latest

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - uses: actions/checkout@v3

				      - name: generate-and-upload-tarball

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Generate and upload vendored code tarball

				        run: |

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          tarball="kata-containers-$tag-vendor.tar.gz"

				          pushd $GITHUB_WORKSPACE

				          bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"

				          GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}" 

				          popd

				          ./tools/packaging/release/release.sh upload-vendored-code-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-libseccomp-tarball:

				    needs: upload-cargo-vendored-tarball

				    runs-on: ubuntu-latest

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - uses: actions/checkout@v3

				      - name: download-and-upload-tarball

				        env:

				          GITHUB_TOKEN: ${{ secrets.GIT_UPLOAD_TOKEN }}

				          GOPATH: ${HOME}/go

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Download libseccomp tarball and upload it to GitHub

				        run: |

				          pushd $GITHUB_WORKSPACE

				          ./ci/install_yq.sh

				          tag=$(echo $GITHUB_REF | cut -d/ -f3-)

				          versions_yaml="versions.yaml"

				          version=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.version")

				          repo_url=$(${GOPATH}/bin/yq read ${versions_yaml} "externals.libseccomp.url")

				          download_url="${repo_url}/releases/download/v${version}"

				          tarball="libseccomp-${version}.tar.gz"

				          asc="${tarball}.asc"

				          curl -sSLO "${download_url}/${tarball}"

				          curl -sSLO "${download_url}/${asc}"

				          # "-m" option should be empty to re-use the existing release title

				          # without opening a text editor.

				          # For the details, check https://hub.github.com/hub-release.1.html.

				          hub release edit -m "" -a "${tarball}" "${tag}"

				          hub release edit -m "" -a "${asc}" "${tag}"

				          popd

				          ./tools/packaging/release/release.sh upload-libseccomp-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				  upload-helm-chart-tarball:

				    needs: release

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				      packages: write # needed to push the helm chart to ghcr.io

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Install helm

				        uses: azure/setup-helm@fe7b79cd5ee1e45176fcad797de68ecaf3ca4814 # v4.2.0

				        id: install

				      - name: Generate and upload helm chart tarball

				        run: |

				          ./tools/packaging/release/release.sh upload-helm-chart-tarball

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: Login to the OCI registries

				        env:

				          QUAY_DEPLOYER_USERNAME: ${{ vars.QUAY_DEPLOYER_USERNAME }}

				          GITHUB_ACTOR: ${{ github.actor }}

				        run: |

				          echo "${{ secrets.QUAY_DEPLOYER_PASSWORD }}" | helm registry login quay.io --username "${QUAY_DEPLOYER_USERNAME}" --password-stdin

				          echo "${{ github.token }}" | helm registry login ghcr.io --username "${GITHUB_ACTOR}" --password-stdin

				      - name: Push helm chart to the OCI registries

				        run: |

				          release_version=$(./tools/packaging/release/release.sh release-version)

				          helm push "kata-deploy-${release_version}.tgz" oci://quay.io/kata-containers/kata-deploy-charts

				          helm push "kata-deploy-${release_version}.tgz" oci://ghcr.io/kata-containers/kata-deploy-charts

				  publish-release:

				    needs: [ build-and-push-assets-amd64, build-and-push-assets-arm64, build-and-push-assets-s390x, build-and-push-assets-ppc64le, publish-multi-arch-images, upload-multi-arch-static-tarball, upload-versions-yaml, upload-cargo-vendored-tarball, upload-libseccomp-tarball ]

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: write # needed for the `gh release` commands

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: Publish a release

				        run: |

				          ./tools/packaging/release/release.sh publish-release

				        env:

				          GH_TOKEN: ${{ github.token }}
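
The per-architecture release workflows push images tagged `${tag}-${TARGET_ARCH}`, and the removed inline step combined those into one manifest per tag with `docker manifest create --amend`. How `release.sh publish-multiarch-manifest` does this now is not shown in this diff, so the following is only a sketch of the naming convention. The helper `list_arch_images` and the tag value are hypothetical, and nothing here calls Docker.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print the
# per-architecture image references that a multi-arch manifest for
# "<registry>:<tag>" would be assembled from.
#   $1: image repository, e.g. quay.io/kata-containers/kata-deploy
#   $2: tag, e.g. a release version or "latest"
#   remaining arguments: architectures
list_arch_images() {
    registry=$1
    tag=$2
    shift 2
    for arch in "$@"; do
        printf '%s:%s-%s\n' "$registry" "$tag" "$arch"
    done
}

# Illustrative call: the four architectures that publish-multi-arch-images
# waits for in its "needs" list. The tag is a made-up example.
list_arch_images "quay.io/kata-containers/kata-deploy" "3.19.0" \
    amd64 arm64 s390x ppc64le
```

Each printed reference corresponds to one `--amend` argument in the removed `docker manifest create` invocation.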

									

.github/workflows/require-pr-porting-labels.yaml (deleted, 58 lines changed)

				@@ -1,58 +0,0 @@

				# Copyright (c) 2020 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				name: Ensure PR has required porting labels

				on:

				  pull_request_target:

				    types:

				      - opened

				      - reopened

				      - labeled

				      - unlabeled

				    branches:

				      - main

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  check-pr-porting-labels:

				    runs-on: ubuntu-latest

				    steps:

				      - name: Install hub

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          HUB_ARCH="amd64"

				          HUB_VER=$(curl -sL "https://api.github.com/repos/github/hub/releases/latest" |\

				            jq -r .tag_name | sed 's/^v//')

				          curl -sL \

				            "https://github.com/github/hub/releases/download/v${HUB_VER}/hub-linux-${HUB_ARCH}-${HUB_VER}.tgz" |\

				          tar xz --strip-components=2 --wildcards '*/bin/hub' && \

				          sudo install hub /usr/local/bin

				      - name: Checkout code to allow hub to communicate with the project

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        uses: actions/checkout@v2

				      - name: Install porting checker script

				        run: |

				          # Clone into a temporary directory to avoid overwriting

				          # any existing github directory.

				          pushd $(mktemp -d) &>/dev/null

				          git clone --single-branch --depth 1 "https://github.com/kata-containers/.github" && cd .github/scripts

				          sudo install pr-porting-checks.sh /usr/local/bin

				          popd &>/dev/null

				      - name: Stop PR being merged unless it has a correct set of porting labels

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        env:

				          GITHUB_TOKEN: ${{ secrets.KATA_GITHUB_ACTIONS_TOKEN }}

				        run: |

				          pr=${{ github.event.number }}

				          repo=${{ github.repository }}

				          pr-porting-checks.sh "$pr" "$repo"

									

.github/workflows/run-cri-containerd-tests.yaml (58 lines changed)

				@@ -1,4 +1,8 @@

				name: CI | Run cri-containerd tests

				permissions:

				  contents: read

				on:

				  workflow_call:

				    inputs:

				@@ -8,35 +12,65 @@ on:

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      runner:

				        description: The runner to execute the workflow on.

				        required: true

				        type: string

				      arch:

				        description: The arch of the tarball.

				        required: true

				        type: string

				      containerd_version:

				        description: The version of containerd for testing.

				        required: true

				        type: string

				      vmm:

				        description: The kata hypervisor for testing.

				        required: true

				        type: string

				jobs:

				  run-cri-containerd:

				    name: run-cri-containerd-${{ inputs.arch }} (${{ inputs.containerd_version }}, ${{ inputs.vmm }})

				    strategy:

				      fail-fast: true

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'qemu']

				    runs-on: garm-ubuntu-2204

				      fail-fast: false

				    runs-on: ${{ inputs.runner }}

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      CONTAINERD_VERSION: ${{ inputs.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KATA_HYPERVISOR: ${{ inputs.vmm }}

				    steps:

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        timeout-minutes: 15

				        run: bash tests/integration/cri-containerd/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				      - name: get-kata-tarball for ${{ inputs.arch }}

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          name: kata-static-tarball-${{ inputs.arch }}${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/cri-containerd/gha-run.sh install-kata kata-artifacts

				      - name: Run cri-containerd tests

				      - name: Run cri-containerd tests for ${{ inputs.arch }}

				        timeout-minutes: 10

				        run: bash tests/integration/cri-containerd/gha-run.sh run

									

.github/workflows/run-k8s-tests-on-aks.yaml (93 lines changed)

				@@ -2,6 +2,9 @@ name: CI | Run kubernetes tests on AKS

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				@@ -17,6 +20,23 @@ on:

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				       required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  run-k8s-tests:

				@@ -29,10 +49,29 @@ jobs:

				          - clh

				          - dragonball

				          - qemu

				          - qemu-runtime-rs

				          - stratovirt

				          - cloud-hypervisor

				        instance-type:

				          - small

				          - normal

				        include:

				          - host_os: cbl-mariner

				            vmm: clh

				    runs-on: ubuntu-latest

				            instance-type: small

				            genpolicy-pull-method: oci-distribution

				            auto-generate-policy: yes

				          - host_os: cbl-mariner

				            vmm: clh

				            instance-type: small

				            genpolicy-pull-method: containerd

				            auto-generate-policy: yes

				          - host_os: cbl-mariner

				            vmm: clh

				            instance-type: normal

				            auto-generate-policy: yes

				    runs-on: ubuntu-22.04

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				@@ -40,31 +79,61 @@ jobs:

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: ${{ matrix.host_os }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				      USING_NFD: "false"

				      K8S_TEST_HOST_TYPE: ${{ matrix.instance-type }}

				      GENPOLICY_PULL_METHOD: ${{ matrix.genpolicy-pull-method }}

				      AUTO_GENERATE_POLICY: ${{ matrix.auto-generate-policy }}

				    steps:

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts

				      - name: Download Azure CLI

				        run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Log into the Azure account

				        run: bash tests/integration/kubernetes/gha-run.sh login-azure

				        env:

				          AZ_APPID: ${{ secrets.AZ_APPID }}

				          AZ_PASSWORD: ${{ secrets.AZ_PASSWORD }}

				          AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh create-cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        run: bash tests/integration/kubernetes/gha-run.sh install-kubectl

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				@@ -72,7 +141,7 @@ jobs:

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				      - name: Run tests

				        timeout-minutes: 60

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests
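
The `Create AKS cluster` step in this workflow now wraps its command in `nick-fields/retry` with `max_attempts: 20` and `retry_wait_seconds: 10`. For running the same command outside GitHub Actions, the retry behaviour can be approximated in plain shell. This is a sketch under stated assumptions: the `retry` function is hypothetical, and it does not model the per-attempt limit set by `timeout_minutes: 15`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): run a command until it
# succeeds or the attempt budget is exhausted.
#   $1: maximum number of attempts
#   $2: seconds to wait between failed attempts
#   remaining arguments: the command to run
retry() {
    max_attempts=$1
    wait_seconds=$2
    shift 2
    attempt=1
    while :; do
        # Success ends the loop immediately.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$wait_seconds"
    done
}

# Usage mirroring the workflow values (not executed here):
#   retry 20 10 bash tests/integration/kubernetes/gha-run.sh create-cluster
```

The function returns the failure only after the last attempt, so a transient cluster-creation error does not fail the job on its own.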

									

.github/workflows/run-k8s-tests-on-amd64.yaml (new file, 115 lines changed)

				@@ -0,0 +1,115 @@

				name: CI | Run kubernetes tests on amd64

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-k8s-tests-amd64:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - clh #cloud-hypervisor

				          - dragonball

				          - fc #firecracker

				          - qemu

				          - cloud-hypervisor

				        container_runtime:

				          - containerd

				        snapshotter:

				          - devmapper

				        k8s:

				          - k3s

				        include:

				          - vmm: qemu

				            container_runtime: crio

				            snapshotter: ""

				            k8s: k0s

				    runs-on: ubuntu-22.04

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      KUBERNETES_EXTRA_PARAMS: ${{ matrix.container_runtime == 'crio' && '--cri-socket remote:unix:///var/run/crio/crio.sock --kubelet-extra-args --cgroup-driver="systemd"' || '' }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      USING_NFD: "false"

				      K8S_TEST_HOST_TYPE: all

				      CONTAINER_RUNTIME: ${{ matrix.container_runtime }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Configure CRI-O

				        if: matrix.container_runtime == 'crio'

				        run: bash tests/integration/kubernetes/gha-run.sh setup-crio

				      - name: Deploy ${{ matrix.k8s }}

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-k8s

				        env:

				          CONTAINER_RUNTIME: ${{ matrix.container_runtime }}

				      - name: Configure the ${{ matrix.snapshotter }} snapshotter

				        if: matrix.snapshotter != ''

				        run: bash tests/integration/kubernetes/gha-run.sh configure-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Collect artifacts ${{ matrix.vmm }}

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: k8s-tests-${{ matrix.vmm }}-${{ matrix.snapshotter }}-${{ matrix.k8s }}-${{ inputs.tag }}

				          path: /tmp/artifacts

				          retention-days: 1

				      - name: Delete kata-deploy

				        if: always()

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup
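
Note on the `KUBERNETES_EXTRA_PARAMS` value in the amd64 workflow above: it relies on the `cond && a || b` idiom of GitHub Actions expressions, which only behaves like a ternary when `a` is truthy. The sketch below is an illustration, not the GitHub evaluator: it assumes Python's `and`/`or` are a fair stand-in because they also return one of their operands and treat the empty string as falsy. The function names and the shortened `CRIO_PARAMS` string are hypothetical.

```python
# Stand-in for `cond && a || b` from GitHub Actions expressions.
# Assumption: Python's `and`/`or` mirror the relevant behaviour
# (operands are returned as-is, and '' counts as falsy).
CRIO_PARAMS = "--cri-socket remote:unix:///var/run/crio/crio.sock"  # shortened for illustration


def empty_in_the_middle(container_runtime: str) -> str:
    # Shape: container_runtime != 'crio' && '' || CRIO_PARAMS
    # The empty middle operand is falsy, so evaluation always falls through to `||`.
    return (container_runtime != "crio") and "" or CRIO_PARAMS


def truthy_in_the_middle(container_runtime: str) -> str:
    # Shape: container_runtime == 'crio' && CRIO_PARAMS || ''
    # The middle operand is truthy, so the idiom acts as a real ternary.
    return (container_runtime == "crio") and CRIO_PARAMS or ""


for runtime in ("containerd", "crio"):
    print(runtime, repr(empty_in_the_middle(runtime)), repr(truthy_in_the_middle(runtime)))
```

Under these assumptions the first form yields the CRI-O parameters for both runtimes, while the second yields them only for `crio` and an empty string otherwise.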

									
.github/workflows/run-k8s-tests-on-arm64.yaml

				@@ -0,0 +1,87 @@

				name: CI | Run kubernetes tests on arm64

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-k8s-tests-on-arm64:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - kubeadm

				    runs-on: arm64-k8s

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      USING_NFD: "false"

				      K8S_TEST_HOST_TYPE: all

				      TARGET_ARCH: "aarch64"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Collect artifacts ${{ matrix.vmm }}

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh collect-artifacts

				        continue-on-error: true

				      - name: Archive artifacts ${{ matrix.vmm }}

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: k8s-tests-${{ matrix.vmm }}-${{ matrix.k8s }}-${{ inputs.tag }}

				          path: /tmp/artifacts

				          retention-days: 1

				      - name: Delete kata-deploy

				        if: always()

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup

									
.github/workflows/run-k8s-tests-on-ppc64le.yaml

				@@ -0,0 +1,81 @@

				name: CI | Run kubernetes tests on Power(ppc64le)

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-k8s-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - kubeadm

				    runs-on: k8s-ppc64le

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      USING_NFD: "false"

				      TARGET_ARCH: "ppc64le"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install golang

				        run: |

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Prepare the runner for k8s cluster creation

				        run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"

				      - name: Create k8s cluster using kubeadm

				        run: bash "${HOME}/scripts/k8s_cluster_create.sh"

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-kubeadm

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete cluster and post cleanup actions

				        run: bash "${HOME}/scripts/k8s_cluster_cleanup.sh"

									
.github/workflows/run-k8s-tests-on-sev.yaml

				@@ -1,48 +0,0 @@

				name: CI | Run kubernetes tests on SEV

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  run-k8s-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-sev

				    runs-on: sev

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBECONFIG: /home/kata/.kube/config

				      USING_NFD: "false"

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-sev

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-sev

									
.github/workflows/run-k8s-tests-on-snp.yaml

				@@ -1,48 +0,0 @@

				name: CI | Run kubernetes tests on SEV-SNP

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  run-k8s-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-snp

				    runs-on: sev-snp

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBECONFIG: /home/kata/.kube/config

				      USING_NFD: "false"

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-snp

									
.github/workflows/run-k8s-tests-on-tdx.yaml

				@@ -1,47 +0,0 @@

				name: CI | Run kubernetes tests on TDX

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  run-k8s-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-tdx

				    runs-on: tdx

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      USING_NFD: "true"

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx

				      - name: Run tests

				        timeout-minutes: 30

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-tdx

									
.github/workflows/run-k8s-tests-on-zvsi.yaml

				@@ -0,0 +1,144 @@

				name: CI | Run kubernetes tests on IBM Cloud Z virtual server instance (zVSI)

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				permissions:

				  contents: read

				jobs:

				  run-k8s-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        snapshotter:

				          - overlayfs

				          - devmapper

				          - nydus

				        vmm:

				          - qemu

				          - qemu-runtime-rs

				          - qemu-coco-dev

				        k8s:

				          - kubeadm

				        include:

				          - snapshotter: devmapper

				            pull-type: default

				            using-nfd: true

				            deploy-cmd: configure-snapshotter

				          - snapshotter: nydus

				            pull-type: guest-pull

				            using-nfd: false

				            deploy-cmd: deploy-snapshotter

				        exclude:

				          - snapshotter: overlayfs

				            vmm: qemu

				          - snapshotter: overlayfs

				            vmm: qemu-coco-dev

				          - snapshotter: devmapper

				            vmm: qemu-runtime-rs

				          - snapshotter: devmapper

				            vmm: qemu-coco-dev

				          - snapshotter: nydus

				            vmm: qemu

				          - snapshotter: nydus

				            vmm: qemu-runtime-rs

				    runs-on: s390x-large

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: "ubuntu"

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      PULL_TYPE: ${{ matrix.pull-type }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      USING_NFD: ${{ matrix.using-nfd }}

				      TARGET_ARCH: "s390x"

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Set SNAPSHOTTER to empty if overlayfs

				        run: echo "SNAPSHOTTER=" >> "$GITHUB_ENV"

				        if: ${{ matrix.snapshotter == 'overlayfs' }}

				      - name: Set KBS and KBS_INGRESS if qemu-coco-dev

				        run: |

				          echo "KBS=true" >> "$GITHUB_ENV"

				          echo "KBS_INGRESS=nodeport" >> "$GITHUB_ENV"

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      # qemu-runtime-rs only works with overlayfs

				      # See: https://github.com/kata-containers/kata-containers/issues/10066

				      - name: Configure the ${{ matrix.snapshotter }} snapshotter

				        run: bash tests/integration/kubernetes/gha-run.sh ${{ matrix.deploy-cmd }}

				        if: ${{ matrix.snapshotter != 'overlayfs' }}

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-zvsi

				      - name: Uninstall previous `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				        if: ${{ matrix.vmm == 'qemu-coco-dev' }}

				      - name: Run tests

				        timeout-minutes: 60

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-zvsi

				      - name: Delete CoCo KBS

				        if: always()

				        run: |

				          if [ "${KBS}" == "true" ]; then

				            bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

				          fi
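
Note on the zVSI matrix above: it combines three axes with `include` and `exclude`, so the set of jobs that actually runs is not obvious at a glance. The sketch below approximates the documented expansion order (excludes are applied first, then includes extend the surviving combinations). `expand_matrix` is a hypothetical helper written for this illustration, not part of any GitHub tooling, and it only models the cases that occur in this workflow.

```python
from itertools import product


def expand_matrix(axes, include=(), exclude=()):
    """Approximate strategy.matrix expansion: exclude first, then include."""
    keys = list(axes)
    combos = [dict(zip(keys, values)) for values in product(*axes.values())]

    # An exclude entry removes every combination that matches all of its keys.
    combos = [
        c for c in combos
        if not any(all(c.get(k) == v for k, v in ex.items()) for ex in exclude)
    ]

    base = [dict(c) for c in combos]  # original axis values, never overwritten
    extra = []
    for inc in include:
        matched = False
        for combo, original in zip(combos, base):
            # Extend a combination only if the entry agrees with it on every axis key.
            if all(original[k] == v for k, v in inc.items() if k in original):
                combo.update(inc)
                matched = True
        if not matched:
            extra.append(dict(inc))  # otherwise the entry becomes a job of its own
    return combos + extra


jobs = expand_matrix(
    axes={
        "snapshotter": ["overlayfs", "devmapper", "nydus"],
        "vmm": ["qemu", "qemu-runtime-rs", "qemu-coco-dev"],
        "k8s": ["kubeadm"],
    },
    include=[
        {"snapshotter": "devmapper", "pull-type": "default",
         "using-nfd": True, "deploy-cmd": "configure-snapshotter"},
        {"snapshotter": "nydus", "pull-type": "guest-pull",
         "using-nfd": False, "deploy-cmd": "deploy-snapshotter"},
    ],
    exclude=[
        {"snapshotter": "overlayfs", "vmm": "qemu"},
        {"snapshotter": "overlayfs", "vmm": "qemu-coco-dev"},
        {"snapshotter": "devmapper", "vmm": "qemu-runtime-rs"},
        {"snapshotter": "devmapper", "vmm": "qemu-coco-dev"},
        {"snapshotter": "nydus", "vmm": "qemu"},
        {"snapshotter": "nydus", "vmm": "qemu-runtime-rs"},
    ],
)

for job in jobs:
    print(job)
```

Under these assumptions three jobs remain: overlayfs with qemu-runtime-rs, devmapper with qemu, and nydus with qemu-coco-dev. The overlayfs job receives none of the `include` keys, so `matrix.pull-type`, `matrix.using-nfd`, and `matrix.deploy-cmd` would be empty for it.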

									
.github/workflows/run-kata-coco-stability-tests.yaml

				@@ -0,0 +1,146 @@

				name: CI | Run Kata CoCo k8s Stability Tests

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				      tarball-suffix:

				        required: false

				        type: string

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  # Generate jobs for testing CoCo on non-TEE environments

				  run-stability-k8s-tests-coco-nontee:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-coco-dev

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				    runs-on: ubuntu-22.04

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      # Some tests rely on that variable to run (or not)

				      KBS: "true"

				      # Set the KBS ingress handler (empty string disables handling)

				      KBS_INGRESS: "aks"

				      KUBERNETES: "vanilla"

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      USING_NFD: "false"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts

				      - name: Download Azure CLI

				        run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				      - name: Deploy Snapshotter

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Run stability tests

				        timeout-minutes: 300

				        run: bash tests/stability/gha-stability-run.sh run-tests

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

									
.github/workflows/run-kata-coco-tests.yaml

				@@ -0,0 +1,331 @@

				name: CI | Run kata coco tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AUTHENTICATED_IMAGE_PASSWORD:

				        required: true

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				      ITA_KEY:

				        required: true

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  run-k8s-tests-on-tdx:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-tdx

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				    runs-on: tdx

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				      USING_NFD: "true"

				      KBS: "true"

				      K8S_TEST_HOST_TYPE: "baremetal"

				      KBS_INGRESS: "nodeport"

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      ITA_KEY: ${{ secrets.ITA_KEY }}

				      AUTO_GENERATE_POLICY: "yes"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy Snapshotter

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-tdx

				      - name: Uninstall previous `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 100

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-tdx

				      - name: Delete Snapshotter

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter

				      - name: Delete CoCo KBS

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

				      - name: Delete CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver

				  run-k8s-tests-sev-snp:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-snp

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				    runs-on: sev-snp

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBECONFIG: /home/kata/.kube/config

				      KUBERNETES: "vanilla"

				      USING_NFD: "false"

				      KBS: "true"

				      KBS_INGRESS: "nodeport"

				      K8S_TEST_HOST_TYPE: "baremetal"

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      AUTO_GENERATE_POLICY: "yes"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy Snapshotter

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-snp

				      - name: Uninstall previous `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh uninstall-kbs-client

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 50

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Delete kata-deploy

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-snp

				      - name: Delete Snapshotter

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-snapshotter

				      - name: Delete CoCo KBS

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-coco-kbs

				      - name: Delete CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh delete-csi-driver

				  # Generate jobs for testing CoCo on non-TEE environments

				  run-k8s-tests-coco-nontee:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu-coco-dev

				        snapshotter:

				          - nydus

				        pull-type:

				          - guest-pull

				    runs-on: ubuntu-22.04

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      # Some tests rely on that variable to run (or not)

				      KBS: "true"

				      # Set the KBS ingress handler (empty string disables handling)

				      KBS_INGRESS: "aks"

				      KUBERNETES: "vanilla"

				      PULL_TYPE: ${{ matrix.pull-type }}

				      AUTHENTICATED_IMAGE_USER: ${{ vars.AUTHENTICATED_IMAGE_USER }}

				      AUTHENTICATED_IMAGE_PASSWORD: ${{ secrets.AUTHENTICATED_IMAGE_PASSWORD }}

				      SNAPSHOTTER: ${{ matrix.snapshotter }}

				      # Caution: the current ingress controller used to expose the KBS service

				      # requires many vCPUs, leaving only a few for the tests. Depending on the

				      # host type chosen, it will result in the creation of a cluster with

				      # insufficient resources.

				      K8S_TEST_HOST_TYPE: "all"

				      USING_NFD: "false"

				      AUTO_GENERATE_POLICY: "yes"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/kubernetes/gha-run.sh install-kata-tools kata-artifacts

				      - name: Download Azure CLI

				        run: bash tests/integration/kubernetes/gha-run.sh install-azure-cli

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/integration/kubernetes/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/integration/kubernetes/gha-run.sh get-cluster-credentials

				      - name: Deploy Snapshotter

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-snapshotter

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-aks

				      - name: Deploy CoCo KBS

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-coco-kbs

				      - name: Install `kbs-client`

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh install-kbs-client

				      - name: Deploy CSI driver

				        timeout-minutes: 5

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-csi-driver

				      - name: Run tests

				        timeout-minutes: 80

				        run: bash tests/integration/kubernetes/gha-run.sh run-tests

				      - name: Report tests

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh report-tests

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh delete-cluster

									
.github/workflows/run-kata-deploy-tests-on-aks.yaml
												
				@@ -0,0 +1,110 @@

				name: CI | Run kata-deploy tests on AKS

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				    secrets:

				      AZ_APPID:

				        required: true

				      AZ_TENANT_ID:

				        required: true

				      AZ_SUBSCRIPTION_ID:

				        required: true

				permissions:

				  contents: read

				  id-token: write

				jobs:

				  run-kata-deploy-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        host_os:

				          - ubuntu

				        vmm:

				          - clh

				          - dragonball

				          - qemu

				          - qemu-runtime-rs

				        include:

				          - host_os: cbl-mariner

				            vmm: clh

				    runs-on: ubuntu-22.04

				    environment: ci

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HOST_OS: ${{ matrix.host_os }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: "vanilla"

				      USING_NFD: "false"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Download Azure CLI

				        run: bash tests/functional/kata-deploy/gha-run.sh install-azure-cli

				      - name: Log into the Azure account

				        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2.3.0

				        with:

				          client-id: ${{ secrets.AZ_APPID }}

				          tenant-id: ${{ secrets.AZ_TENANT_ID }}

				          subscription-id: ${{ secrets.AZ_SUBSCRIPTION_ID }}

				      - name: Create AKS cluster

				        uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # v3.0.2

				        with:

				          timeout_minutes: 15

				          max_attempts: 20

				          retry_on: error

				          retry_wait_seconds: 10

				          command: bash tests/integration/kubernetes/gha-run.sh create-cluster

				      - name: Install `bats`

				        run: bash tests/functional/kata-deploy/gha-run.sh install-bats

				      - name: Install `kubectl`

				        uses: azure/setup-kubectl@776406bce94f63e41d621b960d78ee25c8b76ede # v4.0.1

				        with:

				          version: 'latest'

				      - name: Download credentials for the Kubernetes CLI to use them

				        run: bash tests/functional/kata-deploy/gha-run.sh get-cluster-credentials

				      - name: Run tests

				        run: bash tests/functional/kata-deploy/gha-run.sh run-tests

				      - name: Delete AKS cluster

				        if: always()

				        run: bash tests/functional/kata-deploy/gha-run.sh delete-cluster
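
For context, a calling workflow might invoke this reusable workflow roughly as sketched below. The input and secret names come from the `workflow_call` block above; the job name and the registry, repo, tag and PR number values are illustrative placeholders, not taken from the repository.

```yaml
# Hypothetical caller job; only the input and secret names are taken from
# the workflow_call definition above.
jobs:
  kata-deploy-tests-on-aks:              # assumed job name
    uses: ./.github/workflows/run-kata-deploy-tests-on-aks.yaml
    permissions:
      contents: read
      id-token: write                    # the called workflow requests this permission
    with:
      registry: ghcr.io                  # placeholder value
      repo: example-org/kata-deploy-ci   # placeholder value
      tag: pr-1234-amd64                 # placeholder value
      pr-number: "1234"                  # declared as type string, so quoted
      commit-hash: ${{ github.event.pull_request.head.sha }}
      target-branch: ${{ github.event.pull_request.base.ref }}
    secrets:
      AZ_APPID: ${{ secrets.AZ_APPID }}
      AZ_TENANT_ID: ${{ secrets.AZ_TENANT_ID }}
      AZ_SUBSCRIPTION_ID: ${{ secrets.AZ_SUBSCRIPTION_ID }}
```

A called workflow cannot be granted more permissions than its caller has, which is why the sketch repeats `id-token: write` on the calling job.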

									
.github/workflows/run-kata-deploy-tests.yaml
												
				@@ -0,0 +1,69 @@

				name: CI | Run kata-deploy tests

				on:

				  workflow_call:

				    inputs:

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-kata-deploy-tests:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        k8s:

				          - k0s

				          - k3s

				          - rke2

				          - microk8s

				    runs-on: ubuntu-22.04

				    env:

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      KUBERNETES: ${{ matrix.k8s }}

				      USING_NFD: "false"

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Deploy ${{ matrix.k8s }}

				        run:  bash tests/functional/kata-deploy/gha-run.sh deploy-k8s

				      - name: Install `bats`

				        run: bash tests/functional/kata-deploy/gha-run.sh install-bats

				      - name: Run tests

				        run: bash tests/functional/kata-deploy/gha-run.sh run-tests

									
.github/workflows/run-kata-monitor-tests.yaml
												
				@@ -0,0 +1,70 @@

				name: CI | Run kata-monitor tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-monitor:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm:

				          - qemu

				        container_engine:

				          - crio

				          - containerd

				        # TODO: enable when https://github.com/kata-containers/kata-containers/issues/9853 is fixed

				        #include:

				        #  - container_engine: containerd

				        #    containerd_version: lts

				        exclude:

				          # TODO: enable with containerd when https://github.com/kata-containers/kata-containers/issues/9761 is fixed

				          - container_engine: containerd

				            vmm: qemu

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINER_ENGINE: ${{ matrix.container_engine }}

				      #CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/functional/kata-monitor/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/functional/kata-monitor/gha-run.sh install-kata kata-artifacts

				      - name: Run kata-monitor tests

				        run: bash tests/functional/kata-monitor/gha-run.sh run
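
The `exclude` entry in the matrix above removes the only containerd combination, so under the usual matrix semantics a single job (crio with qemu) is generated. A minimal sketch of an equivalent explicit matrix follows, shown only to illustrate the expansion and not as a proposed change:

```yaml
# Illustrative equivalent of the matrix above once `exclude` is applied:
# {qemu} x {crio, containerd} minus {containerd, qemu} leaves one combination.
strategy:
  fail-fast: false
  matrix:
    include:
      - vmm: qemu
        container_engine: crio
```

Keeping the original `exclude` form presumably makes it easier to re-enable containerd once the referenced issue is fixed, since only the `exclude` entry has to be dropped.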

									
.github/workflows/run-metrics.yaml
												
				@@ -2,17 +2,36 @@ name: CI | Run test metrics

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				      registry:

				        required: true

				        type: string

				      repo:

				        required: true

				        type: string

				      tag:

				        required: true

				        type: string

				      pr-number:

				        required: true

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-metrics:

				    strategy:

				      fail-fast: true

				      # We can set this to true whenever we're 100% sure that

				      # none of the tests are flaky; otherwise we'll fail

				      # all the tests due to a single flaky instance.

				      fail-fast: false

				      matrix:

				        vmm: ['clh', 'qemu']

				      max-parallel: 1

				@@ -20,45 +39,91 @@ jobs:

				    env:

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				      DOCKER_REGISTRY: ${{ inputs.registry }}

				      DOCKER_REPO: ${{ inputs.repo }}

				      DOCKER_TAG: ${{ inputs.tag }}

				      GH_PR_NUMBER: ${{ inputs.pr-number }}

				      K8S_TEST_HOST_TYPE: "baremetal"

				      USING_NFD: "false"

				      KUBERNETES: kubeadm

				    steps:

				      - uses: actions/checkout@v3

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install kata

				        run: bash tests/metrics/gha-run.sh install-kata kata-artifacts

				      - name: Deploy Kata

				        timeout-minutes: 10

				        run: bash tests/integration/kubernetes/gha-run.sh deploy-kata-kubeadm

				      - name: Install check metrics

				        run: bash tests/metrics/gha-run.sh install-checkmetrics

				      - name: enabling the hypervisor

				        run: bash tests/metrics/gha-run.sh enabling-hypervisor

				      - name: run launch times test

				        timeout-minutes: 15

				        continue-on-error: true

				        run: bash tests/metrics/gha-run.sh run-test-launchtimes

				      - name: run memory foot print test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-memory-usage

				      - name: run memory usage inside container test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-memory-usage-inside-container

				      - name: run blogbench test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-blogbench

				      - name: run tensorflow test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-tensorflow

				      - name: run fio test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-fio

				      - name: run iperf test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-iperf

				      - name: run latency test

				        timeout-minutes: 15

				        continue-on-error: true

				        run:  bash tests/metrics/gha-run.sh run-test-latency

				      - name: check metrics

				        run:  bash tests/metrics/gha-run.sh check-metrics

				      - name: make metrics tarball ${{ matrix.vmm }}

				        run: bash tests/metrics/gha-run.sh make-tarball-results

				      - name: archive metrics results ${{ matrix.vmm }}

				        uses: actions/upload-artifact@v3

				        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2

				        with:

				          name: metrics-artifacts-${{ matrix.vmm }}

				          path: results-${{ matrix.vmm }}.tar.gz

				          retention-days: 1

				          if-no-files-found: error

				      - name: Delete kata-deploy

				        timeout-minutes: 10

				        if: always()

				        run: bash tests/integration/kubernetes/gha-run.sh cleanup-kubeadm
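
Each metrics test step above sets `continue-on-error: true`, so a failing test does not stop the job and the later `check metrics` step still runs. If one wanted to surface which tests actually failed, a step's `outcome` (its result before `continue-on-error` is applied) could be inspected. A minimal sketch, assuming a hypothetical step id `launchtimes` and a hypothetical reporting step, neither of which is in the workflow:

```yaml
      - name: run launch times test
        id: launchtimes                    # hypothetical id, not in the original workflow
        timeout-minutes: 15
        continue-on-error: true
        run: bash tests/metrics/gha-run.sh run-test-launchtimes

      # `outcome` is the raw result; `conclusion` would read 'success' here
      # because continue-on-error masks the failure.
      - name: report launch times result   # hypothetical reporting step
        if: steps.launchtimes.outcome == 'failure'
        run: echo "::warning::launch times test failed"
```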

									
.github/workflows/run-nydus-tests.yaml
											
				@@ -1,42 +0,0 @@

				name: CI | Run nydus tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  run-nydus:

				    strategy:

				      fail-fast: true

				      matrix:

				        containerd_version: ['lts', 'active']

				        vmm: ['clh', 'qemu', 'dragonball']

				    runs-on: garm-ubuntu-2204

				    env:

				      CONTAINERD_VERSION: ${{ matrix.containerd_version }}

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: Install dependencies

				        run: bash tests/integration/nydus/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/nydus/gha-run.sh install-kata kata-artifacts

				      - name: Run nydus tests

				        run: bash tests/integration/nydus/gha-run.sh run

									
.github/workflows/run-runk-tests.yaml
												
				@@ -0,0 +1,54 @@

				name: CI | Run runk tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				      target-branch:

				        required: false

				        type: string

				        default: ""

				permissions:

				  contents: read

				jobs:

				  run-runk:

				    # Skip runk tests as we have no maintainers. TODO: Decide when to remove altogether

				    if: false

				    runs-on: ubuntu-22.04

				    env:

				      CONTAINERD_VERSION: lts

				    steps:

				      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          ref: ${{ inputs.commit-hash }}

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Rebase atop of the latest target branch

				        run: |

				          ./tests/git-helper.sh "rebase-atop-of-the-latest-target-branch"

				        env:

				          TARGET_BRANCH: ${{ inputs.target-branch }}

				      - name: Install dependencies

				        run: bash tests/integration/runk/gha-run.sh install-dependencies

				        env:

				          GH_TOKEN: ${{ github.token }}

				      - name: get-kata-tarball

				        uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4.3.0

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Install kata

				        run: bash tests/integration/runk/gha-run.sh install-kata kata-artifacts

				      - name: Run runk tests

				        run: bash tests/integration/runk/gha-run.sh run

									
.github/workflows/run-vfio-tests.yaml
											
				@@ -1,37 +0,0 @@

				name: CI | Run vfio tests

				on:

				  workflow_call:

				    inputs:

				      tarball-suffix:

				        required: false

				        type: string

				      commit-hash:

				        required: false

				        type: string

				jobs:

				  run-vfio:

				    strategy:

				      fail-fast: false

				      matrix:

				        vmm: ['clh', 'qemu']

				    runs-on: garm-ubuntu-2204

				    env:

				      GOPATH: ${{ github.workspace }}

				      KATA_HYPERVISOR: ${{ matrix.vmm }}

				    steps:

				      - uses: actions/checkout@v3

				        with:

				          ref: ${{ inputs.commit-hash }}

				      - name: Install dependencies

				        run: bash tests/functional/vfio/gha-run.sh install-dependencies

				      - name: get-kata-tarball

				        uses: actions/download-artifact@v3

				        with:

				          name: kata-static-tarball-amd64${{ inputs.tarball-suffix }}

				          path: kata-artifacts

				      - name: Run vfio tests

				        run: bash tests/functional/vfio/gha-run.sh run

									
.github/workflows/scorecard.yaml
												
				@@ -0,0 +1,60 @@

				# This workflow uses actions that are not certified by GitHub. They are provided

				# by a third-party and are governed by separate terms of service, privacy

				# policy, and support documentation.

				name: Scorecard supply-chain security

				on:

				  # For Branch-Protection check. Only the default branch is supported. See

				  # https://github.com/ossf/scorecard/blob/main/docs/checks.md#branch-protection

				  branch_protection_rule:

				  push:

				    branches: [ "main" ]

				  workflow_dispatch:

				permissions: {}

				jobs:

				  analysis:

				    name: Scorecard analysis

				    runs-on: ubuntu-latest

				    # `publish_results: true` only works when run from the default branch. conditional can be removed if disabled.

				    if: github.event.repository.default_branch == github.ref_name || github.event_name == 'pull_request'

				    permissions:

				      # Needed to upload the results to code-scanning dashboard.

				      security-events: write

				      # Needed to publish results and get a badge (see publish_results below).

				      id-token: write

				    steps:

				      - name: "Checkout code"

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          persist-credentials: false

				      - name: "Run analysis"

				        uses: ossf/scorecard-action@f49aabe0b5af0936a0987cfb85d86b75731b0186 # v2.4.1

				        with:

				          results_file: results.sarif

				          results_format: sarif

				          # Public repositories:

				          #   - Publish results to OpenSSF REST API for easy access by consumers

				          #   - Allows the repository to include the Scorecard badge.

				          #   - See https://github.com/ossf/scorecard-action#publishing-results.

				          publish_results: true

				      # Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF

				      # format to the repository Actions tab.

				      - name: "Upload artifact"

				        uses: actions/upload-artifact@4cec3d8aa04e39d1a68397de0c4cd6fb9dce8ec1 # v4.6.1

				        with:

				          name: SARIF file

				          path: results.sarif

				          retention-days: 5

				      # Upload the results to GitHub's code scanning dashboard (optional).

				      # Commenting out will disable upload of results to your repo's Code Scanning dashboard

				      - name: "Upload to code-scanning"

				        uses: github/codeql-action/upload-sarif@v3

				        with:

				          sarif_file: results.sarif

									
.github/workflows/shellcheck.yaml
												
				@@ -0,0 +1,32 @@

				# https://github.com/marketplace/actions/shellcheck

				name: Check shell scripts

				on:

				  workflow_dispatch:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  shellcheck:

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run ShellCheck

				        uses: ludeeus/action-shellcheck@00b27aa7cb85167568cb48a3838b75f4265f2bca # master (2024-06-20)

				        with:

				          ignore_paths: "**/vendor/**"

									
.github/workflows/shellcheck_required.yaml
												
				@@ -0,0 +1,35 @@

				# https://github.com/marketplace/actions/shellcheck

				name: Shellcheck required

				on:

				  workflow_dispatch:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  shellcheck-required:

				    runs-on: ubuntu-24.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run ShellCheck

				        uses: ludeeus/action-shellcheck@00b27aa7cb85167568cb48a3838b75f4265f2bca # master (2024-06-20)

				        with:

				          severity: error

				          ignore_paths: "**/vendor/**"

									
.github/workflows/stale.yaml
												
				@@ -0,0 +1,20 @@

				name: 'Automatically close stale PRs'

				on:

				  schedule:

				    - cron: '0 0 * * *'

				  workflow_dispatch:

				permissions:

				  contents: read

				jobs:

				  stale:

				    runs-on: ubuntu-22.04

				    steps:

				      - uses: actions/stale@5bef64f19d7facfb25b37b414482c7164d639639 # v9.1.0

				        with:

				          stale-pr-message: 'This PR has been open with no activity for 180 days. Comment on the PR, otherwise it will be closed in 7 days.'

				          days-before-pr-stale: 180

				          days-before-pr-close: 7

				          days-before-issue-stale: -1

				          days-before-issue-close: -1

									
.github/workflows/static-checks-dragonball.yaml
											
				@@ -1,37 +0,0 @@

				on:

				  pull_request:

				    types:

				      - opened

				      - edited

				      - reopened

				      - synchronize

				    paths-ignore: [ '**.md', '**.png', '**.jpg', '**.jpeg', '**.svg', '/docs/**' ]

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				name: Static checks dragonball

				jobs:

				  test-dragonball:

				    runs-on: dragonball

				    env:

				      RUST_BACKTRACE: "1"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Set env

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          echo "GOPATH=${{ github.workspace }}" >> $GITHUB_ENV

				      - name: Install Rust

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          ./ci/install_rust.sh

				          echo PATH="$HOME/.cargo/bin:$PATH" >> $GITHUB_ENV

				      - name: Run Unit Test

				        if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				        run: |

				          cd src/dragonball

				          cargo version

				          rustc --version

				          sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test

									
.github/workflows/static-checks-self-hosted.yaml
												
				@@ -0,0 +1,49 @@

				on:

				  pull_request:

				    types:

				      - opened

				      - synchronize

				      - reopened

				      - labeled # a workflow runs only when the 'ok-to-test' label is added

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				name: Static checks self-hosted

				jobs:

				  skipper:

				    if: ${{ contains(github.event.pull_request.labels.*.name, 'ok-to-test') }}

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  build-checks:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        instance:

				          - "ubuntu-22.04-arm"

				          - "s390x"

				          - "ppc64le"

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ${{ matrix.instance }}

				  build-checks-preview:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        instance:

				          - "riscv-builder"

				    uses: ./.github/workflows/build-checks-preview-riscv64.yaml

				    with:

				      instance: ${{ matrix.instance }}

									
										226

.github/workflows/static-checks.yaml
									
												
				@@ -5,6 +5,10 @@ on:

				      - edited

				      - reopened

				      - synchronize

				  workflow_dispatch:

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				@@ -12,74 +16,170 @@ concurrency:

				name: Static checks

				jobs:

				  static-checks:

				    runs-on: ubuntu-20.04

				  skipper:

				    uses: ./.github/workflows/gatekeeper-skipper.yaml

				    with:

				      commit-hash: ${{ github.event.pull_request.head.sha }}

				      target-branch: ${{ github.event.pull_request.base.ref }}

				  check-kernel-config-version:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    runs-on: ubuntu-22.04

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Ensure the kernel config version has been updated

				        run: |

				          kernel_dir="tools/packaging/kernel/"

				          kernel_version_file="${kernel_dir}kata_config_version"

				          modified_files=$(git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD)

				          if git diff --name-only origin/"$GITHUB_BASE_REF"..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then

				            echo "Kernel directory has changed, checking if $kernel_version_file has been updated"

				            if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then

				              echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)

				            else

				              echo "Readme file changed, no need for kernel config version update."

				            fi

				            echo "Check passed"

				          fi

				  build-checks:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    uses: ./.github/workflows/build-checks.yaml

				    with:

				      instance: ubuntu-22.04

				  build-checks-depending-on-kvm:

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        component:

				          - runtime-rs

				        include:

				          - component: runtime-rs

				            command: "sudo -E env PATH=$PATH LIBC=gnu SUPPORT_VIRTUALIZATION=true make test"

				          - component: runtime-rs

				            component-path: src/dragonball

				    steps:

				      - name: Checkout the code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Install system deps

				        run: |

				          sudo apt-get update && sudo apt-get install -y build-essential musl-tools

				      - name: Install yq

				        run: |

				          sudo -E ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install rust

				        run: |

				          export PATH="$PATH:/usr/local/bin"

				          ./tests/install_rust.sh

				      - name: Running `${{ matrix.command }}` for ${{ matrix.component }}

				        run: |

				          export PATH="$PATH:${HOME}/.cargo/bin"

				          cd ${{ matrix.component-path }}

				          ${{ matrix.command }}

				        env:

				          RUST_BACKTRACE: "1"

				          RUST_LIB_BACKTRACE: "0"

				  static-checks:

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    strategy:

				      fail-fast: false

				      matrix:

				        cmd:

				          - "make vendor"

				          - "make static-checks"

				          - "make check"

				          - "make test"

				          - "sudo -E PATH=\"$PATH\" make test"

				    env:

				      RUST_BACKTRACE: "1"

				      target_branch: ${{ github.base_ref }}

				      GOPATH: ${{ github.workspace }}

				    permissions:

				      contents: read  # for checkout

				      packages: write # for push to ghcr.io

				    steps:

				    - name: Free disk space

				      run: |

				        sudo rm -rf /usr/share/dotnet

				        sudo rm -rf "$AGENT_TOOLSDIRECTORY"

				    - name: Checkout code

				      uses: actions/checkout@v3

				      with:

				        fetch-depth: 0

				        path: ./src/github.com/${{ github.repository }}

				    - name: Install Go

				      uses: actions/setup-go@v3

				      with:

				        go-version: 1.19.3

				    - name: Check kernel config version

				      run: |

				        cd "${{ github.workspace }}/src/github.com/${{ github.repository }}"

				        kernel_dir="tools/packaging/kernel/"

				        kernel_version_file="${kernel_dir}kata_config_version"

				        modified_files=$(git diff --name-only origin/main..HEAD)

				        if git diff --name-only origin/main..HEAD "${kernel_dir}" | grep "${kernel_dir}"; then

				          echo "Kernel directory has changed, checking if $kernel_version_file has been updated"

				          if echo "$modified_files" | grep -v "README.md" | grep "${kernel_dir}" >>"/dev/null"; then

				            echo "$modified_files" | grep "$kernel_version_file" >>/dev/null || ( echo "Please bump version in $kernel_version_file" && exit 1)

				          else

				            echo "Readme file changed, no need for kernel config version update."

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				          path: ./src/github.com/${{ github.repository }}

				      - name: Install yq

				        run: |

				          cd "${GOPATH}/src/github.com/${{ github.repository }}"

				          ./ci/install_yq.sh

				        env:

				          INSTALL_IN_GOPATH: false

				      - name: Install golang

				        run: |

				          cd "${GOPATH}/src/github.com/${{ github.repository }}"

				          ./tests/install_go.sh -f -p

				          echo "/usr/local/go/bin" >> "$GITHUB_PATH"

				      - name: Install system dependencies

				        run: |

				          sudo apt-get update && sudo apt-get -y install moreutils hunspell hunspell-en-gb hunspell-en-us pandoc

				      - name: Install open-policy-agent

				        run: |

				          cd "${GOPATH}/src/github.com/${{ github.repository }}"

				          ./tests/install_opa.sh

				      - name: Install regorus

				        env:

				          ARTEFACT_REPOSITORY: "${{ github.repository }}"

				          ARTEFACT_REGISTRY_USERNAME: "${{ github.actor }}"

				          ARTEFACT_REGISTRY_PASSWORD: "${{ secrets.GITHUB_TOKEN }}"

				        run: |

				          "${GOPATH}/src/github.com/${{ github.repository }}/tests/install_regorus.sh"

				      - name: Run check

				        run: |

				          export PATH="${PATH}:${GOPATH}/bin"

				          cd "${GOPATH}/src/github.com/${{ github.repository }}" && ${{ matrix.cmd }}

				  govulncheck:

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    uses: ./.github/workflows/govulncheck.yaml

				  codegen:

				    runs-on: ubuntu-22.04

				    needs: skipper

				    if: ${{ needs.skipper.outputs.skip_static != 'yes' }}

				    permissions:

				      contents: read  # for checkout

				    steps:

				      - name: Checkout code

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: generate

				        run: make -C src/agent generate-protocols

				      - name: check for diff

				        run: |

				          diff=$(git diff)

				          if [[ -z "${diff}" ]]; then

				            echo "No diff detected."

				            exit 0

				          fi

				          echo "Check passed"

				        fi

				    - name: Set PATH

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        echo "${{ github.workspace }}/bin" >> $GITHUB_PATH

				    - name: Setup

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/setup.sh

				    - name: Installing rust

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_rust.sh

				        PATH=$PATH:"$HOME/.cargo/bin"

				        rustup target add x86_64-unknown-linux-musl

				        rustup component add rustfmt clippy

				    - name: Setup seccomp

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        libseccomp_install_dir=$(mktemp -d -t libseccomp.XXXXXXXXXX)

				        gperf_install_dir=$(mktemp -d -t gperf.XXXXXXXXXX)

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ./ci/install_libseccomp.sh "${libseccomp_install_dir}" "${gperf_install_dir}"

				        echo "Set environment variables for the libseccomp crate to link the libseccomp library statically"

				        echo "LIBSECCOMP_LINK_TYPE=static" >> $GITHUB_ENV

				        echo "LIBSECCOMP_LIB_PATH=${libseccomp_install_dir}/lib" >> $GITHUB_ENV

				    - name: Run check

				      if: ${{ !contains(github.event.pull_request.labels.*.name, 'force-skip-ci') }}

				      run: |

				        cd ${GOPATH}/src/github.com/${{ github.repository }} && ${{ matrix.cmd }}

				          cat << EOF >> "${GITHUB_STEP_SUMMARY}"

				          Run \`make -C src/agent generate-protocols\` to update protobuf bindings.

				          \`\`\`diff

				          ${diff}

				          \`\`\`

				          EOF

				          echo "::error::Golang protobuf bindings need to be regenerated (see Github step summary for diff)."

				          exit 1

									
										29

.github/workflows/zizmor.yaml
									
												
				@@ -0,0 +1,29 @@

				name: GHA security analysis

				on:

				  push:

				    branches: ["main"]

				  pull_request:

				permissions:

				  contents: read

				concurrency:

				  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}

				  cancel-in-progress: true

				jobs:

				  zizmor:

				    runs-on: ubuntu-22.04

				    permissions:

				      contents: read

				      security-events: write

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

				        with:

				          fetch-depth: 0

				          persist-credentials: false

				      - name: Run zizmor

				        uses: zizmorcore/zizmor-action@f52a838cfabf134edcbaa7c8b3677dde20045018 # v0.1.1

3

.gitignore

@@ -15,3 +15,6 @@ src/agent/protocols/src/*.rs
 !src/agent/protocols/src/lib.rs
 build
 src/tools/log-parser/kata-log-parser
 tools/packaging/static-build/agent/install_libseccomp.sh
 .envrc
 .direnv

83

CODEOWNERS


@@ -1,4 +1,4 @@
 # Copyright (c) 2019 Intel Corporation
 # Copyright (c) 2019-2023 Intel Corporation
 #
 # SPDX-License-Identifier: Apache-2.0
 #
@@ -9,4 +9,83 @@
 # Order in this file is important. Only the last match will be
 # used. See https://help.github.com/articles/about-code-owners/
 *.md    @kata-containers/documentation
 /CODEOWNERS			@kata-containers/codeowners
 VERSION				@kata-containers/release
 # The versions database needs careful handling
 versions.yaml			@kata-containers/release @kata-containers/ci @kata-containers/tests
 Makefile*			@kata-containers/build
 *.mak				@kata-containers/build
 *.mk				@kata-containers/build
 # Documentation related files could also appear anywhere
 # else in the repo.
 *.md				@kata-containers/documentation
 *.drawio			@kata-containers/documentation
 *.jpg				@kata-containers/documentation
 *.png				@kata-containers/documentation
 *.svg				@kata-containers/documentation
 *.bash				@kata-containers/shell
 *.sh				@kata-containers/shell
 **/completions/			@kata-containers/shell
 Dockerfile*			@kata-containers/docker
 /ci/				@kata-containers/ci
 *.bats				@kata-containers/tests
 /tests/				@kata-containers/tests
 *.rs				@kata-containers/rust
 *.go				@kata-containers/golang
 /utils/				@kata-containers/utils
 # FIXME: Maybe a new "protocol" team would be better?
 #
 # All protocol changes must be reviewed.
 # Note, we include all subdirs, including the vendor dir, as at present there are no .proto files
 # in the vendor dir. Later we may have to extend this matching rule if that changes.
 /src/libs/protocols/*.proto	@kata-containers/architecture-committee @kata-containers/builder @kata-containers/packaging
 # GitHub Actions
 /.github/workflows/		@kata-containers/action-admins @kata-containers/ci
 /ci/				@kata-containers/ci @kata-containers/tests
 /docs/				@kata-containers/documentation
 /src/agent/			@kata-containers/agent
 /src/runtime*/			@kata-containers/runtime
 /src/runtime/			@kata-containers/golang
 src/runtime-rs/			@kata-containers/rust
 src/libs/			@kata-containers/rust
 src/dragonball/			@kata-containers/dragonball
 /tools/osbuilder/		@kata-containers/builder
 /tools/packaging/		@kata-containers/packaging
 /tools/packaging/kernel/	@kata-containers/kernel
 /tools/packaging/kata-deploy/	@kata-containers/kata-deploy
 /tools/packaging/qemu/		@kata-containers/qemu
 /tools/packaging/release/	@kata-containers/release
 **/vendor/			@kata-containers/vendoring
 # Handle arch specific files last so they match more specifically than
 # the kernel packaging files.
 **/*aarch64*			@kata-containers/arch-aarch64
 **/*arm64*			@kata-containers/arch-aarch64
 **/*amd64*			@kata-containers/arch-amd64
 **/*x86-64*			@kata-containers/arch-amd64
 **/*x86_64*			@kata-containers/arch-amd64
 **/*ppc64*			@kata-containers/arch-ppc64le
 **/*s390x*			@kata-containers/arch-s390x

									
										5

Makefile
									
												
				@@ -1,4 +1,4 @@

				# Copyright (c) 2020 Intel Corporation

				# Copyright (c) 2020-2023 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				@@ -18,7 +18,6 @@ TOOLS =

				TOOLS += agent-ctl

				TOOLS += kata-ctl

				TOOLS += log-parser

				TOOLS += log-parser-rs

				TOOLS += runk

				TOOLS += trace-forwarder

				@@ -43,7 +42,7 @@ generate-protocols:

				# Some static checks rely on generated source files of components.

				static-checks: static-checks-build

					bash ci/static-checks.sh

					bash tests/static-checks.sh github.com/kata-containers/kata-containers

				docs-url-alive-check:

					bash ci/docs-url-alive-check.sh

									
										12

README.md
									
												
				@@ -1,6 +1,7 @@

				<img src="https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-images-prod/openstack-logo/kata/SVG/kata-1.svg" width="900">

				[![CI | Publish Kata Containers payload](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/payload-after-push.yaml) [![Kata Containers Nightly CI](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml/badge.svg)](https://github.com/kata-containers/kata-containers/actions/workflows/ci-nightly.yaml)

				[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/kata-containers/kata-containers/badge)](https://scorecard.dev/viewer/?uri=github.com/kata-containers/kata-containers)

				# Kata Containers

				@@ -123,7 +124,7 @@ The table below lists the core parts of the project:

				| [agent](src/agent) | core | Management process running inside the virtual machine / POD that sets up the container environment. |

				| [`dragonball`](src/dragonball) | core | An optional built-in VMM brings out-of-the-box Kata Containers experience with optimizations on container workloads |

				| [documentation](docs) | documentation | Documentation common to all components (such as design and install documentation). |

				| [tests](https://github.com/kata-containers/tests) | tests | Excludes unit tests which live with the main code. |

				| [tests](tests) | tests | Excludes unit tests which live with the main code. |

				### Additional components

				@@ -137,17 +138,22 @@ The table below lists the remaining parts of the project:

				| [kata-debug](tools/packaging/kata-debug/README.md) | infrastructure | Utility tool to gather Kata Containers debug information from Kubernetes clusters. |

				| [`agent-ctl`](src/tools/agent-ctl) | utility | Tool that provides low-level access for testing the agent. |

				| [`kata-ctl`](src/tools/kata-ctl) | utility | Tool that provides advanced commands and debug facilities. |

				| [`log-parser-rs`](src/tools/log-parser-rs) | utility | Tool that aids in analyzing logs from the kata runtime. |

				| [`trace-forwarder`](src/tools/trace-forwarder) | utility | Agent tracing helper. |

				| [`runk`](src/tools/runk) | utility | Standard OCI container runtime based on the agent. |

				| [`ci`](https://github.com/kata-containers/ci) | CI | Continuous Integration configuration files and scripts. |

				| [`ci`](.github/workflows) | CI | Continuous Integration configuration files and scripts. |

				| [`ocp-ci`](ci/openshift-ci/README.md) | CI | Continuous Integration configuration for the OpenShift pipelines. |

				| [`katacontainers.io`](https://github.com/kata-containers/www.katacontainers.io) | Source for the [`katacontainers.io`](https://www.katacontainers.io) site. |

				| [`Webhook`](tools/testing/kata-webhook/README.md) | utility | Example of a simple admission controller webhook to annotate pods with the Kata runtime class |

				### Packaging and releases

				Kata Containers is now

				[available natively for most distributions](docs/install/README.md#packaged-installation-methods).

				## General tests

				See the [tests documentation](tests/README.md).

				## Metrics tests

				See the [metrics documentation](tests/metrics/README.md).

2

VERSION


@@ -1 +1 @@
 .2.0-rc0
 .19.1

									
										416

ci/README.md
									
												
				@@ -0,0 +1,416 @@

				# Kata Containers CI

				> [!WARNING]

				> While this project's CI has several areas for improvement, it is constantly

				> evolving. This document attempts to describe its current state, but due to

				> ongoing changes, you may notice some outdated information here. Feel free to

				> modify/improve this document as you use the CI and notice anything odd. The

				> community appreciates it!

				## Introduction

				The Kata Containers CI relies on [GitHub Actions][gh-actions], where the actions

				themselves can be found in the `.github/workflows` directory, and they may call

				helper scripts, which are located under the `tests` directory, to actually

				perform the tasks required for each test case.

				## The different workflows

				There are a few different sets of workflows that are running as part of our CI,

				and here we're going to cover the ones that are less likely to become outdated.

				With this said, if the reader finds something that is outdated, opening an issue

				on the project pointing to the problem is a nice way to help, and providing a

				fix for the issue is even more appreciated.

				### Jobs that run automatically when a PR is raised

				These are a bunch of tests that will automatically run as soon as a PR is

				opened, they're mostly running on "cost free" runners, and they do some

				pre-checks to evaluate that your PR may be okay to start getting reviewed.

				Mind, though, that the community expects the contributors to, at least, build

				their code before submitting a PR, which the community sees as a very fair

				request.

				Without getting into the weeds with details on this, those jobs are the ones

				responsible for ensuring that:

				- The commit message is in the expected format

				- There's no missing Developer's Certificate of Origin

				- Static checks are passing
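
As an illustration of the Developer's Certificate of Origin item above, the sketch below tests whether a commit message carries a `Signed-off-by:` trailer. The helper name `has_signoff` and the exact pattern are assumptions made for this example; the check that the CI actually runs may be stricter or structured differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): succeeds if the commit
# message read from stdin contains a "Signed-off-by: Name <email>" line.
has_signoff() {
    grep -Eq '^Signed-off-by: .+ <[^<>@ ]+@[^<>@ ]+>$'
}

# Example messages, made up for illustration.
msg_ok=$'ci: add codegen to static-checks\n\nSigned-off-by: Jane Doe <jane@example.com>'
msg_bad=$'ci: add codegen to static-checks\n\nNo trailer here.'

if printf '%s\n' "$msg_ok" | has_signoff; then echo "sign-off present"; fi
if ! printf '%s\n' "$msg_bad" | has_signoff; then echo "sign-off missing"; fi
```

To try it against the latest local commit before opening a PR, the message can be piped in with `git log -1 --format=%B | has_signoff`.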

				### Jobs that require a maintainer's approval to run

				Some tests, along with our so-called "CI", require a maintainer's approval to

				run, as parts of those jobs run on "paid runners", which currently use Azure

				infrastructure.

				Once a maintainer of the project gives "the green light" (currently by adding an

				`ok-to-test` label to the PR, soon to be changed to commenting "/test" as part

				of a PR review), the following tests will be executed:

				- Build all the components (runs on cost free runners, or bare-metal depending on the architecture)

				- Create a tarball with all the components (runs on cost free runners, or bare-metal depending on the architecture)

				- Create a kata-deploy payload with the tarball generated in the previous step (runs on cost free runners, or bare-metal depending on the architecture)

				- Run the following tests:

				  - Tests depending on the generated tarball

				    - Metrics (runs on bare-metal)

				    - `docker` (runs on cost free runners)

				    - `nerdctl` (runs on cost free runners)

				    - `kata-monitor` (runs on cost free runners)

				    - `cri-containerd` (runs on cost free runners)

				    - `nydus` (runs on cost free runners)

				    - `vfio` (runs on cost free runners)

				  - Tests depending on the generated kata-deploy payload

				    - kata-deploy (runs on cost free runners)

				      - Tests are performed using different "Kubernetes flavors", such as k0s, k3s, rke2, and Azure Kubernetes Service (AKS).

				    - Kubernetes (runs in Azure small and medium instances depending on what's required by each test, and on TEE bare-metal machines)

				      - Tests are performed with different runtime engines, such as CRI-O and containerd.

				      - Tests are performed with different snapshotters for containerd, namely OverlayFS and devmapper.

				      - Tests are performed with all the supported hypervisors, which are Cloud Hypervisor, Dragonball, Firecracker, and QEMU.

				For all the tests relying on Azure instances, real money is being spent, so the

				community asks for the maintainers to be mindful about those, and avoid abusing

				them to merely debug issues.

				## The different runners

				In the previous section we've mentioned using different runners, now in this section we'll go through each type of runner used.

				- Cost free runners:  Those are the runners provided by GitHub itself, and

				  those are fairly small machines with virtualization capabilities enabled.

				- Azure small instances: Those are runners which have virtualization

				  capabilities enabled, 2 CPUs, and 8GB of RAM.  These runners have a "-smaller"

				  suffix to their name.

				- Azure normal instances: Those are runners which have virtualization

				  capabilities enabled, 4 CPUs, and 16GB of RAM.  These runners are usually

				  `garm` ones with no "-smaller" suffix.

				- Bare-metal runners: Those are runners provided by community contributors,

				  and they may vary in architecture, size and virtualization capabilities.

				  Builder runners don't actually require any virtualization capabilities, while

				  runners which will be actually performing the tests must have virtualization

				  capabilities and a reasonable amount of CPU and RAM available (at least

				  matching the Azure normal instances).

				## Adding new tests

				Before adding a new test, we strongly recommend going through the

				[GitHub Actions Documentation][gh-actions],

				which will give you the background needed to read and understand the

				current tests, and to become familiar with how to write a new one.

				On the Kata Containers land, there are basically two sets of tests: "standalone"

				and "part of something bigger".

				The "standalone" tests, for example the commit message check, won't be covered

				here as they're better covered by the GitHub Actions documentation linked above.

				The "part of something bigger" is the more complicated one and not so

				straightforward to add, so we'll be focusing our efforts on describing the

				addition of those.

				> [!NOTE]

				> TODO: Currently, this document refers to "tests" when it actually means the

				> jobs (or workflows) of GitHub. In an ideal world, except in some specific cases,

				> new tests should be added without the need to add new workflows. In the

				> not-too-distant future (hopefully), we will improve the workflows to support

				> this.

				### Adding a new test that's "part of something bigger"

				The first important thing here is to align expectations, and we must say that

				the community strongly prefers receiving tests that already come with:

				- Instructions on how to run them

				- A proven run where it's passing

				There are several ways to achieve those two requirements, and an example of that

				can be seen in PR #8115.

				With the expectations aligned, adding a test consists of:

				- Adding a new yaml file for your test, and ensuring it's called from the

				  "bigger" yaml. See the [Kata Monitor test example][monitor-ex01].

				- Adding the helper scripts needed for your test to run. Again, use the [Kata Monitor script as example][monitor-ex02].

				Following those examples, the community advice during the review, and even

				asking the community directly on Slack are the best ways to get your test

				accepted.

				## Required tests

				In our CI we have two categories of jobs - required and non-required:

				- Required jobs need to all pass for a PR to be merged normally and

				should cover all the core features of Kata Containers that we want to

				ensure don't have regressions.

				- The non-required jobs are for unstable tests, or for features that

				are experimental and not-fully supported. We'd like those tests to also

				pass on all PRs ideally, but don't block merging if they don't as it's

				not necessarily an indication of the PR code causing regressions.

				### Transitioning between required and non-required status

				Required jobs that fail block merging of PRs, so we want to ensure that

				jobs are stable and maintained before we make them required.

				The [Kata Containers CI Dashboard](https://kata-containers.github.io/)

				is a useful resource to check when collecting evidence of job stability.

				At time of writing it reports the last ten days of Kata CI nightly test

				results for each job. This isn't perfect as it doesn't currently capture

				results on PRs, but is a good guideline for stability.

				> [!NOTE]

				> Below are general guidelines about jobs being marked as

				> required/non-required, but they are subject to change and the Kata

				> Architecture Committee may overrule these guidelines at their

				> discretion.

				#### Initial marking as required

				For new jobs, or jobs that haven't been marked as required recently,

				the criterion to be initially marked as required is ten days

				of passing tests, with no relevant PR failures reported in that time.

				Required jobs also need one or more nominated maintainers that are

				responsible for the stability of their jobs. Maintainers can be registered

				in [`maintainers.yml`](https://github.com/kata-containers/kata-containers.github.io/blob/main/maintainers.yml)

				and will then show on the CI Dashboard.

				To add transparency to making jobs required/non-required and to keep the

				GitHub UI in sync with the [Gatekeeper job](../tools/testing/gatekeeper),

				the process to update a job's required state is as follows:

				1. Create a PR to update `maintainers.yml`, if new maintainers are being

				declared on a CI job.

				1. Create a PR which updates

				[`required-tests.yaml`](../tools/testing/gatekeeper/required-tests.yaml)

				adding the new job and listing the evidence that the job meets the

				requirements above. Ensure that all maintainers and

				@kata-containers/architecture-committee are notified to give them the

				opportunity to review the PR. See

				[#11015](https://github.com/kata-containers/kata-containers/pull/11015)

				as an example.

				1. The maintainers and Architecture Committee get a chance to review the PR.

				It can be discussed in an AC meeting to get broader input.

				1. Once the PR has been merged, a Kata Containers admin should be notified

				to ensure that the GitHub UI is updated to reflect the change in

				`required-tests.yaml`.

				#### Expectation of required job maintainers

				Due to the nature of the Kata Containers community having contributors

				spread around the world, required jobs being blocked due to infrastructure,

				or test issues can have a big impact on work. As such, the expectation is

				that when a problem with a required job is noticed/reported, the maintainers

				have one working day to acknowledge the issue, perform an initial

				investigation and then either fix it, or get it marked as non-required

				whilst the investigation and/or fix is done.

				### Re-marking of required status

				Once a job has been removed from the required list, it requires two

				consecutive successful nightly test runs before being made required

				again.
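
The two stability thresholds described in this section (ten days of passing tests for initial marking, two consecutive successful nightly runs for re-marking) can be sketched as a single check over a list of nightly results. This is only an illustration: the helper name `last_n_passed`, the `pass`/`fail` strings, and the assumption of one nightly result per day are all hypothetical, and this is not how the CI Dashboard or the Gatekeeper job computes anything.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the last N results are all "pass".
# Usage: last_n_passed N result...   (results ordered oldest to newest)
last_n_passed() {
    local n=$1; shift
    local results=("$@")
    local total=${#results[@]}
    # Not enough history to satisfy the threshold.
    (( total >= n )) || return 1
    local i
    for (( i = total - n; i < total; i++ )); do
        [[ ${results[i]} == pass ]] || return 1
    done
}

# Made-up history: one failure followed by two successful nightly runs.
nightly=(fail pass pass)
if last_n_passed 2 "${nightly[@]}"; then echo "eligible to be re-marked as required"; fi
if ! last_n_passed 10 "${nightly[@]}"; then echo "not enough history for initial marking"; fi
```

Keeping the threshold as a parameter means the same check covers both cases; the other requirements for required jobs (nominated maintainers, no relevant PR failures) still need human review.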

				## Running tests

				### Running the tests as part of the CI

				If you're a maintainer of the project, you'll be able to kick in the tests by

				yourself.  With the current approach, you just need to add the `ok-to-test`

				label and the tests will automatically start.  We're moving, though, to use a

				`/test` command as part of a GitHub review comment, which will simplify this

				process.

				If you're not a maintainer, please, send a message on Slack or wait till one of

				the maintainers reviews your PR.  Maintainers will then kick in the tests on

				your behalf.

				In case a test fails and there's the suspicion it happens due to flakiness in

				the test itself, please, create an issue for us, and then re-run (or ask

				maintainers to re-run) the tests following these steps:

				- Locate which test is failing

				- Click on "details"

				- In the top right corner, click on "Re-run jobs"

				- And then on "Re-run failed jobs"

				- And finally click on the green "Re-run jobs" button

				> [!NOTE]

				> TODO: We need figures here

				### Running the tests locally

				In this section, aligning expectations is also something very important, as one

				will not be able to run the tests exactly in the same way the tests are running

				in the CI, as one most likely won't have access to an Azure subscription.

				However, we're trying our best here to provide you with instructions on how to

				run the tests in an environment that's "close enough" and will help you to debug

				issues you find with the current tests, or even provide a proof-of-concept to

				the new test you're trying to add.

				The basic steps, which we will cover in detail below, are:

				 1. Create a VM matching the configuration of the target runner

				 2. Generate the artifacts you'll need for the test, or download them from a

				    current failed run

				 3. Follow the steps provided in the action itself to run the tests.

				Although the general overview looks easy, we know that some tricks need to be

				shared, and we'll go through the general process of debugging one non-Kubernetes

				and one Kubernetes specific test for educational purposes.

				One important thing to note is that "Create a VM" can be done in many

				different ways, using the tools of your choice.  For the sake of simplicity in

				this guide, we'll be using `kcli`, which we strongly recommend if you're an

				inexperienced user and happen to be developing on a Linux box.

				For both non-Kubernetes and Kubernetes cases, we'll be using PR #8070 as an

				example, which at the time of writing serves the purpose very well, as it

				has both `nerdctl` and Kubernetes tests failing.

				## Debugging tests

				### Debugging a non-Kubernetes test

				As shown above, the `nerdctl` test is failing.

				As a developer you can go ahead to the details of the job, and expand the job

				that's failing in order to gather more information.

				But when that doesn't help, we need to set up our own environment to debug

				what's going on.

				Taking a look at the `nerdctl` test, which is located here, you can easily see

				that it runs-on a `garm-ubuntu-2304-smaller` virtual machine.

				The important parts to understand are `ubuntu-2304`, which is the OS the

				test is running on, and "smaller", which means we're running it on a machine

				with 2 CPUs and 8GB of RAM.

				With this information, we can go ahead and create a similar VM locally using `kcli`.

				```bash

				$ sudo kcli create vm -i ubuntu2304 -P disks=[60] -P numcpus=2 -P memory=8192 -P cpumodel=host-passthrough debug-nerdctl-pr8070

				```
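The mapping from runner label to `kcli` parameters can be sketched as a small helper. It is a rough illustration under stated assumptions: the function name is hypothetical, the label is assumed to follow the `garm-<os>-<version>-<size>` pattern seen above, and only the "smaller" size (2 CPUs, 8GB of RAM) is known from this guide, so any other size is rejected rather than guessed.

```bash
#!/usr/bin/env bash

# kcli_args_for_runner LABEL
# Prints the kcli image and "-P" parameters for a runner label such as
# garm-ubuntu-2304-smaller. (Hypothetical helper, for illustration only.)
kcli_args_for_runner() {
    local label="${1:-}"
    local size="${label##*-}"   # text after the last dash, e.g. "smaller"
    local rest="${label%-*}"    # label without the size, e.g. "garm-ubuntu-2304"
    local os="${rest#*-}"       # drop the "garm-" prefix, e.g. "ubuntu-2304"
    local image="${os//-/}"     # kcli image name, e.g. "ubuntu2304"

    case "${size}" in
        smaller) echo "-i ${image} -P numcpus=2 -P memory=8192" ;;
        *)
            echo "unknown runner size: ${size}" >&2
            return 1
            ;;
    esac
}

kcli_args_for_runner garm-ubuntu-2304-smaller
# prints: -i ubuntu2304 -P numcpus=2 -P memory=8192
```

The disk size and CPU model from the command above are not derived from the label, so they still need to be passed explicitly.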

				In order to run the tests, you'll need the "kata-tarball" artifacts, which you

				can build yourself using "make kata-tarball" (see below), or simply get them

				from the PR where the tests failed.  To download them, click on the "Summary"

				button that's on the top left corner, and then scroll down till you see the

				artifacts, as shown below.

				Unfortunately GitHub doesn't give us a link that we can download those from

				inside the VM, but we can download them on our local box, and then `scp` the

				tarball to the newly created VM that will be used for debugging purposes.

				> [!NOTE]

				> Those artifacts only become available once all jobs have finished, and are kept for 15 days.

				Once you have the `kata-static.tar.xz` in your VM, you can log in to the VM with

				`kcli ssh debug-nerdctl-pr8070`, and then clone your development branch:

				```bash

				$ git clone --branch feat_add-fc-runtime-rs https://github.com/nubificus/kata-containers

				```

				Add upstream as a remote, set up your git identity, and rebase your branch on top of the upstream main branch:

				```bash

				$ git remote add upstream https://github.com/kata-containers/kata-containers

				$ git remote update

				$ git config --global user.email "you@example.com"

				$ git config --global user.name "Your Name"

				$ git rebase upstream/main

				```

				Now copy the `kata-static.tar.xz` into your `kata-containers/kata-artifacts` directory

				```bash

				$ mkdir kata-artifacts

				$ cp ../kata-static.tar.xz kata-artifacts/

				```

				> [!NOTE]

				> If you downloaded the .zip from GitHub, you need to extract it first to get `kata-static.tar.xz`
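The copy step, including the optional unzip, could be wrapped roughly as follows. This is a sketch only: the function name `stage_kata_artifact` is made up, `unzip` is assumed to be installed when a `.zip` is given, and the archive is assumed to contain `kata-static.tar.xz` at its top level.

```bash
#!/usr/bin/env bash

# stage_kata_artifact SRC [REPO_DIR]
# Places kata-static.tar.xz into REPO_DIR/kata-artifacts, extracting it from a
# GitHub artifact .zip first when needed. Prints the final tarball path.
# (Hypothetical helper, for illustration only.)
stage_kata_artifact() {
    local src="${1:?need the path to the downloaded artifact}"
    local repo_dir="${2:-.}"
    local dest="${repo_dir}/kata-artifacts"

    mkdir -p "${dest}"

    case "${src}" in
        *.zip)
            # Assumption: the tarball sits at the top level of the archive
            unzip -o -q "${src}" -d "${dest}"
            ;;
        *)
            cp "${src}" "${dest}/"
            ;;
    esac

    if [[ ! -f "${dest}/kata-static.tar.xz" ]]; then
        echo "kata-static.tar.xz not found in ${dest}" >&2
        return 1
    fi

    echo "${dest}/kata-static.tar.xz"
}

# Example, run from inside the cloned repository:
#   stage_kata_artifact ../kata-static.tar.xz .
```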

				And finally run the tests following what's in the yaml file for the test you're

				debugging.

				In our case, the `run-nerdctl-tests-on-garm.yaml`.

				When looking at the file you'll notice that some environment variables are set,

				such as `KATA_HYPERVISOR`. For this particular example, the important steps to

				follow are:

				- Install the dependencies

				- Install kata

				- Run the tests

				Let's now run the steps mentioned above, exporting the expected environment variables:

				```bash

				$ export KATA_HYPERVISOR=dragonball

				$ bash ./tests/integration/nerdctl/gha-run.sh install-dependencies

				$ bash ./tests/integration/nerdctl/gha-run.sh install-kata

				$ bash tests/integration/nerdctl/gha-run.sh run

				```
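When the sequence has to be repeated often, it can help to wrap the three steps so that the run stops at the first failing step. The wrapper below is a hypothetical convenience, not something shipped in the repository. It assumes it is called from the root of the cloned tree, and its `DRY_RUN` switch only prints what would be executed.

```bash
#!/usr/bin/env bash

# run_nerdctl_ci_steps [HYPERVISOR]
# Runs the install-dependencies, install-kata and run steps in order and
# stops at the first failure. Set DRY_RUN=1 to only print the commands.
# (Hypothetical wrapper, for illustration only.)
run_nerdctl_ci_steps() {
    local hypervisor="${1:-dragonball}"
    local script="tests/integration/nerdctl/gha-run.sh"
    local step

    export KATA_HYPERVISOR="${hypervisor}"

    for step in install-dependencies install-kata run; do
        if [[ -n "${DRY_RUN:-}" ]]; then
            echo "KATA_HYPERVISOR=${KATA_HYPERVISOR} bash ${script} ${step}"
        elif ! bash "${script}" "${step}"; then
            echo "step '${step}' failed" >&2
            return 1
        fi
    done
}

# Example:
#   DRY_RUN=1 run_nerdctl_ci_steps
```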

				With this, you should be able to reproduce exactly the same issue found

				in the CI, and from now on you can build your own code, use your own binaries,

				and have fun debugging and hacking!

				### Debugging a Kubernetes test

				Steps for debugging the Kubernetes tests are very similar to the ones for

				debugging non-Kubernetes tests, with the caveat that what you'll need, this

				time, is not the `kata-static.tar.xz` tarball, but rather a payload to be used

				with kata-deploy.

				In order to generate your own kata-deploy image you can generate your own

				`kata-static.tar.xz` and then take advantage of the following script.  Be aware

				that the image generated and uploaded must be accessible by the VM where you'll

				be performing your tests.

				In case you want to take advantage of the payload that was already generated

				when you faced the CI failure, which is considerably easier, take a look at the

				failed job, then click on "Deploy Kata" and expand the "Final kata-deploy.yaml

				that is used in the test" section.  From there you can see exactly what you'll

				have to use when deploying kata-deploy in your local cluster.

				> [!NOTE]

				> TODO: WAINER TO FINISH THIS PART BASED ON HIS PR TO RUN A LOCAL CI

				## Adding new runners

				Any admin of the project is able to add or remove GitHub runners, and those are

				the folks you should rely on.

				If you need a new runner added, please tag @ac in the Kata Containers Slack,

				and someone from that group will be able to help you.

				If you're part of that group and you're looking for information on how to help

				someone, this is simple, and must be done in private. Basically what you have to

				do is:

				- Go to the kata-containers/kata-containers repo

				- Click on the Settings button, located in the top right corner

				- On the left panel, under "Code and automation", click on "Actions"

				- Click on "Runners"

				If you want to add a new self-hosted runner:

				- In the top right corner there's a green button called "New self-hosted runner"

				If you want to remove a current self-hosted runner:

				- For each runner there's a "..." menu, where you can just click and the

				  "Remove runner" option will show up

				## Known limitations

				As the GitHub actions are structured right now we cannot:

				- Test the addition of a GitHub action that's not triggered by a pull_request

				  event as part of the PR.

				[gh-actions]: https://docs.github.com/en/actions

				[monitor-ex01]: https://github.com/kata-containers/kata-containers/commit/a3fb067f1bccde0cbd3fd4d5de12dfb3d8c28b60

				[monitor-ex02]: https://github.com/kata-containers/kata-containers/commit/489caf1ad0fae27cfd00ba3c9ed40e3d512fa492

									
ci/darwin-test.sh
												
				@@ -7,16 +7,16 @@

				set -e

				cidir=$(dirname "$0")

				-runtimedir=$cidir/../src/runtime

				+runtimedir=${cidir}/../src/runtime

				build_working_packages() {

					# working packages:

				-	device_api=$runtimedir/pkg/device/api

				-	device_config=$runtimedir/pkg/device/config

				-	device_drivers=$runtimedir/pkg/device/drivers

				-	device_manager=$runtimedir/pkg/device/manager

				-	rc_pkg_dir=$runtimedir/pkg/resourcecontrol/

				-	utils_pkg_dir=$runtimedir/virtcontainers/utils

				+	device_api=${runtimedir}/pkg/device/api

				+	device_config=${runtimedir}/pkg/device/config

				+	device_drivers=${runtimedir}/pkg/device/drivers

				+	device_manager=${runtimedir}/pkg/device/manager

				+	rc_pkg_dir=${runtimedir}/pkg/resourcecontrol/

				+	utils_pkg_dir=${runtimedir}/virtcontainers/utils

					# broken packages :( :

					#katautils=$runtimedir/pkg/katautils

				@@ -24,15 +24,15 @@ build_working_packages() {

					#vc=$runtimedir/virtcontainers

					pkgs=(

				-		"$device_api"

				-		"$device_config"

				-		"$device_drivers"

				-		"$device_manager"

				-		"$utils_pkg_dir"

				-		"$rc_pkg_dir")

				+		"${device_api}"

				+		"${device_config}"

				+		"${device_drivers}"

				+		"${device_manager}"

				+		"${utils_pkg_dir}"

				+		"${rc_pkg_dir}")

					for pkg in "${pkgs[@]}"; do

				-		echo building "$pkg"

				-		pushd "$pkg" &>/dev/null

				+		echo building "${pkg}"

				+		pushd "${pkg}" &>/dev/null

						go build

						go test

						popd &>/dev/null

									
ci/docs-url-alive-check.sh
												
				@@ -7,6 +7,6 @@

				set -e

				cidir=$(dirname "$0")

				-source "${cidir}/lib.sh"

				+source "${cidir}/../tests/common.bash"

				run_docs_url_alive_check

									
ci/gh-util.sh

Executable file
												
				@@ -0,0 +1,184 @@

				#!/bin/bash

				# Copyright (c) 2020 Intel Corporation

				# Copyright (c) 2024 IBM Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -o errexit

				set -o errtrace

				set -o nounset

				set -o pipefail

				[[ -n "${DEBUG:-}" ]] && set -o xtrace

				script_name=${0##*/}

				#---------------------------------------------------------------------

				die()

				{

				    echo >&2 "$*"

				    exit 1

				}

				usage()

				{

				    cat <<EOF

				Usage: ${script_name} [OPTIONS] [command] [arguments]

				Description: Utility to expand the abilities of the GitHub CLI tool, gh.

				Command descriptions:

				  list-issues-for-pr     List issues linked to a PR.

				  list-labels-for-issue  List labels, in json format for an issue

				Commands and arguments:

				  list-issues-for-pr <pr>

				  list-labels-for-issue <issue>

				Options:

				 -h                 Show this help statement.

				 -r <owner/repo>    Optional <org/repo> specification. Default: 'kata-containers/kata-containers'

				Examples:

				- List issues for a Pull Request 123 in kata-containers/kata-containers repo

				  $ ${script_name} list-issues-for-pr 123

				EOF

				}

				list_issues_for_pr()

				{

				    local pr="${1:-}"

				    local repo="${2:-kata-containers/kata-containers}"

				    [[ -z "${pr}" ]] && die "need PR"

				    local commits

					commits=$(gh pr view "${pr}" --repo "${repo}" --json commits --jq .commits[].messageBody)

				    [[ -z "${commits}" ]] && die "cannot determine commits for PR ${pr}"

				    # Extract the issue number(s) from the commits.

				    #

				    # This needs to be careful to take account of lines like this:

				    #

				    # fixes 99

				    # fixes: 77

				    # fixes #123.

				    # Fixes: #1, #234, #5678.

				    #

				    # Note the exclusion of lines starting with whitespace which is

				    # specifically to ignore vendored git log comments, which are whitespace

				    # indented and in the format:

				    #

				    #     "<git-commit> <git-commit-msg>"

				    #

				    local issues

					issues=$(echo "${commits}" |\

				        grep -v -E "^( |	)" |\

				        grep -i -E "fixes:* *(#*[0-9][0-9]*)" |\

				        tr ' ' '\n' |\

				        grep "[0-9][0-9]*" |\

				        sed 's/[.,\#]//g' |\

				        sort -nu || true)

				    [[ -z "${issues}" ]] && die "cannot determine issues for PR ${pr}"

				    echo "# Issues linked to PR"

				    echo "#"

				    echo "# Fields: issue_number"

				    local issue

				    echo "${issues}" | while read -r issue

				    do

				        printf "%s\n" "${issue}"

				    done

				}

				list_labels_for_issue()

				{

				    local issue="${1:-}"

				    [[ -z "${issue}" ]] && die "need issue number"

				    local labels

					labels=$(gh issue view "${issue}" --repo kata-containers/kata-containers --json labels)

				    [[ -z "${labels}" ]] && die "cannot determine labels for issue ${issue}"

				    echo "${labels}"

				}

				setup()

				{

				    for cmd in gh jq

				    do

				        command -v "${cmd}" &>/dev/null || die "need command: ${cmd}"

				    done

				}

				handle_args()

				{

				    setup

				    local opt

				    while getopts "hr:" opt "$@"

				    do

				        case "${opt}" in

				            h) usage && exit 0 ;;

				            r) repo="${OPTARG}" ;;

							*) echo "use '-h' to get list of supported arguments" && exit 1 ;;

				        esac

				    done

				    shift $((OPTIND - 1))

				    local repo="${repo:-kata-containers/kata-containers}"

				    local cmd="${1:-}"

				    case "${cmd}" in

				        list-issues-for-pr) ;;

				        list-labels-for-issue) ;;

				        "") usage && exit 0 ;;

				        *) die "invalid command: '${cmd}'" ;;

				    esac

				    # Consume the command name

				    shift

				    local issue=""

				    local pr=""

				    case "${cmd}" in

				        list-issues-for-pr)

				            pr="${1:-}"

				            list_issues_for_pr "${pr}" "${repo}"

				            ;;

				        list-labels-for-issue)

				            issue="${1:-}"

				            list_labels_for_issue "${issue}"

				            ;;

				        *) die "impossible situation: cmd: '${cmd}'" ;;

				    esac

				    exit 0

				}

				main()

				{

				    handle_args "$@"

				}

				main "$@"
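To see how the issue-extraction pipeline in `list_issues_for_pr` behaves, it can be tried on a few sample commit-message lines. The sketch below copies the pipeline into a standalone function; the function name and the sample messages are made up for illustration, and `[[:space:]]` is used in place of the literal space-or-tab pattern for readability.

```bash
#!/usr/bin/env bash

# extract_issue_numbers
# Reads commit message bodies on stdin and prints the issue numbers that
# appear on "Fixes" lines, one per line, sorted numerically and without
# duplicates. (Standalone copy of the pipeline, for illustration only.)
extract_issue_numbers() {
    grep -v -E '^[[:space:]]' |
        grep -i -E 'fixes:* *(#*[0-9][0-9]*)' |
        tr ' ' '\n' |
        grep '[0-9][0-9]*' |
        sed 's/[.,#]//g' |
        sort -nu
}

# Sample input: the indented line mimics a vendored git log entry and is
# skipped by the first grep.
printf '%s\n' \
    'runtime: fix foo' \
    'Fixes: #1, #234, #5678.' \
    '    abc1234 vendor: fixes #999' \
    'fixes 99' | extract_issue_numbers
# prints 1, 99, 234 and 5678, one per line
```

Note that any other number on a matching line would also be reported, since the pipeline splits the whole line on spaces.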

									
ci/install_go.sh
											
				@@ -1,22 +0,0 @@

				#!/usr/bin/env bash

				#

				# Copyright (c) 2019 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				clone_tests_repo

				new_goroot=/usr/local/go

				pushd "${tests_repo_dir}"

				# Force overwrite the current version of golang

				[ -z "${GOROOT}" ] || rm -rf "${GOROOT}"

				.ci/install_go.sh -p -f -d "$(dirname ${new_goroot})"

				[ -z "${GOROOT}" ] || sudo ln -sf "${new_goroot}" "${GOROOT}"

				go version

				popd

									
ci/install_libseccomp.sh
												
				@@ -7,12 +7,9 @@

				set -o errexit

				-cidir=$(dirname "$0")

				-source "${cidir}/lib.sh"

				+script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

				-clone_tests_repo

				-source "${tests_repo_dir}/.ci/lib.sh"

				+source "${script_dir}/../tests/common.bash"

				# The following variables if set on the environment will change the behavior

				# of gperf and libseccomp configure scripts, that may lead this script to

				@@ -24,12 +21,12 @@ workdir="$(mktemp -d --tmpdir build-libseccomp.XXXXX)"

				# Variables for libseccomp

				libseccomp_version="${LIBSECCOMP_VERSION:-""}"

				-if [ -z "${libseccomp_version}" ]; then

				-    libseccomp_version=$(get_version "externals.libseccomp.version")

				+if [[ -z "${libseccomp_version}" ]]; then

				+	libseccomp_version=$(get_from_kata_deps ".externals.libseccomp.version")

				fi

				libseccomp_url="${LIBSECCOMP_URL:-""}"

				-if [ -z "${libseccomp_url}" ]; then

				-    libseccomp_url=$(get_version "externals.libseccomp.url")

				+if [[ -z "${libseccomp_url}" ]]; then

				+	libseccomp_url=$(get_from_kata_deps ".externals.libseccomp.url")

				fi

				libseccomp_tarball="libseccomp-${libseccomp_version}.tar.gz"

				libseccomp_tarball_url="${libseccomp_url}/releases/download/v${libseccomp_version}/${libseccomp_tarball}"

				@@ -37,77 +34,79 @@ cflags="-O2"

				# Variables for gperf

				gperf_version="${GPERF_VERSION:-""}"

				-if [ -z "${gperf_version}" ]; then

				-    gperf_version=$(get_version "externals.gperf.version")

				+if [[ -z "${gperf_version}" ]]; then

				+	gperf_version=$(get_from_kata_deps ".externals.gperf.version")

				fi

				gperf_url="${GPERF_URL:-""}"

				-if [ -z "${gperf_url}" ]; then

				-    gperf_url=$(get_version "externals.gperf.url")

				+if [[ -z "${gperf_url}" ]]; then

				+	gperf_url=$(get_from_kata_deps ".externals.gperf.url")

				fi

				gperf_tarball="gperf-${gperf_version}.tar.gz"

				gperf_tarball_url="${gperf_url}/${gperf_tarball}"

				-# We need to build the libseccomp library from sources to create a static library for the musl libc.

				-# However, ppc64le and s390x have no musl targets in Rust. Hence, we do not set cflags for the musl libc.

				-if ([ "${arch}" != "ppc64le" ] && [ "${arch}" != "s390x" ]); then

				-    # Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2

				-    cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"

				+# We need to build the libseccomp library from sources to create a static

				+# library for the musl libc.

				+# However, ppc64le, riscv64 and s390x have no musl targets in Rust. Hence, we do

				+# not set cflags for the musl libc.

				+if [[ "${arch}" != "ppc64le" ]] && [[ "${arch}" != "riscv64" ]] && [[ "${arch}" != "s390x" ]]; then

				+	# Set FORTIFY_SOURCE=1 because the musl-libc does not have some functions about FORTIFY_SOURCE=2

				+	cflags="-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -O2"

				fi

				die() {

				-    msg="$*"

				-    echo "[Error] ${msg}" >&2

				-    exit 1

				+	msg="$*"

				+	echo "[Error] ${msg}" >&2

				+	exit 1

				}

				finish() {

				-    rm -rf "${workdir}"

				+	rm -rf "${workdir}"

				}

				trap finish EXIT

				build_and_install_gperf() {

				-    echo "Build and install gperf version ${gperf_version}"

				-    mkdir -p "${gperf_install_dir}"

				-    curl -sLO "${gperf_tarball_url}"

				-    tar -xf "${gperf_tarball}"

				-    pushd "gperf-${gperf_version}"

				-    # Unset $CC for configure, we will always use native for gperf

				-    CC= ./configure --prefix="${gperf_install_dir}"

				-    make

				-    make install

				-    export PATH=$PATH:"${gperf_install_dir}"/bin

				-    popd

				-    echo "Gperf installed successfully"

				+	echo "Build and install gperf version ${gperf_version}"

				+	mkdir -p "${gperf_install_dir}"

				+	curl -sLO "${gperf_tarball_url}"

				+	tar -xf "${gperf_tarball}"

				+	pushd "gperf-${gperf_version}"

				+	# Unset $CC for configure, we will always use native for gperf

				+	CC="" ./configure --prefix="${gperf_install_dir}"

				+	make

				+	make install

				+	export PATH=${PATH}:"${gperf_install_dir}"/bin

				+	popd

				+	echo "Gperf installed successfully"

				}

				build_and_install_libseccomp() {

				-    echo "Build and install libseccomp version ${libseccomp_version}"

				-    mkdir -p "${libseccomp_install_dir}"

				-    curl -sLO "${libseccomp_tarball_url}"

				-    tar -xf "${libseccomp_tarball}"

				-    pushd "libseccomp-${libseccomp_version}"

				-    [ "${arch}" == $(uname -m) ] && cc_name="" || cc_name="${arch}-linux-gnu-gcc"

				-    CC=${cc_name} ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"

				-    make

				-    make install

				-    popd

				-    echo "Libseccomp installed successfully"

				+	echo "Build and install libseccomp version ${libseccomp_version}"

				+	mkdir -p "${libseccomp_install_dir}"

				+	curl -sLO "${libseccomp_tarball_url}"

				+	tar -xf "${libseccomp_tarball}"

				+	pushd "libseccomp-${libseccomp_version}"

				+	[[ "${arch}" == $(uname -m) ]] && cc_name="" || cc_name="${arch}-linux-gnu-gcc"

				+	CC=${cc_name} ./configure --prefix="${libseccomp_install_dir}" CFLAGS="${cflags}" --enable-static --host="${arch}"

				+	make

				+	make install

				+	popd

				+	echo "Libseccomp installed successfully"

				}

				main() {

				-    local libseccomp_install_dir="${1:-}"

				-    local gperf_install_dir="${2:-}"

				+	local libseccomp_install_dir="${1:-}"

				+	local gperf_install_dir="${2:-}"

				-    if [ -z "${libseccomp_install_dir}" ] || [ -z "${gperf_install_dir}" ]; then

				-        die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"

				-    fi

				+	if [[ -z "${libseccomp_install_dir}" ]] || [[ -z "${gperf_install_dir}" ]]; then

				+		die "Usage: ${0} <libseccomp-install-dir> <gperf-install-dir>"

				+	fi

				-    pushd "$workdir"

				-    # gperf is required for building the libseccomp.

				-    build_and_install_gperf

				-    build_and_install_libseccomp

				-    popd

				+	pushd "${workdir}"

				+	# gperf is required for building the libseccomp.

				+	build_and_install_gperf

				+	build_and_install_libseccomp

				+	popd

				}

				main "$@"
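The script resolves each setting from an environment variable first and falls back to the project's pinned value, then assembles the download URL from the result. A minimal sketch of that pattern follows. The helper names are hypothetical, and the version `2.5.5` and the base URL are illustrative stand-ins for whatever `get_from_kata_deps` would return.

```bash
#!/usr/bin/env bash

# pick_setting OVERRIDE FALLBACK
# Prints OVERRIDE when it is non-empty, FALLBACK otherwise. This mirrors the
# "environment variable first, pinned value second" lookup in the script.
# (Hypothetical helper, for illustration only.)
pick_setting() {
    local override="${1:-}"
    local fallback="${2:-}"

    if [[ -n "${override}" ]]; then
        echo "${override}"
    else
        echo "${fallback}"
    fi
}

# libseccomp_tarball_url VERSION BASE_URL
# Builds the release tarball URL the same way the script does.
libseccomp_tarball_url() {
    local version="${1:?need a version}"
    local base_url="${2:?need a base URL}"
    local tarball="libseccomp-${version}.tar.gz"

    echo "${base_url}/releases/download/v${version}/${tarball}"
}

# Illustrative values only; the real ones come from the project's pinned versions
version="$(pick_setting "${LIBSECCOMP_VERSION:-}" "2.5.5")"
libseccomp_tarball_url "${version}" "https://github.com/seccomp/libseccomp"
```

Exporting `LIBSECCOMP_VERSION` before the call changes the printed URL, which is the behaviour the override variables are meant to provide.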

									
ci/install_rust.sh
											
				@@ -1,16 +0,0 @@

				#!/usr/bin/env bash

				# Copyright (c) 2019 Ant Financial

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				set -e

				cidir=$(dirname "$0")

				source "${cidir}/lib.sh"

				clone_tests_repo

				pushd ${tests_repo_dir}

				.ci/install_rust.sh ${1:-}

				popd

									
ci/install_vc.sh
											
				@@ -1,19 +0,0 @@

				#!/usr/bin/env bash

				#

				# Copyright (c) 2018 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -e

				cidir=$(dirname "$0")

				vcdir="${cidir}/../src/runtime/virtcontainers/"

				source "${cidir}/lib.sh"

				export CI_JOB="${CI_JOB:-default}"

				clone_tests_repo

				if [ "${CI_JOB}" != "PODMAN" ]; then

					echo "Install virtcontainers"

					make -C "${vcdir}" && chronic sudo make -C "${vcdir}" install

				fi

									
ci/install_yq.sh
												
				@@ -5,28 +5,57 @@

				# SPDX-License-Identifier: Apache-2.0

				#

				+[[ -n "${DEBUG}" ]] && set -o xtrace

				# If we fail for any reason a message will be displayed

				die() {

					msg="$*"

				-	echo "ERROR: $msg" >&2

				+	echo "ERROR: ${msg}" >&2

					exit 1

				}

				+function verify_yq_exists() {

				+	local yq_path=$1

				+	local yq_version=$2

				+	local expected="yq (https://github.com/mikefarah/yq/) version ${yq_version}"

				+	if [[ -x  "${yq_path}" ]] && [[ "$(${yq_path} --version)"X == "${expected}"X ]]; then

				+		return 0

				+	else

				+		return 1

				+	fi

				+}

				# Install the yq yaml query package from the mikefarah github repo

				# Install via binary download, as we may not have golang installed at this point

				function install_yq() {

					local yq_pkg="github.com/mikefarah/yq"

				-	local yq_version=3.4.1

				+	local yq_version=v4.44.5

				+	local precmd=""

				+	local yq_path=""

					INSTALL_IN_GOPATH=${INSTALL_IN_GOPATH:-true}

				-	if [ "${INSTALL_IN_GOPATH}"  == "true" ];then

				+	if [[ "${INSTALL_IN_GOPATH}" == "true" ]]; then

						GOPATH=${GOPATH:-${HOME}/go}

						mkdir -p "${GOPATH}/bin"

				-		local yq_path="${GOPATH}/bin/yq"

				+		yq_path="${GOPATH}/bin/yq"

					else

						yq_path="/usr/local/bin/yq"

					fi

				-	[ -x  "${yq_path}" ] && [ "`${yq_path} --version`"X == "yq version ${yq_version}"X ] && return

				+	if verify_yq_exists "${yq_path}" "${yq_version}"; then

				+		echo "yq is already installed in correct version"

				+		return

				+	fi

				+	if [[ "${yq_path}" == "/usr/local/bin/yq" ]]; then

				+		# Check if we need sudo to install yq

				+		if [[ ! -w "/usr/local/bin" ]]; then

				+			# Check if we have sudo privileges

				+			if ! sudo -n true 2>/dev/null; then

				+				die "Please provide sudo privileges to install yq"

				+			else

				+				precmd="sudo"

				+			fi

				+		fi

				+	fi

					read -r -a sysInfo <<< "$(uname -sm)"

				@@ -47,12 +76,15 @@ function install_yq() {

						# If we're on an apple silicon machine, just assign amd64. 

						# The version of yq we use doesn't have a darwin arm build, 

						# but Rosetta can come to the rescue here.

				-		if [ $goos == "Darwin" ]; then

				+		if [[ ${goos} == "Darwin" ]]; then

							goarch=amd64

						else

							goarch=arm64

						fi

						;;

				+	"riscv64")

				+		goarch=riscv64

				+		;;

					"ppc64le")

						goarch=ppc64le

						;;

				@@ -75,9 +107,8 @@ function install_yq() {

					## NOTE: ${var,,} => gives lowercase value of var

					local yq_url="https://${yq_pkg}/releases/download/${yq_version}/yq_${goos}_${goarch}"

				-	curl -o "${yq_path}" -LSsf "${yq_url}"

				-	[ $? -ne 0 ] && die "Download ${yq_url} failed"

				-	chmod +x "${yq_path}"

				+	${precmd} curl -o "${yq_path}" -LSsf "${yq_url}" || die "Download ${yq_url} failed"

				+	${precmd} chmod +x "${yq_path}"

					if ! command -v "${yq_path}" >/dev/null; then

						die "Cannot not get ${yq_path} executable"
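For reference, the download URL that `install_yq` assembles can be reproduced in isolation. The helper below is a hypothetical sketch: it takes the version, OS and architecture as arguments instead of detecting them, and lowercases the OS name with `tr` rather than the `${var,,}` expansion mentioned in the script's note, so it also works with older bash versions.

```bash
#!/usr/bin/env bash

# yq_download_url VERSION GOOS GOARCH
# Prints the release URL for a yq binary, following the pattern used in
# install_yq. (Hypothetical helper, for illustration only.)
yq_download_url() {
    local version="${1:?need a version, e.g. v4.44.5}"
    local goos="${2:?need an OS name, e.g. Linux}"
    local goarch="${3:?need an architecture, e.g. amd64}"
    local yq_pkg="github.com/mikefarah/yq"

    # Release assets use a lowercase OS name
    goos="$(echo "${goos}" | tr '[:upper:]' '[:lower:]')"

    echo "https://${yq_pkg}/releases/download/${version}/yq_${goos}_${goarch}"
}

yq_download_url v4.44.5 Linux amd64
# prints: https://github.com/mikefarah/yq/releases/download/v4.44.5/yq_linux_amd64
```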

									
ci/lib.sh
											
				@@ -1,66 +0,0 @@

				#

				# Copyright (c) 2018 Intel Corporation

				#

				# SPDX-License-Identifier: Apache-2.0

				set -o nounset

				export tests_repo="${tests_repo:-github.com/kata-containers/tests}"

				export tests_repo_dir="$GOPATH/src/$tests_repo"

				export branch="${target_branch:-main}"

				# Clones the tests repository and checkout to the branch pointed out by

				# the global $branch variable.

				# If the clone exists and `CI` is exported then it does nothing. Otherwise

				# it will clone the repository or `git pull` the latest code.

				#

				clone_tests_repo()

				{

					if [ -d "$tests_repo_dir" ]; then

						[ -n "${CI:-}" ] && return

						# git config --global --add safe.directory will always append

						# the target to .gitconfig without checking the existence of

						# the target, so it's better to check it before adding the target repo.

						local sd="$(git config --global --get safe.directory ${tests_repo_dir} || true)"

						if [ -z "${sd}" ]; then

							git config --global --add safe.directory ${tests_repo_dir}

						fi

						pushd "${tests_repo_dir}"

						git checkout "${branch}"

						git pull

						popd

					else

						git clone -q "https://${tests_repo}" "$tests_repo_dir"

						pushd "${tests_repo_dir}"

						git checkout "${branch}"

						popd

					fi

				}

				run_static_checks()

				{

					clone_tests_repo

					# Make sure we have the targeting branch

					git remote set-branches --add origin "${branch}"

					git fetch -a

					bash "$tests_repo_dir/.ci/static-checks.sh" "$@"

				}

				run_docs_url_alive_check()

				{

					clone_tests_repo

					# Make sure we have the targeting branch

					git remote set-branches --add origin "${branch}"

					git fetch -a

					bash "$tests_repo_dir/.ci/static-checks.sh" --docs --all "github.com/kata-containers/kata-containers"

				}

				run_get_pr_changed_file_details()

				{

					clone_tests_repo

					# Make sure we have the targeting branch

					git remote set-branches --add origin "${branch}"

					git fetch -a

					source "$tests_repo_dir/.ci/lib.sh"

					get_pr_changed_file_details

				}

									
ci/openshift-ci/README.md

Normal file
												
				@@ -0,0 +1,157 @@

				OpenShift CI

				============

				This directory contains scripts used by

				[the OpenShift CI](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers)

				pipelines to monitor selected functional tests on OpenShift.

				There are 2 pipelines; their history and logs can be accessed here:

				* [main - currently supported OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-e2e-tests)

				* [next - currently under development OCP](https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-kata-containers-kata-containers-main-next-e2e-tests)

				Running openshift-tests on OCP with kata-containers manually

				============================================================

				To run openshift-tests (or other suites) with kata-containers one can use

				the kata-webhook. To deploy everything you can mimic the CI pipeline by:

				```bash

				#!/bin/bash -e

				# Set up your kubectl and check the cluster is accessible

				kubectl get nodes

				# Deploy kata (set KATA_DEPLOY_IMAGE to override the default kata-deploy-ci:latest image)

				./test.sh

				# Deploy the webhook

				KATA_RUNTIME=kata-qemu cluster/deploy_webhook.sh

				```

				This should ensure kata-containers as well as kata-webhook are installed and

				working. Before running the openshift-tests it's (currently) recommended to

				ignore some security features by:

				```bash

				#!/bin/bash -e

				oc adm policy add-scc-to-group privileged system:authenticated system:serviceaccounts

				oc adm policy add-scc-to-group anyuid system:authenticated system:serviceaccounts

				oc label --overwrite ns default pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline

				```

				Now you should be ready to run the openshift-tests. Our CI only uses a subset

				of tests, to get the current ``TEST_SKIPS`` see

				[the pipeline config](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers/kata-containers).

				The following steps require the [openshift tests](https://github.com/openshift/origin)

				to be cloned and built in the current directory:

				```bash

				#!/bin/bash -e

				# Define tests to be skipped (see the pipeline config for the current version)

				TEST_SKIPS="\[sig-node\] Security Context should support seccomp runtime/default\|\[sig-node\] Variable Expansion should allow substituting values in a volume subpath\|\[k8s.io\] Probing container should be restarted with a docker exec liveness probe with timeout\|\[sig-node\] Pods Extended Pod Container lifecycle evicted pods should be terminal\|\[sig-node\] PodOSRejection \[NodeConformance\] Kubelet should reject pod when the node OS doesn't match pod's OS\|\[sig-network\].*for evicted pods\|\[sig-network\].*HAProxy router should override the route\|\[sig-network\].*HAProxy router should serve a route\|\[sig-network\].*HAProxy router should serve the correct\|\[sig-network\].*HAProxy router should run\|\[sig-network\].*when FIPS.*the HAProxy router\|\[sig-network\].*bond\|\[sig-network\].*all sysctl on whitelist\|\[sig-network\].*sysctls should not affect\|\[sig-network\] pods should successfully create sandboxes by adding pod to network"

				# Get the list of tests to be executed

				TESTS="$(./openshift-tests run --dry-run --provider "${TEST_PROVIDER}" "${TEST_SUITE}")"

				# Store the list of tests in /tmp/tsts file

				echo "${TESTS}" | grep -v "$TEST_SKIPS" > /tmp/tsts

				# Remove previously-existing temporary files as well as previous results

				OUT=RESULTS/tmp

				rm -Rf /tmp/*test* /tmp/e2e-*

				# Use -f so a missing results directory does not abort the script under "bash -e"

				rm -Rf "$OUT"

				mkdir -p "$OUT"

				# Run the tests ignoring the monitor health checks

				./openshift-tests run --provider azure -o "$OUT/job.log" --junit-dir "$OUT" --file /tmp/tsts --max-parallel-tests 5 --cluster-stability Disruptive --run '^\[sig-node\].*|^\[sig-network\]'

				```
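The `grep -v "$TEST_SKIPS"` step relies on `\|` alternation inside a basic regular expression, which GNU grep supports. A small sketch with made-up test names shows the effect. The first skip entry is taken from the list above, while the other test names and the shortened skip pattern are hypothetical.

```bash
#!/usr/bin/env bash

# filter_skipped_tests SKIPS
# Reads test names on stdin and prints those that do not match the
# "\|"-separated skip pattern. (Hypothetical helper, for illustration only.)
filter_skipped_tests() {
    local skips="${1:?need a skip pattern}"
    grep -v "${skips}"
}

# Shortened skip pattern in the same style as TEST_SKIPS
sample_skips='\[sig-node\] Security Context should support seccomp runtime/default\|\[sig-network\].*bond'

# Made-up list of test names, one per line
sample_tests='[sig-node] Security Context should support seccomp runtime/default
[sig-node] Pods should be submitted and removed
[sig-network] services bond interface'

echo "${sample_tests}" | filter_skipped_tests "${sample_skips}"
# prints: [sig-node] Pods should be submitted and removed
```

The square brackets have to be escaped in the pattern, otherwise grep would treat them as a bracket expression.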

				[!NOTE]

				Note we are ignoring the cluster stability checks because our public cloud is

				not that stable and running with VMs instead of containers results in minor

				stability issues. Some of the old monitor stability tests do not reflect

				the ``--cluster-stability`` setting, one should simply ignore these. If you

				get a message like ``invariant was violated`` or ``error: failed due to a

				MonitorTest failure``, it's usually an indication that only those kinds of

				tests failed but the real tests passed. See

				[wrapped-openshift-tests.sh](https://github.com/openshift/release/blob/master/ci-operator/config/kata-containers/kata-containers/wrapped-openshift-tests.sh)

				for details on how our pipeline deals with that.

				[!TIP]

				To compare multiple results locally one can use

				[junit2html](https://github.com/inorton/junit2html) tool.

				Best-effort kata-containers cleanup

				===================================

				If you need to clean up the cluster after testing, you can use the

				``cleanup.sh`` script from the current directory. It tries to delete all

				resources created by ``test.sh`` as well as ``cluster/deploy_webhook.sh``

				ignoring all failures. The primary purpose of this script is to allow

				soft-cleanup after deployment to test different versions without

				re-provisioning everything.

				[!WARNING]

				**Do not rely on this script in production; return codes are not checked!**
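The best-effort idea can be pictured with a small wrapper that reports a failing command and carries on. This is only a sketch with a hypothetical `best_effort` helper; `cleanup.sh` itself does not define such a function and simply runs most of its `oc delete` commands without checking their results.

```bash
#!/bin/bash
# Hypothetical wrapper: run a command, report a failure on stderr, never fail.
best_effort() {
    "$@" || echo "ignored failure (rc=$?): $*" >&2
    return 0
}

best_effort false                 # reported on stderr, execution continues
best_effort echo "still running"  # later steps run regardless
```

This is also why the script is unsuitable for production use: a failed deletion never stops the run, so leftovers can go unnoticed.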

				Bisecting e2e test failures

				===========================

				Let's say the OCP pipeline passed running with

				``quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``

				but failed running with

				``quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``

				and you'd like to know which PR caused the regression. You can either run with

				all 60 tags in between or use the [bisecter](https://github.com/ldoktor/bisecter)

				to reduce the number of steps.

				Before running the bisection you need a reproducer script. A sample one called

				``sample-test-reproducer.sh`` is provided in this directory but you might

				want to copy and modify it, especially:

				* ``OCP_DIR`` - directory where your openshift/release is located (can be exported)

				* ``E2E_TEST`` - openshift-test(s) to be executed (can be exported)

				* behaviour of SETUP (returning 125 skips the current image tag, returning

				  >=128 interrupts the execution, everything else reports the tag as a failure)

				* what should be executed (perhaps running the setup is enough for you or

				  you might want to be looking for specific failures...)

				* use ``timeout`` to interrupt execution in case you know things should be faster
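The exit-code convention in the list above can be summarized in a few lines. The helper name is hypothetical, and treating exit code 0 as "good" is an assumption based on the statement that the script should pass with the GOOD image.

```bash
#!/bin/bash
# Hypothetical helper mapping a reproducer exit code to the verdict described
# in the list above (0 is assumed to mean the tag is good).
verdict_for() {
    local rc="$1"
    if   (( rc == 0 ));   then echo "good"
    elif (( rc == 125 )); then echo "skip"
    elif (( rc >= 128 )); then echo "abort"
    else                       echo "bad"
    fi
}

# GNU timeout exits with 124 when the command runs out of time, which falls
# into the "everything else" bucket and is therefore reported as a failure.
for rc in 0 1 124 125 130; do
    printf '%3d -> %s\n' "$rc" "$(verdict_for "$rc")"
done
```

Keeping this mapping in mind helps when adapting the sample reproducer: return 125 from a step whose failure says nothing about the image under test, so the tag is skipped rather than blamed.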

				Executing that script with the GOOD commit should pass

				``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-d7afd31fd40e37a675b25c53618904ab57e74ccd-amd64``

				and fail when executed with the BAD commit

				``./sample-test-reproducer.sh quay.io/kata-containers/kata-deploy-ci:kata-containers-9f512c016e75599a4a921bd84ea47559fe610057-amd64``.

				To get the list of all tags in between those two PRs you can use the

				``bisect-range.sh`` script

				```bash

				./bisect-range.sh d7afd31fd40e37a675b25c53618904ab57e74ccd 9f512c016e75599a4a921bd84ea47559fe610057

				```

				[!NOTE]

				The tagged images are only built per PR, not for individual commits. See

				[kata-deploy-ci](https://quay.io/kata-containers/kata-deploy-ci) to see the

				available images.
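For orientation, the value printed by ``bisect-range.sh`` is a single comma-separated line of full image references. The sketch below reproduces only that final formatting step for a given list of commit SHAs; the `format_tags` helper is hypothetical, while the repository, tag prefix and architecture suffix follow the image names shown above.

```bash
#!/bin/bash
# Sketch: turn commit SHAs (one per line on stdin) into the comma-separated
# image references passed to `bisecter start`. REPO and ARCH mirror the values
# used by bisect-range.sh; the helper name is hypothetical.
REPO="quay.io/kata-containers/kata-deploy-ci"
ARCH="amd64"

format_tags() {
    sed -e "s|^|${REPO}:kata-containers-|" -e "s|\$|-${ARCH}|" | paste -s -d, -
}

printf '%s\n' \
    d7afd31fd40e37a675b25c53618904ab57e74ccd \
    9f512c016e75599a4a921bd84ea47559fe610057 | format_tags
```

The real script additionally queries the registry and keeps only the merge commits for which an image tag exists, so its list can be shorter than the git history suggests.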

				To find out which PR caused this regression, you can either manually try the

				individual commits or you can simply execute:

				```bash

				bisecter start "$(./bisect-range.sh d7afd31fd40 9f512c016)"

				OCP_DIR=/path/to/openshift/release bisecter run ./sample-test-reproducer.sh

				```

				[!NOTE]

				If you use ``KATA_WITH_SYSTEM_QEMU=yes`` you might want to deploy once with

				it and skip it for the cleanup. That way you might (in most cases) test

				all images with a single MCP update instead of per-image MCP update.

				[!TIP]

				You can check the bisection progress during/after execution by running

				``bisecter log`` from the current directory. Before starting a new

				bisection you need to execute ``bisecter reset``.

				Peer pods

				=========

				It's possible to run similar testing on peer-pods using cloud-api-adaptor.

				Our CI configuration to run inside Azure's OCP is in ``peer-pods-azure.sh``

				and can be used to replace the ``test.sh`` step in the snippets above.

									
										30

ci/openshift-ci/bisect-range.sh
									
										Executable file
									
				@@ -0,0 +1,30 @@

				#!/bin/bash

				# Copyright (c) 2024 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				if [[ "$#" -gt 2 ]] || [[ "$#" -lt 1 ]] ; then

					echo "Usage: $0 GOOD [BAD]"

					echo "Prints list of available kata-deploy-ci tags between GOOD and BAD commits (by default BAD is the latest available tag)"

					exit 255

				fi

				GOOD="$1"

				[[ -n "$2" ]] && BAD="$2"

				ARCH=amd64

				REPO="quay.io/kata-containers/kata-deploy-ci"

				TAGS=$(skopeo list-tags "docker://${REPO}")

				# For testing

				#echo "$TAGS" > tags

				#TAGS=$(cat tags)

				# Only amd64

				TAGS=$(echo "${TAGS}" | jq '.Tags' | jq "map(select(endswith(\"${ARCH}\")))" | jq -r '.[]')

				# Sort by git

				SORTED=""

				[[ -n "${BAD}" ]] && LOG_ARGS="${GOOD}~1..${BAD}" || LOG_ARGS="${GOOD}~1.."

				for TAG in $(git log --merges --pretty=format:%H --reverse "${LOG_ARGS}"); do

					[[ "${TAGS}" =~ ${TAG} ]] && SORTED+="

				kata-containers-${TAG}-${ARCH}"

				done

				# Comma separated tags with repo

				echo "${SORTED}" | tail -n +2 | sed -e "s@^@${REPO}:@" | paste -s -d, -

									
										61

ci/openshift-ci/cleanup.sh
									
										Executable file
									
				@@ -0,0 +1,61 @@

				#!/bin/bash

				#

				# Copyright (c) 2024 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# This script tries to remove most of the resources added by the `test.sh` script

				# from the cluster.

				scripts_dir=$(dirname "$0")

				deployments_dir=${scripts_dir}/cluster/deployments

				# shellcheck disable=SC1091 # import based on variable

				source "${scripts_dir}/lib.sh"

				# Set your katacontainers repo dir location

				[[ -z "${katacontainers_repo_dir}" ]] && echo "Please set katacontainers_repo_dir variable to your kata repo"

				# Set to 'yes' if you want to configure SELinux to permissive on the cluster

				# workers.

				#

				SELINUX_PERMISSIVE=${SELINUX_PERMISSIVE:-no}

				# Enable workaround for OCP 4.13 https://github.com/kata-containers/kata-containers/pull/9206

				#

				WORKAROUND_9206_CRIO=${WORKAROUND_9206_CRIO:-no}

				# Ignore errors as we want best-effort-approach here

				trap - ERR

				# Delete webhook resources

				oc delete -f "${scripts_dir}/../../tools/testing/kata-webhook/deploy"

				oc delete -f "${scripts_dir}/cluster/deployments/configmap_kata-webhook.yaml.in"

				# Delete potential smoke-test resources

				oc delete -f "${scripts_dir}/smoke/service.yaml"

				oc delete -f "${scripts_dir}/smoke/service_kubernetes.yaml"

				oc delete -f "${scripts_dir}/smoke/http-server.yaml"

				# Delete test.sh resources

				oc delete -f "${deployments_dir}/relabel_selinux.yaml"

				if [[ "${WORKAROUND_9206_CRIO}" == "yes" ]]; then

					oc delete -f "${deployments_dir}/workaround-9206-crio-ds.yaml"

					oc delete -f "${deployments_dir}/workaround-9206-crio.yaml"

				fi

				[[ ${SELINUX_PERMISSIVE} == "yes" ]] && oc delete -f "${deployments_dir}/machineconfig_selinux.yaml.in"

				# Delete kata-containers

				pushd "${katacontainers_repo_dir}/tools/packaging/kata-deploy" || { echo "Failed to push to ${katacontainers_repo_dir}/tools/packaging/kata-deploy"; exit 125; }

				oc delete -f kata-deploy/base/kata-deploy.yaml

				oc -n kube-system wait --timeout=10m --for=delete -l name=kata-deploy pod

				oc apply -f kata-cleanup/base/kata-cleanup.yaml

				echo "Wait for all related pods to be gone"

				( repeats=1; for _ in $(seq 1 600); do

				  oc get pods -l name="kubelet-kata-cleanup" --no-headers=true -n kube-system 2>&1 | grep "No resources found" -q && ((repeats++)) || repeats=1

				  [[ "${repeats}" -gt 5 ]] && echo kata-cleanup finished && break

				  sleep 1

				done) || { echo "There are still some kata-cleanup related pods after 600 iterations"; oc get all -n kube-system; exit 1; }

				oc delete -f kata-cleanup/base/kata-cleanup.yaml

				oc delete -f kata-rbac/base/kata-rbac.yaml

				oc delete -f runtimeclasses/kata-runtimeClasses.yaml

6

ci/openshift-ci/cluster/configs/selinux.conf Normal file

@@ -0,0 +1,6 @@
 # Copyright (c) 2020 Red Hat, Inc.
 #
 # SPDX-License-Identifier: Apache-2.0
 #
 SELINUX=permissive
 SELINUXTYPE=targeted

									
										34

ci/openshift-ci/cluster/deploy_webhook.sh
									
										Executable file
									
				@@ -0,0 +1,34 @@

				#!/bin/bash

				#

				# Copyright (c) 2021 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# This script builds the kata-webhook and deploys it in the test cluster.

				#

				# You should export the KATA_RUNTIME variable with the runtimeclass name

				# configured in your cluster in case it is not the default "kata-ci".

				#

				set -e

				set -o nounset

				set -o pipefail

				script_dir="$(realpath "$(dirname "$0")")"

				webhook_dir="${script_dir}/../../../tools/testing/kata-webhook"

				# shellcheck disable=SC1091 # import based on variable

				source "${script_dir}/../lib.sh"

				KATA_RUNTIME=${KATA_RUNTIME:-kata-ci}

				pushd "${webhook_dir}" >/dev/null

				# Build and deploy the webhook

				#

				info "Builds the kata-webhook"

				./create-certs.sh

				info "Override our KATA_RUNTIME ConfigMap"

				sed -i deploy/webhook.yaml -e "s/runtime_class: .*$/runtime_class: ${KATA_RUNTIME}/g"

				info "Deploys the kata-webhook"

				oc apply -f deploy/

				# Check the webhook was deployed and is working.

				RUNTIME_CLASS="${KATA_RUNTIME}" ./webhook-check.sh

				popd >/dev/null

									
										13

ci/openshift-ci/cluster/deployments/configmap_installer_kernel.yaml
									
										Normal file
									
				@@ -0,0 +1,13 @@

				# Copyright (c) 2021 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# Instruct the daemonset installer to configure Kata Containers to use the

				# host kernel.

				#

				apiVersion: v1

				kind: ConfigMap

				metadata:

				  name: ci.kata.installer.kernel

				data:

				  host_kernel: "yes"

									
										14

ci/openshift-ci/cluster/deployments/configmap_installer_qemu.yaml
									
										Normal file
									
				@@ -0,0 +1,14 @@

				# Copyright (c) 2021 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# Instruct the daemonset installer to configure Kata Containers to use the

				# system QEMU.

				#

				apiVersion: v1

				kind: ConfigMap

				metadata:

				  name: ci.kata.installer.qemu

				data:

				  qemu_path: /usr/libexec/qemu-kvm

				  host_kernel: "yes"

									
										9

ci/openshift-ci/cluster/deployments/machineconfig_sandboxedcontainers_extension.yaml
									
										Normal file
									
				@@ -0,0 +1,9 @@

				apiVersion: machineconfiguration.openshift.io/v1

				kind: MachineConfig

				metadata:

				  labels:

				    machineconfiguration.openshift.io/role: worker

				  name: 50-enable-sandboxed-containers-extension

				spec:

				  extensions:

				  - sandboxed-containers

									
										23

ci/openshift-ci/cluster/deployments/machineconfig_selinux.yaml.in
									
										Normal file
									
				@@ -0,0 +1,23 @@

				# Copyright (c) 2020 Red Hat, Inc.

				#

				# SPDX-License-Identifier: Apache-2.0

				#

				# Configure SELinux on worker nodes.

				---

				apiVersion: machineconfiguration.openshift.io/v1

				kind: MachineConfig

				metadata:

				  labels:

				    machineconfiguration.openshift.io/role: worker

				  name: 51-kata-selinux

				spec:

				  config:

				    ignition:

				      version: 2.2.0

				    storage:

				      files:

				      - contents:

				              source: data:text/plain;charset=utf-8;base64,${SELINUX_CONF_BASE64}

				        filesystem: root

				        mode: 0644

				        path: /etc/selinux/config

									
										40

ci/openshift-ci/cluster/deployments/relabel_selinux.yaml
									
										Normal file
									
				@@ -0,0 +1,40 @@

				apiVersion: apps/v1

				kind: DaemonSet

				metadata:

				  name: relabel-selinux-daemonset

				  namespace: kube-system

				spec:

				  selector:

				    matchLabels:

				      app: restorecon

				  template:

				    metadata:

				      labels:

				        app: restorecon

				    spec:

				      serviceAccountName: kata-deploy-sa

				      hostPID: true

				      containers:

				        - name: relabel-selinux-container

				          image: alpine

				          securityContext:

				            privileged: true

				          command: ["/bin/sh", "-c", "

				            set -e;

				            echo Starting the relabel;

				            nsenter --target 1 --mount bash -xc '

				                command -v semanage &>/dev/null || { echo Does not look like a SELINUX cluster, skipping; exit 0; };

				                for ENTRY in \

				                    \"/(.*/)?opt/kata/bin(/.*)?\" \

				                    \"/(.*/)?opt/kata/runtime-rs/bin(/.*)?\" \

				                    \"/(.*/)?opt/kata/share/kata-.*(/.*)?(/.*)?\" \

				                    \"/(.*/)?opt/kata/share/ovmf(/.*)?\" \

				                    \"/(.*/)?opt/kata/share/tdvf(/.*)?\" \

				                    \"/(.*/)?opt/kata/libexec(/.*)?\";

				                do

				                    semanage fcontext -a -t qemu_exec_t \"$ENTRY\" || semanage fcontext -m -t qemu_exec_t \"$ENTRY\" || { echo \"Error in semanage command\"; exit 1; }

				                done;

				                restorecon -v -R /opt/kata || { echo \"Error in restorecon command\"; exit 1; }

				            ';

				            echo NSENTER_FINISHED_WITH: $?;

				            sleep infinity"]

Compare commits

5325 Commits 3.2.0 ... burgerdev/

25 .github/actionlint.yaml vendored Normal file
2 .github/cargo-deny-composite-action/cargo-deny-generator.sh vendored
2 .github/cargo-deny-composite-action/cargo-deny-skeleton.yaml.in vendored
90 .github/dependabot.yml vendored Normal file
8 .github/workflows/PR-wip-checks.yaml vendored
37 .github/workflows/actionlint.yaml vendored Normal file
104 .github/workflows/add-backport-label.yaml vendored
59 .github/workflows/add-issues-to-project.yaml vendored
44 .github/workflows/add-pr-sizing-label.yaml vendored
33 .github/workflows/auto-backport.yaml vendored
412 .github/workflows/basic-ci-amd64.yaml vendored Normal file
147 .github/workflows/basic-ci-s390x.yaml vendored Normal file
132 .github/workflows/build-checks-preview-riscv64.yaml vendored Normal file
130 .github/workflows/build-checks.yaml vendored Normal file
297 .github/workflows/build-kata-static-tarball-amd64.yaml vendored
281 .github/workflows/build-kata-static-tarball-arm64.yaml vendored
267 .github/workflows/build-kata-static-tarball-ppc64le.yaml vendored Normal file
86 .github/workflows/build-kata-static-tarball-riscv64.yaml vendored Normal file
309 .github/workflows/build-kata-static-tarball-s390x.yaml vendored
15 .github/workflows/cargo-deny-runner.yaml vendored
33 .github/workflows/ci-coco-stability.yaml vendored Normal file
34 .github/workflows/ci-devel.yaml vendored Normal file
26 .github/workflows/ci-nightly-s390x.yaml vendored Normal file
19 .github/workflows/ci-nightly.yaml vendored
31 .github/workflows/ci-on-push.yaml vendored
124 .github/workflows/ci-weekly.yaml vendored Normal file
469 .github/workflows/ci.yaml vendored
37 .github/workflows/cleanup-resources.yaml vendored Normal file
100 .github/workflows/codeql.yml vendored Normal file
45 .github/workflows/commit-message-check.yaml vendored
12 .github/workflows/darwin-tests.yaml vendored
25 .github/workflows/docs-url-alive-check.yaml vendored
55 .github/workflows/gatekeeper-skipper.yaml vendored Normal file
53 .github/workflows/gatekeeper.yaml vendored Normal file
50 .github/workflows/govulncheck.yaml vendored Normal file
13 .github/workflows/kata-runtime-classes-sync.yaml vendored
82 .github/workflows/move-issues-to-in-progress.yaml vendored
41 .github/workflows/osv-scanner.yaml vendored Normal file
126 .github/workflows/payload-after-push.yaml vendored
55 .github/workflows/publish-kata-deploy-payload-amd64.yaml vendored
60 .github/workflows/publish-kata-deploy-payload-arm64.yaml vendored
59 .github/workflows/publish-kata-deploy-payload-s390x.yaml vendored
90 .github/workflows/publish-kata-deploy-payload.yaml vendored Normal file
62 .github/workflows/release-amd64.yaml vendored
62 .github/workflows/release-arm64.yaml vendored
79 .github/workflows/release-ppc64le.yaml vendored Normal file
64 .github/workflows/release-s390x.yaml vendored
349 .github/workflows/release.yaml vendored
58 .github/workflows/require-pr-porting-labels.yaml vendored
58 .github/workflows/run-cri-containerd-tests.yaml vendored
93 .github/workflows/run-k8s-tests-on-aks.yaml vendored
115 .github/workflows/run-k8s-tests-on-amd64.yaml vendored Normal file
87 .github/workflows/run-k8s-tests-on-arm64.yaml vendored Normal file
81 .github/workflows/run-k8s-tests-on-ppc64le.yaml vendored Normal file
48 .github/workflows/run-k8s-tests-on-sev.yaml vendored
48 .github/workflows/run-k8s-tests-on-snp.yaml vendored
47 .github/workflows/run-k8s-tests-on-tdx.yaml vendored
144 .github/workflows/run-k8s-tests-on-zvsi.yaml vendored Normal file
146 .github/workflows/run-kata-coco-stability-tests.yaml vendored Normal file
331 .github/workflows/run-kata-coco-tests.yaml vendored Normal file
110 .github/workflows/run-kata-deploy-tests-on-aks.yaml vendored Normal file
69 .github/workflows/run-kata-deploy-tests.yaml vendored Normal file
70 .github/workflows/run-kata-monitor-tests.yaml vendored Normal file
89 .github/workflows/run-metrics.yaml vendored
42 .github/workflows/run-nydus-tests.yaml vendored
54 .github/workflows/run-runk-tests.yaml vendored Normal file
37 .github/workflows/run-vfio-tests.yaml vendored
60 .github/workflows/scorecard.yaml vendored Normal file
32 .github/workflows/shellcheck.yaml vendored Normal file
35 .github/workflows/shellcheck_required.yaml vendored Normal file
20 .github/workflows/stale.yaml vendored Normal file
37 .github/workflows/static-checks-dragonball.yaml vendored
49 .github/workflows/static-checks-self-hosted.yaml vendored Normal file
226 .github/workflows/static-checks.yaml vendored
29 .github/workflows/zizmor.yaml vendored Normal file
3 .gitignore vendored
83 CODEOWNERS
5 Makefile

12

README.md
